Incremental Joint Calling Produces Small VCF

robc (ireland), Member
Hi!

I'm trying to run incremental joint calling on 154 samples from gene panel data.
I've been following the Best Practices guidelines, but the consolidated, genotyped VCF is very small. The final VCF contains 3,390 variants, which makes it 3 to 10 times smaller than a single per-sample gVCF file.

# Generate gVCFs
java -jar gatk-3.5/GenomeAnalysisTK.jar -T HaplotypeCaller -R GrCh38.fasta -I sample1.recal.bam --dbsnp dbSNP.vcf --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -minPruning 3 -o sample1.g.vcf
.. ...
java -jar gatk-3.5/GenomeAnalysisTK.jar -T HaplotypeCaller -R GrCh38.fasta -I sample154.recal.bam --dbsnp dbSNP.vcf --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -minPruning 3 -o sample154.g.vcf

# Consolidate gVCFs using GenomicsDBImport
gatk-4.1.0.0/gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport -V sample1.g.vcf -V sample2.g.vcf ... -V sample154.g.vcf -L capture_targets.bed --genomicsdb-workspace-path /GDBI --tmp-dir=/large_tmp

# Genotype gVCF
gatk-4.1.0.0/gatk --java-options "-Xmx4g" GenotypeGVCFs -R GrCh38.fasta -V gendb://GDBI -O joint_called_154.vcf

# Example of variant output from the joint_called_154.vcf file:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10008_S14 10018_S24 10020_S71 10021_S21 10022_S13 10023_S45 10025_S28 10030_S93 10139_S69 10141_S96 10142_S38 10200_S65 10201_S64 10202_S92 10205_S27 10208_S79 10212_S95 10213_S18 10214_S91 10217_S3 10218_S78 10219_S66 10221_S49 10225_S58 10235_S2 10236_S6 10237_S25 10241_S55 10244_S41 10245_S46 10246_S26 10247_S52 10249_S7 10250_S76 10252_S30 10257_S48 10262_S91 10266_S85 10267_S15 10268_S6 10280_S5 10286_S12 10290_S1 10296_S77 10300_S92 10308_S59 10309_S33 10311_S70 10315_S94 10326_S84 10336_S11 10337_S11 10338_S76 10340_S58 10341_S70 10345_S95 10358_S40 10359_S32 10371_S63 10373_S13 10377_S82 10378_S42 10379_S41 10381_S40 10392_S62 10393_S43 10430_S86 10440_S27 10443_S88 10444_S25 10446_S73 10448_S46 10467_S50 10471_S78 10472_S6 10473_S24 10474_S47 10477_S89 10478_S87 10481_S7 10495_S5 10626_S68 10627_S31 10628_S4 10629_S23 10630_S83 10631_S20 10632_S17 10634_S89 10635_S75 10636_S57 10637_S74 10638_S88 10639_S87 10640_S56 10641_S51 10643_S73 10644_S19 10646_S72 10647_S38 10648_S81 10650_S55 10651_S8 10652_S44 10653_S37 10654_S36 10655_S30 10656_S35 10657_S67 10658_S47 10659_S16 10660_S29 10661_S80 10662_S54 10663_S53 10664_S22 10665_S34 10666_S43 10667_S50 10668_S42 10669_S86 10684_S90 10685_S17 10686_S10 10921_S77 10961_S87 28115_S96 8123_S63 8126_S62 8127_S4 8128_S61 8137_S64 8141_S69 8142_S52 8143_S60 8146_S56 8147_S31 8151_S96 8157_S12 8162_S81 8166_S61 8167_S9 8169_S53 8174_S84 8175_S66 8178_S34 8179_S57 8181_S32 8182_S15 8192_S77 8193_S75 8194_S21 8196_S2 8200_S59
chr1 5864458 . G A 25816.98 . AC=14;AF=0.045;AN=308;BaseQRankSum=-1.466e+00;ClippingRankSum=0.469;DP=7979;ExcessHet=0.6928;FS=0.000;InbreedingCoeff=0.1020;MLEAC=14;MLEAF=0.045;MQ=60.00;MQRankSum=-1.550e-01;QD=9.24;RAW_MQ=10101600.00;ReadPosRankSum=1.23;SOR=0.731 GT:AD:DP:GQ:PL 0/0:36,0:36:99:0,105,1114 0/1:107,131:238:99:2561,0,2124 0/0:34,0:34:99:0,99,1111 0/0:40,0:40:99:0,111,1285 0/0:40,0:40:99:0,102,1329 0/0:47,0:47:99:0,105,1574 0/1:120,118:238:99:2197,0,2358 0/0:38,0:38:99:0,99,1485 0/0:38,0:38:99:0,99,1485 0/0:34,0:34:99:0,99,1031 0/1:149,97:246:99:1557,0,3048 0/0:37,0:37:99:0,99,1086 0/0:34,0:34:99:0,99,1200 0/0:44,0:44:99:0,99,1485 0/0:34,0:34:99:0,99,1057 0/0:35,0:35:99:0,102,1087 0/0:41,0:41:99:0,102,1530 0/0:34,0:34:99:0,102,1046 0/0:34,0:34:99:0,102,1082 0/0:34,0:34:99:0,99,1201 0/0:38,0:38:99:0,99,1140 0/0:35,0:35:99:0,99,1329 0/1:65,50:115:99:928,0,1572 0/0:33,0:33:99:0,99,1113 0/0:34,0:34:99:0,99,1179 0/0:36,0:36:99:0,99,1058 0/0:33,0:33:99:0,99,1050 0/0:45,0:45:99:0,99,1485 0/0:37,0:37:99:0,111,1140 0/0:34,0:34:99:0,99,1485 0/0:33,0:33:99:0,99,910 0/0:38,0:38:99:0,99,1394 0/0:41,0:41:99:0,120,1800 0/0:33,0:33:99:0,99,1087 0/0:36,0:36:99:0,99,1009 0/0:38,0:38:99:0,99,1065 0/0:35,0:35:99:0,100,952 0/0:34,0:34:99:0,99,890 0/0:40,0:40:99:0,100,1204 0/0:39,0:39:99:0,101,1444 0/0:39,0:39:99:0,99,1485 0/0:39,0:39:99:0,102,1397 0/0:38,0:38:99:0,99,1485 0/0:47,0:47:99:0,114,1710 0/0:33,0:33:99:0,99,1072 0/0:42,0:42:99:0,99,1485 0/0:44,0:44:99:0,105,1575 0/1:108,108:216:99:1895,0,2106 0/0:38,0:38:99:0,108,1620 0/0:34,0:34:99:0,102,1061 0/0:35,0:35:99:0,99,977 0/0:39,0:39:99:0,102,1262 0/0:34,0:34:99:0,102,933 0/1:122,81:203:99:1295,0,2667 0/0:34,0:34:99:0,99,1079 0/0:35,0:35:99:0,99,1107 0/1:125,126:251:99:2283,0,2591 0/0:41,0:41:99:0,99,1360 0/0:36,0:36:99:0,99,1128 0/0:41,0:41:99:0,102,1128 0/0:34,0:34:99:0,99,1157 0/0:35,0:35:99:0,99,1222 0/0:39,0:39:99:0,102,1251 0/0:35,0:35:99:0,99,1023 0/0:35,0:35:99:0,100,1056 0/0:52,0:52:99:0,120,1800 
1/1:0,161:161:99:3893,468,0 0/0:37,0:37:99:0,99,1482 0/0:35,0:35:99:0,99,1106 0/0:49,0:49:99:0,109,1496 0/0:34,0:34:99:0,99,1171 0/0:33,0:33:99:0,99,930 0/0:38,0:38:99:0,99,1104 0/0:34,0:34:99:0,99,1173 0/0:35,0:35:99:0,102,1270 0/0:37,0:37:99:0,99,1295 0/0:45,0:45:99:0,99,1485 0/0:36,0:36:99:0,102,1530 0/0:36,0:36:99:0,99,1485 0/0:34,0:34:99:0,99,1032 0/0:34,0:34:99:0,99,1221 0/1:73,52:125:99:923,0,1604 0/0:36,0:36:99:0,108,1152 0/0:35,0:35:99:0,99,1129 0/0:35,0:35:99:0,102,1008 0/0:37,0:37:99:0,102,1145 0/0:33,0:33:99:0,99,1123 0/0:38,0:38:99:0,114,1122 0/0:36,0:36:99:0,99,1014 0/0:39,0:39:99:0,99,1485 0/0:34,0:34:99:0,99,1127 0/0:42,0:42:99:0,108,1620 0/0:34,0:34:99:0,99,1228 0/0:35,0:35:99:0,99,1074 0/0:45,0:45:99:0,104,1620 0/0:43,0:43:99:0,99,1135 0/0:33,0:33:99:0,99,1077 0/1:156,136:292:99:2565,0,3381 0/0:34,0:34:99:0,99,1057 0/0:33,0:33:99:0,99,1126 0/0:34,0:34:99:0,99,1169 0/0:35,0:35:99:0,99,1193 0/0:39,0:39:99:0,99,1485 0/0:33,0:33:99:0,99,1056 0/0:35,0:35:99:0,99,1131 0/0:33,0:33:99:0,99,1018 0/0:39,0:39:99:0,105,1233 0/0:41,0:41:99:0,102,1530 0/0:35,0:35:99:0,99,1162 0/0:34,0:34:99:0,102,941 0/0:39,0:39:99:0,99,1412 0/0:33,0:33:99:0,99,1081 0/0:36,0:36:99:0,105,1220 0/0:35,0:35:99:0,102,1530 0/0:38,0:38:99:0,105,1182 0/0:33,0:33:99:0,99,1138 0/0:34,0:34:99:0,99,1004 0/0:40,0:40:99:0,102,1285 0/0:38,0:38:99:0,99,1331 0/0:38,0:38:99:0,108,1505 0/0:38,0:38:99:0,102,1423 0/0:34,0:34:99:0,99,1485 0/0:36,0:36:99:0,102,1147 0/0:39,0:39:99:0,99,1431 0/0:36,0:36:99:0,99,1044 0/0:43,0:43:99:0,99,1464 0/0:33,0:33:99:0,99,1184 0/1:103,54:157:99:1028,0,2387 0/0:37,0:37:99:0,99,1485 0/1:125,120:245:99:2493,0,2749 0/0:33,0:33:99:0,99,1048 0/0:33,0:33:99:0,99,1111 0/0:34,0:34:99:0,99,1358 0/0:35,0:35:99:0,102,1530 0/0:34,0:34:99:0,102,1047 0/0:34,0:34:99:0,99,1047 0/0:34,0:34:99:0,99,1183 0/0:40,0:40:99:0,112,1301 0/0:34,0:34:99:0,102,989 0/0:35,0:35:99:0,99,1228 0/0:39,0:39:99:0,102,1210 0/0:43,0:43:99:0,99,1472 0/1:164,142:306:99:2350,0,3214 0/0:35,0:35:99:0,99,1197 
0/0:33,0:33:99:0,99,1054 0/0:34,0:34:99:0,99,1231 0/0:33,0:33:99:0,99,1012 0/0:38,0:38:99:0,102,979 0/0:33,0:33:99:0,99,1077 0/0:34,0:34:99:0,102,1015 0/0:33,0:33:99:0,99,1040 0/0:36,0:36:99:0,108,1204 0/0:37,0:37:99:0,105,1060 0/0:37,0:37:99:0,99,1212


I'm trying to figure out where it is going wrong. Could it be because the gVCFs and incremental joint calling steps use different versions of the GATK and are incompatible? Or because the gVCFs were called without using an interval list while an interval list was specified for the joint calling steps?

Thanks!
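
A note on the size comparison above: a gVCF written with --emitRefConfidence GVCF holds reference-confidence blocks (records whose only ALT allele is the symbolic <NON_REF>) in addition to variant sites, so its record count or file size is not a variant count. The following is a minimal sketch, not part of the original post, for counting only the records that carry a real ALT allele. The function names count_variant_sites and count_records are hypothetical, and the input is assumed to be an uncompressed, tab-delimited VCF or gVCF.

```shell
# Count records that carry a real ALT allele.
# Reference blocks (ALT is exactly <NON_REF>) and records with no ALT (".") are skipped.
# $1: path to an uncompressed VCF or gVCF (assumed tab-delimited)
count_variant_sites() {
    awk -F'\t' '!/^#/ && $5 != "<NON_REF>" && $5 != "." { n++ } END { print n + 0 }' "$1"
}

# Count every non-header record, reference blocks included, for comparison.
count_records() {
    awk '!/^#/ { n++ } END { print n + 0 }' "$1"
}
```

Running count_variant_sites on sample1.g.vcf and on joint_called_154.vcf would give numbers that are closer to comparable than the raw file sizes, although the per-sample count still includes sites outside capture_targets.bed if the gVCF was called without intervals.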

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @robc

    As we no longer support GATK3, I am not entirely sure whether this is a compatibility issue. The best way to find out is to use GATK4 all the way through.

    "Or because the gVCFs were called without using an interval list while an interval list was specified for the joint calling steps?"

    Yes, this is entirely possible.

  • Tia (Member)
    Hi bhanuGandham,
    We recently ran the GATK v4.1.2.0 germline calling pipeline on 12 samples from one individual and got similar results. The joint-genotyped VCF, called with an interval list, includes only 72,444 variants. After VQSR, we obtain 9,790 PASS INDELs and 48,227 PASS SNPs. Is this normal? We would be really thankful for a reply.
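
To test both possibilities raised in this thread (mixed GATK versions and mismatched intervals), the per-sample calling step could be rerun with GATK4 using the same interval file that is later passed to GenomicsDBImport. The following is a minimal sketch under stated assumptions, not a command taken from the thread: the helper names run and call_gvcf and the DRY_RUN variable are hypothetical, the file names follow the original post, and the --dbsnp and -minPruning options from the original command are left out for brevity. Whether this changes the final variant count is not established here.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Call one sample in GVCF mode with GATK4, restricted to the capture targets.
# $1: sample name; reads $1.recal.bam and writes $1.g.vcf.gz
call_gvcf() {
    run gatk HaplotypeCaller \
        -R GrCh38.fasta \
        -I "$1.recal.bam" \
        -L capture_targets.bed \
        -ERC GVCF \
        -O "$1.g.vcf.gz"
}
```

Setting DRY_RUN=1 before calling call_gvcf sample1 prints the command line for inspection without starting GATK. The resulting gVCFs would then go through the GenomicsDBImport and GenotypeGVCFs steps from the original post unchanged.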