GenotypeGVCFs AD=0

micknudsen (Denmark, Member)


I have a sample for which HaplotypeCaller identifies the following variant:

chr12   133237753   .   GAAA    G,GA,GAA,TAAA,GAAAA,GAAAAA,<NON_REF>    66.73   .   BaseQRankSum=-1.481;ClippingRankSum=0.000;DP=407;ExcessHet=3.0103;MLEAC=0,0,0,0,0,0,1;MLEAF=0.00,0.00,0.00,0.00,0.00,0.00,0.500;MQRankSum=-0.936;RAW_MQ=1473600.00;ReadPosRankSum=-0.376    GT:AD:DP:GQ:PL:SB   0/7:133,16,16,47,24,15,6,0:257:3:102,422,10788,292,8507,8145,3,4923,4768,4515,709,4921,4093,2553,6007,405,4861,4378,3143,3552,4849,564,6844,6021,4099,4376,5290,7241,0,2341,2211,1690,2151,2016,2301,1682:67,66,34,90
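For anyone reading the sample column of the record above by hand, here is a minimal Python sketch that decodes it. The allele, AD and PL values are copied from the record. The helper names (`pl_index`, `genotype_at`) are made up for illustration and are not part of GATK. It relies only on the standard VCF ordering of diploid PL values, where genotype j/k (with j <= k) sits at index k(k+1)/2 + j.

```python
# Values copied from the gVCF record in the post (REF first, then ALTs).
alleles = ["GAAA", "G", "GA", "GAA", "TAAA", "GAAAA", "GAAAAA", "<NON_REF>"]
ad = [133, 16, 16, 47, 24, 15, 6, 0]
pl = [102, 422, 10788, 292, 8507, 8145, 3, 4923, 4768, 4515, 709, 4921,
      4093, 2553, 6007, 405, 4861, 4378, 3143, 3552, 4849, 564, 6844, 6021,
      4099, 4376, 5290, 7241, 0, 2341, 2211, 1690, 2151, 2016, 2301, 1682]


def pl_index(j, k):
    """Position of diploid genotype j/k in the PL list (VCF ordering)."""
    j, k = sorted((j, k))
    return k * (k + 1) // 2 + j


def genotype_at(index, n_alleles):
    """Inverse of pl_index: which allele pair does a PL position refer to?"""
    for k in range(n_alleles):
        for j in range(k + 1):
            if pl_index(j, k) == index:
                return (j, k)
    raise ValueError("index out of range for this number of alleles")


# A diploid site with n alleles has n(n+1)/2 genotypes, hence that many PLs.
assert len(pl) == len(alleles) * (len(alleles) + 1) // 2

# Rank genotypes from most to least likely (PL 0 is the called genotype).
ranked = sorted(range(len(pl)), key=lambda i: pl[i])
for i in ranked[:2]:
    j, k = genotype_at(i, len(alleles))
    print(f"PL={pl[i]} GT={j}/{k} alleles={alleles[j]}/{alleles[k]} AD={ad[j]},{ad[k]}")
```

With these values, the most likely genotype is 0/7 (reference plus `<NON_REF>`, whose AD is 0), and the runner-up, at PL 3 (matching GQ=3), is 0/3 (GAAA/GAA, with AD 133 and 47).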

After running GenotypeGVCFs using all our previously analyzed samples, and then using SelectVariants (with options excludeNonVariants and removeUnusedAlternates) to grab the relevant sample, I get the following:

chr12   133237753   .   GA  G   381589.02   .   AC=1;AF=0.500;AN=2;BaseQRankSum=0.165;ClippingRankSum=-4.100e-02;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-2.7975;MQ=60.16;MQRankSum=0.100;QD=1.67;ReadPosRankSum=0.265;SOR=0.630  GT:AD:DP:GQ:PL  0/1:133,47:257:99:99,0,4512
chr12   133237754   .   A   T   361388.42   .   AC=1;AF=0.500;AN=2;BaseQRankSum=-1.733e+00;ClippingRankSum=-5.750e-01;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-0.7161;MQ=1.45;MQRankSum=0.00;QD=1.97;ReadPosRankSum=1.07;SOR=0.629    GT:AD:DP:GQ:PL  0/1:133,0:257:3:102,0,1682
chr12   133237755   .   A   T   351628.83   .   AC=1;AF=0.500;AN=2;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-0.5329;MQ=0.62;QD=1.94;SOR=0.629  GT:AD:DP:GQ:PL  0/1:133,0:257:3:102,0,1682

Two A->T variants now appear at the two positions downstream of the original variant, neither of which was called by HaplotypeCaller, and in both cases the AD for the alternate allele is 0.
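To check how widespread this pattern is in a larger callset, something like the following Python sketch can flag records where the genotype includes an alternate allele whose AD is 0. It is only an illustration: the function name `zero_depth_calls` is made up, the three test records are the ones above with the INFO column replaced by `.` for brevity, and it assumes a single sample in the tenth column.

```python
import re


def zero_depth_calls(lines, sample_col=9):
    """Return (chrom, pos, allele, GT) for called ALT alleles with AD == 0."""
    flagged = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        cols = line.split()  # VCF is tab-separated; fields contain no spaces
        chrom, pos, ref, alt = cols[0], int(cols[1]), cols[3], cols[4]
        fmt = dict(zip(cols[8].split(":"), cols[sample_col].split(":")))
        if "GT" not in fmt or "AD" not in fmt:
            continue
        alleles = [ref] + alt.split(",")
        ad = fmt["AD"].split(",")
        # Allele indices used in the genotype, ignoring missing calls (".").
        called = {int(a) for a in re.split(r"[/|]", fmt["GT"]) if a.isdigit()}
        for idx in sorted(called):
            if idx == 0 or idx >= len(ad):
                continue  # reference allele, or AD shorter than expected
            if ad[idx].isdigit() and int(ad[idx]) == 0:
                flagged.append((chrom, pos, alleles[idx], fmt["GT"]))
    return flagged


# The three joint-genotyped records from the post, INFO shortened to ".".
records = [
    "chr12 133237753 . GA G 381589.02 . . GT:AD:DP:GQ:PL 0/1:133,47:257:99:99,0,4512",
    "chr12 133237754 . A T 361388.42 . . GT:AD:DP:GQ:PL 0/1:133,0:257:3:102,0,1682",
    "chr12 133237755 . A T 351628.83 . . GT:AD:DP:GQ:PL 0/1:133,0:257:3:102,0,1682",
]

for hit in zero_depth_calls(records):
    print(hit)
```

On these three records it flags the two A->T calls at 133237754 and 133237755 and leaves the GA->G deletion alone.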

Is this the intended behavior? Part of the GenotypeGVCFs documentation says "This tool performs the multi-sample joint aggregation step and merges the records together in a sophisticated manner". Maybe I just don't understand that level of sophistication :-)


Best Answer


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Hmm, the two extra calls themselves might be caused by what's in other samples, but the ADs don't make much sense. What version are you using? Does this reproduce with the latest nightly?

  • micknudsen (Denmark, Member)

    I am using GATK 3.7. I will try the latest nightly build and report back when I have results.

  • micknudsen (Denmark, Member)

    Unfortunately, the output is the same for the nightly build (2017-06-26). If this is a bug, is there any way we can assist in tracking it down?

