GenotypeGVCFs AD=0

Hi!

I have a sample for which HaplotypeCaller identifies the following variant:

chr12   133237753   .   GAAA    G,GA,GAA,TAAA,GAAAA,GAAAAA,<NON_REF>    66.73   .   BaseQRankSum=-1.481;ClippingRankSum=0.000;DP=407;ExcessHet=3.0103;MLEAC=0,0,0,0,0,0,1;MLEAF=0.00,0.00,0.00,0.00,0.00,0.00,0.500;MQRankSum=-0.936;RAW_MQ=1473600.00;ReadPosRankSum=-0.376    GT:AD:DP:GQ:PL:SB   0/7:133,16,16,47,24,15,6,0:257:3:102,422,10788,292,8507,8145,3,4923,4768,4515,709,4921,4093,2553,6007,405,4861,4378,3143,3552,4849,564,6844,6021,4099,4376,5290,7241,0,2341,2211,1690,2151,2016,2301,1682:67,66,34,90
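As a reading aid for the record above, here is a minimal Python sketch that pairs each allele with its AD value and locates the called genotype in the PL array. It assumes only the standard VCF conventions: AD lists the REF allele first, then the ALT alleles in order, and the PL for diploid genotype j/k (j <= k) sits at index k(k+1)/2 + j. The helper name `pl_index` is made up for illustration, and nothing here says anything about GATK internals.

```python
def pl_index(j, k):
    """Position of the unphased diploid genotype j/k in the PL array (VCF ordering)."""
    j, k = sorted((j, k))
    return k * (k + 1) // 2 + j


# Values copied from the HaplotypeCaller record above.
ref = "GAAA"
alts = "G,GA,GAA,TAAA,GAAAA,GAAAAA,<NON_REF>".split(",")
ad = [int(x) for x in "133,16,16,47,24,15,6,0".split(",")]
pl = [int(x) for x in (
    "102,422,10788,292,8507,8145,3,4923,4768,4515,709,4921,4093,2553,6007,"
    "405,4861,4378,3143,3552,4849,564,6844,6021,4099,4376,5290,7241,0,2341,"
    "2211,1690,2151,2016,2301,1682"
).split(",")]

alleles = [ref] + alts

# One AD value per allele, REF first.
for idx, (allele, depth) in enumerate(zip(alleles, ad)):
    print(f"allele {idx}: {allele:<10} AD={depth}")

# GT 0/7 pairs REF with allele 7, which is <NON_REF> in this record.
called = pl_index(0, 7)
print(f"PL index of 0/7: {called}, PL value: {pl[called]}")
```

Read this way, the 0/7 genotype pairs the reference with `<NON_REF>`, whose AD is 0, and its PL entry (index 28) is the 0 in the list. The next-lowest PL is the 3 at index 6, which corresponds to genotype 0/3 (GAAA/GAA) and matches the reported GQ of 3.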

After running GenotypeGVCFs using all our previously analyzed samples, and then using SelectVariants (with options excludeNonVariants and removeUnusedAlternates) to grab the relevant sample, I get the following:

chr12   133237753   .   GA  G   381589.02   .   AC=1;AF=0.500;AN=2;BaseQRankSum=0.165;ClippingRankSum=-4.100e-02;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-2.7975;MQ=60.16;MQRankSum=0.100;QD=1.67;ReadPosRankSum=0.265;SOR=0.630  GT:AD:DP:GQ:PL  0/1:133,47:257:99:99,0,4512
chr12   133237754   .   A   T   361388.42   .   AC=1;AF=0.500;AN=2;BaseQRankSum=-1.733e+00;ClippingRankSum=-5.750e-01;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-0.7161;MQ=1.45;MQRankSum=0.00;QD=1.97;ReadPosRankSum=1.07;SOR=0.629    GT:AD:DP:GQ:PL  0/1:133,0:257:3:102,0,1682
chr12   133237755   .   A   T   351628.83   .   AC=1;AF=0.500;AN=2;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-0.5329;MQ=0.62;QD=1.94;SOR=0.629  GT:AD:DP:GQ:PL  0/1:133,0:257:3:102,0,1682

We now suddenly see A->T variants at the two positions immediately downstream of the original variant (HaplotypeCaller did not call these), and in both cases the AD for the alternate allele is 0.
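To find every record with this pattern rather than spotting them by eye, a small filter along these lines could help. It is only a sketch: it assumes whitespace-separated VCF columns with GT and AD in the FORMAT field, and the function name `zero_ad_alt_calls` is hypothetical, not part of GATK.

```python
import re


def zero_ad_alt_calls(vcf_lines, sample_index=0):
    """Yield records where the genotype includes an ALT allele whose AD is 0.

    Each result is (chrom, pos, ref, alt, gt, ad) for the chosen sample column.
    """
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        if len(fields) < 10 + sample_index:
            continue  # no FORMAT/sample columns for this sample
        keys = fields[8].split(":")
        values = fields[9 + sample_index].split(":")
        sample = dict(zip(keys, values))
        if "GT" not in sample or "AD" not in sample:
            continue
        # Allele indices in the genotype; "." (no-call) entries are ignored.
        called = [int(a) for a in re.split(r"[/|]", sample["GT"]) if a.isdigit()]
        # Per-allele depths, REF first; missing values become None.
        depths = [int(d) if d.isdigit() else None for d in sample["AD"].split(",")]
        if any(0 < a < len(depths) and depths[a] == 0 for a in called):
            yield (fields[0], int(fields[1]), fields[3], fields[4],
                   sample["GT"], sample["AD"])
```

Passing an open file handle for the SelectVariants output (for example `zero_ad_alt_calls(open(path))`, with `path` being your own file) yields one tuple per suspicious record. On the three records above it should report the two A->T sites and skip the GA->G deletion.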

Is this the intended behavior? Part of the GenotypeGVCFs documentation says "This tool performs the multi-sample joint aggregation step and merges the records together in a sophisticated manner". Maybe I just don't understand that level of sophistication :-)

Thanks!
