I have a sample for which HaplotypeCaller identifies the following variant:
chr12 133237753 . GAAA G,GA,GAA,TAAA,GAAAA,GAAAAA,<NON_REF> 66.73 . BaseQRankSum=-1.481;ClippingRankSum=0.000;DP=407;ExcessHet=3.0103;MLEAC=0,0,0,0,0,0,1;MLEAF=0.00,0.00,0.00,0.00,0.00,0.00,0.500;MQRankSum=-0.936;RAW_MQ=1473600.00;ReadPosRankSum=-0.376 GT:AD:DP:GQ:PL:SB 0/7:133,16,16,47,24,15,6,0:257:3:102,422,10788,292,8507,8145,3,4923,4768,4515,709,4921,4093,2553,6007,405,4861,4378,3143,3552,4849,564,6844,6021,4099,4376,5290,7241,0,2341,2211,1690,2151,2016,2301,1682:67,66,34,90
After running GenotypeGVCFs using all our previously analyzed samples, and then using SelectVariants (with options excludeNonVariants and removeUnusedAlternates) to grab the relevant sample, I get the following:
chr12 133237753 . GA G 381589.02 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.165;ClippingRankSum=-4.100e-02;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-2.7975;MQ=60.16;MQRankSum=0.100;QD=1.67;ReadPosRankSum=0.265;SOR=0.630 GT:AD:DP:GQ:PL 0/1:133,47:257:99:99,0,4512
chr12 133237754 . A T 361388.42 . AC=1;AF=0.500;AN=2;BaseQRankSum=-1.733e+00;ClippingRankSum=-5.750e-01;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-0.7161;MQ=1.45;MQRankSum=0.00;QD=1.97;ReadPosRankSum=1.07;SOR=0.629 GT:AD:DP:GQ:PL 0/1:133,0:257:3:102,0,1682
chr12 133237755 . A T 351628.83 . AC=1;AF=0.500;AN=2;DP=257;ExcessHet=2147483647.0000;FS=0.000;InbreedingCoeff=-0.5329;MQ=0.62;QD=1.94;SOR=0.629 GT:AD:DP:GQ:PL 0/1:133,0:257:3:102,0,1682
We now suddenly see A->T variants at the two positions immediately downstream of the original variant (neither was called by HaplotypeCaller), and in both cases the AD for the alternate allele is 0.
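To check how often this pattern occurs in the rest of the output, here is a minimal Python sketch that flags single-sample records whose GT includes an alternate allele with an AD of 0. The function name `zero_ad_alt_calls` is made up for illustration, and the sketch assumes whitespace-separated columns with exactly one sample column. The example records are the three from above, with the INFO column replaced by "." for brevity.

```python
def zero_ad_alt_calls(vcf_lines):
    """Return (chrom, pos, gt, ad) for records where a called ALT allele has AD == 0.

    Assumes one sample per record (columns: CHROM POS ID REF ALT QUAL FILTER
    INFO FORMAT SAMPLE) and whitespace-separated fields.
    """
    flagged = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        chrom, pos = fields[0], int(fields[1])
        # Pair each FORMAT key with the corresponding sample value.
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        gt, ad = sample.get("GT"), sample.get("AD")
        if gt is None or ad is None:
            continue
        depths = [0 if d == "." else int(d) for d in ad.split(",")]
        alleles = gt.replace("|", "/").split("/")
        for allele in alleles:
            # Allele index 0 is REF; any index > 0 refers to an ALT allele.
            if allele != "." and int(allele) > 0 and depths[int(allele)] == 0:
                flagged.append((chrom, pos, gt, ad))
                break
    return flagged


# The three records from above, INFO shortened to "." (illustration only).
records = [
    "chr12 133237753 . GA G 381589.02 . . GT:AD:DP:GQ:PL 0/1:133,47:257:99:99,0,4512",
    "chr12 133237754 . A T 361388.42 . . GT:AD:DP:GQ:PL 0/1:133,0:257:3:102,0,1682",
    "chr12 133237755 . A T 351628.83 . . GT:AD:DP:GQ:PL 0/1:133,0:257:3:102,0,1682",
]

for chrom, pos, gt, ad in zero_ad_alt_calls(records):
    print(chrom, pos, gt, ad)
```

On these three records the sketch reports positions 133237754 and 133237755 but not 133237753. It only detects the symptom and says nothing about why GenotypeGVCFs emits these calls.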
Is this the intended behavior? Part of the GenotypeGVCFs documentation says "This tool performs the multi-sample joint aggregation step and merges the records together in a sophisticated manner". Maybe I just don't understand that level of sophistication :-)