Incorrect AD values in HC-called vcf and combined gvcf
I am using GATK v3.2.2 following the recommended practices (...HC -> CombineGVCFs -> GenotypeGVCFs ...), and while looking through suspicious variants I came across a few heterozygous calls with AD=X,0. Tracing them back, I found two inconsistencies (bugs?):
1) The ALT alleles are reordered when combining gvcfs, but the AD values are left in their original order, which leads to an erroneous AD for a heterozygous call. Also, I find it hard to understand why the 1bp insertion is emitted in the gvcf, since there are no reads supporting it:
- single sample gvcf:
1 26707944 . A AG,G,<NON_REF> 903.73 . [INFO] GT:AD:DP:GQ:PL:SB 0/2:66,0,36,0:102:99:1057,1039,4115,0,2052,1856,941,3051,1925,2847:51,15,27,9
- combined gvcf:
1 26707944 . A G,AG,<NON_REF> . . [INFO] GT:AD:DP:MIN_DP:PL:SB [other_samples] ./.:66,0,36,0:102:.:1057,0,1856,1039,2052,4115,941,1925,3051,2847:51,15,27,9 [other_samples]
- genotyped vcf:
1 26707944 . A G 3169.63 . [INFO] [other_samples] 0/1:66,0:102:99:1057,0,1856 [other_samples]
2) Incorrect AD is taken while genotyping gvcf files:
- single sample gvcf:
1 1247185 rs142783360 AG A,<NON_REF> 577.73 . [INFO] GT:AD:DP:GQ:PL:SB 0/1:13,20,0:33:99:615,0,361,654,421,1075:7,6,17,3
- combined gvcf:
1 1247185 rs142783360 AG A,<NON_REF> . . [INFO] [other_samples] ./.:13,20,0:33:.:615,0,361,654,421,1075:7,6,17,3 [other_samples]
- genotyped vcf:
1 1247185 . AG A 569.95 . [INFO] [other_samples] 0/1:13,0:33:99:615,0,361 [other_samples]
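To make the expected values concrete, here is a minimal Python sketch of how AD and PL would be carried over when the allele list is reordered or subset. It is not GATK's implementation; the function names `remap_ad` and `remap_pl` are hypothetical, and it assumes diploid genotypes with PLs in the standard VCF order. Applied to the records above, it reproduces the PLs in the combined gvcf for site 26707944 (so the PLs were reordered correctly) and gives the AD values I would have expected instead of 66,0 and 13,0.

```python
def remap_ad(old_alleles, new_alleles, ad):
    """Return AD in the order of new_alleles (REF first), by allele name."""
    depth = dict(zip(old_alleles, ad))
    # Alleles absent from the old record get depth 0 (an assumption).
    return [depth.get(allele, 0) for allele in new_alleles]


def remap_pl(old_alleles, new_alleles, pl):
    """Return diploid PLs reordered to match new_alleles.

    Assumes the VCF ordering where genotype j/k (j <= k) is stored at
    index k*(k+1)/2 + j, and that every new allele exists in old_alleles.
    """
    old_index = {allele: i for i, allele in enumerate(old_alleles)}

    def pl_index(j, k):
        j, k = sorted((j, k))
        return k * (k + 1) // 2 + j

    out = []
    for k in range(len(new_alleles)):
        for j in range(k + 1):
            old_j = old_index[new_alleles[j]]
            old_k = old_index[new_alleles[k]]
            out.append(pl[pl_index(old_j, old_k)])
    return out


# Case 1: site 1:26707944, single sample gvcf -> combined gvcf -> vcf
single = ["A", "AG", "G", "<NON_REF>"]
combined = ["A", "G", "AG", "<NON_REF>"]
ad1 = [66, 0, 36, 0]
pl1 = [1057, 1039, 4115, 0, 2052, 1856, 941, 3051, 1925, 2847]

print(remap_ad(single, combined, ad1))    # expected AD in the combined gvcf
print(remap_pl(single, combined, pl1))    # matches the PLs actually written
print(remap_ad(single, ["A", "G"], ad1))  # expected AD in the final vcf

# Case 2: site 1:1247185, dropping <NON_REF> at genotyping
print(remap_ad(["AG", "A", "<NON_REF>"], ["AG", "A"], [13, 20, 0]))
```

So under these assumptions the AD values should have been 66,36,0,0 in the combined gvcf and 66,36 and 13,20 in the genotyped vcf.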
I have found multiple such cases here, with no errors or warnings in the logs. I also checked calls that I had made earlier on these samples, but in a smaller batch. There the AD values were correct, but there were plenty of other heterozygous calls with AD=X,0, which I haven't looked into more closely.
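For anyone who wants to screen their own calls for the same pattern, a rough sketch of the check is below. The helper name `het_with_zero_ad` is hypothetical, it works on the FORMAT and sample columns of a single VCF record as plain strings, and it assumes a FORMAT of GT:AD:DP:GQ:PL for the genotyped records shown above (the FORMAT column is not shown in those lines).

```python
def het_with_zero_ad(format_field, sample_field):
    """True if the call is heterozygous and a called allele has AD == 0."""
    values = dict(zip(format_field.split(":"), sample_field.split(":")))

    # Treat phased and unphased genotypes the same way.
    gt = values.get("GT", "./.").replace("|", "/").split("/")
    if "." in gt or len(set(gt)) < 2:
        return False  # no-call or homozygous

    ad_parts = values.get("AD", ".").split(",")
    if "." in ad_parts:
        return False  # AD missing, nothing to check

    ad = [int(x) for x in ad_parts]
    # Flag the call if any allele in the genotype has zero supporting reads.
    return any(int(a) < len(ad) and ad[int(a)] == 0 for a in gt)


fmt = "GT:AD:DP:GQ:PL"  # assumed FORMAT for the genotyped records
print(het_with_zero_ad(fmt, "0/1:66,0:102:99:1057,0,1856"))
print(het_with_zero_ad(fmt, "0/1:13,20:33:99:615,0,361"))
```

The first call corresponds to the genotyped record at 1:26707944 and is flagged; the second uses the AD I would have expected at 1:1247185 and is not.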
Are these bugs that have been fixed in 3.3? Or maybe my brain is not working properly today and I am missing something obvious?