Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Outputting incorrect AD fields
I am running the GATK 3.2-2 pipeline and I am getting some odd results from the genotype VCF stage. It appears that incorrect AD results are being output.
E.g. on the single sample gvcf I have:
GC G, 522.73 . BaseQRankSum=-0.716;ClippingRankSum=0.219;DP=38;MLEAC=1,0;MLEAF=0.500,0.00;MQ=66.41;MQ0=0;MQRankSum=1.681;ReadPosRankSum=-1.155 GT:AD:DP:GQ:PL:SB 0/1:20,18,0:38:99:560,0,633,620,687,1307:10,10,10,8
This agrees with what I see in IGV for this site (see attached image).
However, when I joint call this site along with other gvcfs, the output for that sample looks like:
i.e. It appears to not output the expected 20,18 - but it still seems to be making the correct genotype call so I am assuming it is just an output bug and not something that is affecting calling, but it would be good to know for sure. Either way it is problematic as it is affecting our downstream depth filtering. This isn't the only example of this happening in our data. I am trying to identify more by looking for occasions where the AD fields sum to less than DP, am I right in thinking this should never be the case as AD is unfiltered compared to the filtered DP?
Thanks a lot