Variants with AD 0,0 and DP 0
I was investigating why some of the variants in my VCF, produced with HaplotypeCaller, are missing some fields such as QD, DP, MQ, MQRankSum, and BaseQRankSum.
By searching this forum I understood why: some of these values need other values to be present before they can be calculated. For example, AD is needed to calculate QD, and so on.
I then realized that all the variants missing these fields have the same GT:AD:DP values of 1/1:0,0:0, so AD is always 0,0 and DP is 0.
I read about informative and uninformative reads; some posts suggest that AD does not include uninformative reads, but DP does. From the documentation of DepthPerSampleHC, it seems that DP in the FORMAT field only includes informative reads, like AD, whereas DP in the INFO field counts all unfiltered reads supporting that call.
Why is the DP tag missing from the INFO column, and present only in the FORMAT column, for some calls?
Why are these variants called if AD:DP is 0,0:0 and DP in the INFO column is missing?
Here is an example:
Qrob_H2.3_Sc0001210 5578 . G T 18.59 . AC=2;AF=1.00;AN=2;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;SOR=0.693 GT:AD:DP:GQ:PL 1/1:0,0:0:3:45,3,0
A call with complete info is:
Qrob_H2.3_Sc0001407 1861 . A G 739.78 . AC=1;AF=0.500;AN=2;BaseQRankSum=2.371;DP=22;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=32.87;MQRankSum=-1.787;QD=30.87;ReadPosRankSum=0.240;SOR=1.609 GT:AD:DP:GQ:PL 0/1:2,19:21:27:768,0,27
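To make the comparison between the two records concrete, here is a minimal Python sketch of how I would flag such calls. It assumes single-sample records that are whitespace-separated as pasted above (real VCF files are tab-separated, which `split()` also handles). The helper names `parse_record`, `has_zero_depth`, and `missing_info` are my own and are not part of GATK.

```python
# INFO annotations that were absent from the affected records in this post.
EXPECTED_INFO = ["QD", "DP", "MQ", "MQRankSum", "BaseQRankSum"]


def parse_record(line):
    """Split one single-sample VCF data line into INFO and sample dictionaries."""
    # Assumption: fields are separated by whitespace (tabs in a real VCF).
    fields = line.split()
    chrom, pos, _id, ref, alt, qual, _filter, info, fmt, sample = fields[:10]
    info_map = {}
    for item in info.split(";"):
        # Flag-type INFO entries have no "=", so their value stays empty.
        key, _, value = item.partition("=")
        info_map[key] = value
    # Pair each FORMAT key (GT, AD, DP, ...) with the sample's value.
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))
    return {"chrom": chrom, "pos": int(pos), "info": info_map, "sample": sample_map}


def has_zero_depth(record):
    """True when FORMAT DP is 0 and every AD count is 0."""
    ad = record["sample"].get("AD", "")
    dp = record["sample"].get("DP", "")
    return dp == "0" and all(count == "0" for count in ad.split(","))


def missing_info(record):
    """List the expected INFO annotations that the record does not carry."""
    return [key for key in EXPECTED_INFO if key not in record["info"]]


# The two records quoted in this post.
incomplete = (
    "Qrob_H2.3_Sc0001210 5578 . G T 18.59 . "
    "AC=2;AF=1.00;AN=2;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;SOR=0.693 "
    "GT:AD:DP:GQ:PL 1/1:0,0:0:3:45,3,0"
)
complete = (
    "Qrob_H2.3_Sc0001407 1861 . A G 739.78 . "
    "AC=1;AF=0.500;AN=2;BaseQRankSum=2.371;DP=22;ExcessHet=3.0103;FS=0.000;"
    "MLEAC=1;MLEAF=0.500;MQ=32.87;MQRankSum=-1.787;QD=30.87;"
    "ReadPosRankSum=0.240;SOR=1.609 "
    "GT:AD:DP:GQ:PL 0/1:2,19:21:27:768,0,27"
)

for line in (incomplete, complete):
    rec = parse_record(line)
    print(rec["chrom"], rec["pos"], has_zero_depth(rec), missing_info(rec))
```

On these two records, the first is flagged as zero depth with all five annotations missing, while the second is not flagged and is missing none of them.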
This is only one sample, and I know that is not enough; I am learning how to use the variant calling pipeline for the first time before I run it on all my data. It is just a test, but I would like to understand this.
I apologize if this is a repetitive question; I have searched the forums and got a few hints but did not obtain a satisfactory answer.