AD allele depth interpretation

Hello, I have a question about interpreting the AD field in a VCF generated by calling about 800 samples together.
The header defines it as:
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
and the forum further elaborates:
AD is the unfiltered allele depth, i.e. the number of reads that support each of the reported alleles. All reads at the position (including reads that did not pass the variant caller’s filters) are included in this number, except reads that were considered uninformative. Reads are considered uninformative when they do not provide enough statistical evidence to support one allele over another.

However, most of my variants have a depth of 500-2000x, yet the AD at a position may be as low as ref 4 + alt 4. I'm not sure how these values fit the definition: surely they should sum to approximately the total depth? Viewing the position in the individual BAMs in IGV confirms that many more reads carry the ref and alt alleles, so even if a lot of reads were being filtered out (which I doubt), I would expect higher values than these. Perhaps I am misunderstanding the definition; if so, how would I go about getting the number of reads that show ref/alt at the position of interest from the VCF file?

I've tried this using both UnifiedGenotyper in GATK3.8-1 and HaplotypeCaller in GATK4.0.4.0.
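
For reference, AD is declared in the header as a FORMAT field, so each sample column carries its own AD values (one per allele, ref first, because of Number=R), and a single sample's AD is not expected to add up to a depth measured across all samples. A minimal sketch of pulling the per-sample AD out of one VCF data line and summing it across samples is below. It assumes a tab-delimited, uncompressed record; the function names `per_sample_ad` and `total_ad` and the example record are hypothetical, made up for illustration and not taken from a real call set or from GATK itself.

```python
def per_sample_ad(vcf_line):
    """Return (alleles, depths) for one tab-delimited VCF data line.

    depths holds one entry per sample: a list of ints (ref first, then
    each alt in the order listed), or None when AD is missing.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    alleles = [fields[3]] + fields[4].split(",")   # REF, then ALT alleles
    keys = fields[8].split(":")                    # FORMAT column, e.g. GT:AD:DP
    if "AD" not in keys:
        return alleles, []
    ad_idx = keys.index("AD")

    depths = []
    for sample in fields[9:]:
        parts = sample.split(":")
        # Trailing FORMAT fields may be omitted, and "." marks a missing value.
        raw = parts[ad_idx] if ad_idx < len(parts) else "."
        if raw == ".":
            depths.append(None)
        else:
            depths.append([0 if x == "." else int(x) for x in raw.split(",")])
    return alleles, depths


def total_ad(depths, n_alleles):
    """Sum the per-sample AD values allele by allele, skipping missing samples."""
    totals = [0] * n_alleles
    for d in depths:
        if d is None:
            continue
        for i, value in enumerate(d[:n_alleles]):
            totals[i] += value
    return totals


# Hypothetical three-sample record, for illustration only.
EXAMPLE_LINE = (
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=22\tGT:AD:DP"
    "\t0/1:4,4:8\t0/0:10,0:10\t./.:.:."
)

alleles, depths = per_sample_ad(EXAMPLE_LINE)
for allele, count in zip(alleles, total_ad(depths, len(alleles))):
    print(allele, count)
```

Comparing the summed values against the depth seen in IGV for each sample separately, rather than against a site-wide figure, may help narrow down where the discrepancy comes from.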


  • Amir_Ariff Member
    I seem to have formatted out the ##FORMAT line, which should read:
    ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed"