Depth reported in DP and AD changes when VariantAnnotator is run


I am trying to filter some of my high-coverage samples on a minimum depth, and I have found that the values stored in the DP INFO field and the AD genotype tag change depending on whether or not I have run VariantAnnotator. The call I used for VariantAnnotator is:

java -jar GenomeAnalysisTK.jar -T VariantAnnotator -R ucsc.hg19.fasta -I example.bam --variant example.raw.vcf --out example.annotated.vcf -G StandardAnnotation -L example.raw.vcf -rf BadCigar -dcov 15000

Here are the differences for some test cases with HaplotypeCaller:

No MarkDuplicates, did IndelRealigner & BQSR, nightly build 12/04/2013

Annotated: DP=2745, AD=4,2729

Raw: DP=957, AD=1,907

MarkDuplicates, IndelRealigner and BQSR, nightly build 12/04/2013

Annotated: DP=20, AD=0,20

Raw: DP=10, AD=0,8

Raw BAM, nightly build 12/04/2013

Annotated: DP=2745, AD=4,2729

Raw: DP=868, AD=1,864

Raw BAM, version 2.4-9

Annotated: DP=2745, AD=4,2729

Raw: DP=616, AD=1,611

I suspect that VariantAnnotator is taking the depth information from the provided BAM and replacing the depth information reported by the variant caller. Which value better reflects the depth used to make a given variant call (i.e., which one should I use for hard filtering)?
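To make the comparison concrete, here is a minimal Python sketch of how the two depth values could be pulled out of a VCF record and checked against a minimum depth. It assumes a single-sample, tab-separated VCF line with an integer DP in the INFO column and an AD tag in the FORMAT column. The function names (`parse_depths`, `passes_min_depth`), the threshold, and the chromosome/position in the example records are made up for illustration; only the DP and AD values come from the first test case above. It does not settle which file's values to trust; it only shows how either field could drive the filter.

```python
def parse_depths(vcf_line):
    """Return (INFO DP, list of per-allele AD counts) for the first sample.

    Hypothetical helper: assumes a tab-separated, single-sample VCF data line
    whose FORMAT column includes an AD tag with integer values.
    """
    fields = vcf_line.rstrip("\n").split("\t")

    # INFO is column 8 (index 7): semicolon-separated KEY=VALUE pairs.
    info_dp = None
    for item in fields[7].split(";"):
        if item.startswith("DP="):
            info_dp = int(item[3:])

    # FORMAT (index 8) names the colon-separated sample values (index 9).
    fmt_keys = fields[8].split(":")
    sample = dict(zip(fmt_keys, fields[9].split(":")))
    ad = [int(x) for x in sample["AD"].split(",")]
    return info_dp, ad


def passes_min_depth(vcf_line, min_depth, field="DP"):
    """Check a record against min_depth using INFO DP or the sum of AD."""
    info_dp, ad = parse_depths(vcf_line)
    depth = info_dp if field == "DP" else sum(ad)
    return depth is not None and depth >= min_depth


if __name__ == "__main__":
    # Illustrative records: CHROM, POS, REF, ALT and QUAL are invented;
    # DP and AD are the values reported in the first test case.
    raw = "chr1\t100\t.\tA\tG\t50\t.\tDP=957\tGT:AD\t1/1:1,907"
    annotated = "chr1\t100\t.\tA\tG\t50\t.\tDP=2745\tGT:AD\t1/1:4,2729"
    for name, line in (("raw", raw), ("annotated", annotated)):
        dp, ad = parse_depths(line)
        print(name, "DP =", dp, "sum(AD) =", sum(ad),
              "passes min depth 1000:", passes_min_depth(line, 1000))
```

Running the same check on the raw and the annotated VCF shows how a single site can pass or fail a depth filter depending on which file is used.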

Thanks for your help!

Best Answer


  • nhvanlie (Member)

    Great, thank you for confirming! In this case, I suspect we are seeing more genuine duplication (because of small, targeted areas and high coverage) than PCR duplication, so I'll have to do more testing to determine whether I should run MarkDuplicates.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    That's a fair point, if you know the duplicates are legit. From those values it looks like the results are consistent either way.