Abnormally high per-sample depth after running GenotypeGVCFs

I first posted this question as a reply to something related (see https://gatkforums.broadinstitute.org/gatk/discussion/7318/what-is-the-significance-of-depth-across-all-samples-dp-in-info#latest) but realized it might not get visibility there.

After running GenotypeGVCFs, I'm seeing that the per-sample depth (DP in the FORMAT field) is consistently much higher than the depth reported for the same sample in the original gVCFs. For example:

From the VCF file, looking at sample 68148-2:
GT:AD:DP:GQ:PGT:PID:PL 0/0:522,0:522:0:.:.:0,0,13060

From the same site in 68148-2.g.vcf:
GT:DP:GQ:MIN_DP:PL 0/0:13:36:12:0,36,376
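
In case it helps anyone reproduce the comparison, here is a minimal Python sketch of how the two records can be lined up. The `parse_format` helper is only an illustration written for this post (it is not part of GATK or any VCF library), and the input strings are just the FORMAT keys and sample values copied from the two records above.

```python
def parse_format(format_keys, sample_values):
    """Pair colon-separated FORMAT keys with one sample's values.

    Hypothetical helper for illustration only; a real script would
    normally read these fields with a proper VCF parser.
    """
    return dict(zip(format_keys.split(":"), sample_values.split(":")))


# Sample 68148-2 in the joint-genotyped VCF (output of GenotypeGVCFs)
vcf_record = parse_format("GT:AD:DP:GQ:PGT:PID:PL",
                          "0/0:522,0:522:0:.:.:0,0,13060")

# Same site in the per-sample gVCF (68148-2.g.vcf)
gvcf_record = parse_format("GT:DP:GQ:MIN_DP:PL",
                           "0/0:13:36:12:0,36,376")

vcf_dp = int(vcf_record["DP"])
gvcf_dp = int(gvcf_record["DP"])

# AD is comma-separated, one count per allele (ref first)
ad_sum = sum(int(count) for count in vcf_record["AD"].split(","))

fold_increase = vcf_dp / gvcf_dp

print(f"VCF DP:  {vcf_dp} (sum of AD: {ad_sum})")
print(f"gVCF DP: {gvcf_dp} (MIN_DP: {gvcf_record['MIN_DP']})")
print(f"Fold increase: {fold_increase:.1f}x")
```

For this site the VCF DP equals the sum of the AD values, so the inflated number is at least internally consistent within the VCF record; it just doesn't match the gVCF.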

As you can see, the depth in the VCF file is WAY higher. This holds for all samples at this site and, at a glance, seems to be occurring at most of the sites in the VCF. Why might the depth values be so inflated after GenotypeGVCFs?

I don't know if this helps, but I checked a VCF produced by GenotypeGVCFs on a subset of my population, and I see the same inflated depths there. I also get the same result with GATK versions 3.7 and 3.8.

Thank you for your help!

