Abnormally high per-sample depth after running GenotypeGVCFs

I first posted this question as a reply to something related (see https://gatkforums.broadinstitute.org/gatk/discussion/7318/what-is-the-significance-of-depth-across-all-samples-dp-in-info#latest) but realized it might not get visibility there.

After running GenotypeGVCFs, I'm seeing that the per-sample depth (DP in the FORMAT field) in the output VCF is consistently much higher than the depth reported for the same sample in the original GVCFs. For example:

From the VCF file, looking at sample 68148-2:
GT:AD:DP:GQ:PGT:PID:PL 0/0:522,0:522:0:.:.:0,0,13060

From the same site in 68148-2.g.vcf:
GT:DP:GQ:MIN_DP:PL 0/0:13:36:12:0,36,376

As you can see, the depth in the VCF file (DP=522) is far higher than in the GVCF (DP=13). This holds for all samples at this site and, at a glance, seems to occur at most sites in the VCF. Why might the depth values be so inflated after GenotypeGVCFs?
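To make the comparison concrete, here is a minimal Python sketch that parses the two sample records quoted above and compares their DP values. The helper name `parse_sample` is made up for illustration, and the sketch assumes the FORMAT keys and the sample values are colon-separated in matching order, as in the lines above. It only checks the numbers; it does not explain the cause.

```python
def parse_sample(format_keys: str, sample_values: str) -> dict:
    """Pair colon-separated FORMAT keys with one sample's values (hypothetical helper)."""
    return dict(zip(format_keys.split(":"), sample_values.split(":")))


# Records copied from the post: the GenotypeGVCFs output and the per-sample GVCF.
joint = parse_sample("GT:AD:DP:GQ:PGT:PID:PL", "0/0:522,0:522:0:.:.:0,0,13060")
gvcf = parse_sample("GT:DP:GQ:MIN_DP:PL", "0/0:13:36:12:0,36,376")

joint_dp = int(joint["DP"])
gvcf_dp = int(gvcf["DP"])

# AD holds one read count per allele; summing it gives a cross-check against DP.
ad_total = sum(int(count) for count in joint["AD"].split(","))

ratio = joint_dp / gvcf_dp
print(f"VCF DP={joint_dp}, GVCF DP={gvcf_dp}, ratio={ratio:.1f}, sum(AD)={ad_total}")
```

For this site the VCF depth is roughly 40 times the GVCF depth, and the AD values in the VCF sum to the same 522, so the inflated number is not limited to the DP field.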

I don't know if this helps, but I checked a VCF produced by GenotypeGVCFs on a subset of my population, and it reports the same inflated depths. I also get the same result with GATK versions 3.7 and 3.8.

Thank you for your help!
