HaplotypeCaller reports DP one read lower than the actual depth
I ran HaplotypeCaller on a bunch of samples using the following commands:
java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -drf DuplicateRead -R hg19.fa -I SAMPLE.bam -o SAMPLE.g.vcf -L target_region.bed -ERC GVCF
java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R hg19.fa -V SAMPLE.g.vcf -o SAMPLE.hc.vcf
For many variants, the DP listed is one read lower than the actual depth. I loaded the BAM file in IGV and counted the reads manually (the same count appears when I hover over the coverage bar plot). The DepthOfCoverage tool also reports the correct depth:
java -jar GenomeAnalysisTK.jar -T DepthOfCoverage -drf DuplicateRead -R hg19.fa -I SAMPLE.bam --omitDepthOutputAtEachBase -o SAMPLE.coverage
Here is an example from hc.vcf:
chrX 49119876 . T C 531.77 . AC=2;AF=1.00;AN=2;DP=19;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=27.99;SOR=1.022 GT:AD:DP:GQ:PL 1/1:0,19:19:57:560,57,0
and from DepthOfCoverage:
chrX:49119876 20 20.00 20 20.00 21 21 21 100.0
So the depth agrees between DepthOfCoverage and the BAM file in IGV (20 reads), but it is one lower in hc.vcf (DP=19).
Has anybody else seen this issue, or does anyone know how to fix it? It is causing problems for variants that sit right at my threshold.
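To make the comparison reproducible across many sites, here is a minimal Python sketch that pulls the depth values out of the two example lines above and reports the difference. The function names are hypothetical, and the sketch assumes a single-sample VCF record and that the second column of the DepthOfCoverage line is the total coverage; it is not part of GATK.

```python
def parse_vcf_depths(line):
    """Return (INFO DP, FORMAT DP, sum of AD) for a single-sample VCF record."""
    fields = line.split()  # splits on tabs or spaces
    # INFO is column 8: semicolon-separated KEY=VALUE pairs
    info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
    # FORMAT keys (column 9) pair up with the sample values (column 10)
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    ad_sum = sum(int(x) for x in sample["AD"].split(","))
    return int(info["DP"]), int(sample["DP"]), ad_sum


def parse_coverage_depth(line):
    """Return the depth from a DepthOfCoverage line.

    Assumption: the second column holds the total coverage for the position.
    """
    return int(line.split()[1])


vcf_line = (
    "chrX 49119876 . T C 531.77 . "
    "AC=2;AF=1.00;AN=2;DP=19;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=27.99;SOR=1.022 "
    "GT:AD:DP:GQ:PL 1/1:0,19:19:57:560,57,0"
)
cov_line = "chrX:49119876 20 20.00 20 20.00 21 21 21 100.0"

info_dp, sample_dp, ad_sum = parse_vcf_depths(vcf_line)
cov_dp = parse_coverage_depth(cov_line)

print(f"VCF INFO DP={info_dp}, FORMAT DP={sample_dp}, sum(AD)={ad_sum}")
print(f"DepthOfCoverage depth={cov_dp}, difference={cov_dp - info_dp}")
```

For this site, all three VCF values (INFO DP, FORMAT DP, and the sum of AD) are 19, while DepthOfCoverage gives 20, so the missing read is absent from every depth field in the VCF rather than from just one of them.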
Thanks a lot in advance!