Unified Genotyper reports high QUAL at DP=1 invariant sites
I've noticed that when ploidy is set to one, the UnifiedGenotyper reports fairly high QUAL scores at sites where the depth of coverage is a single read and that read carries the reference allele. There's a VCF that exemplifies this at:
Pf3D7_01_v3 7045 . T . 32.95 PASS AC=0;AF=0.00;AN=2;DP=1;MQ=17.00;MQ0=0 GT:DP 0/0:1
Pf3D7_01_v3 7046 . G . 8.91 LowConfidence;LowQual AC=0;AF=0.00;AN=2;DP=1;MQ=17.00;MQ0=0 GT:DP 0/0:1
This behavior is somewhat surprising. I realize that the QUAL threshold can simply be raised to filter these sites, but I would expect their QUAL scores to be comparable to those of variant sites with the same coverage.
Another example is /home/unix/emoss/vcf/individual/deprecated/haploid/SenT001.08.vcf.gz, viewable with
zcat /home/unix/emoss/vcf/individual/deprecated/haploid/SenT001.08.vcf.gz | grep DP=1\; | grep PASS | head
Here again, the variable sites with one read get a failing score, while the invariant sites pass.
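For anyone who wants to check the same pattern in their own calls, here is a rough Python sketch that splits DP=1 records into invariant (ALT is `.`) and variant sites and tallies their QUAL and FILTER values. The function names `parse_dp1_sites` and `summarize` are made up for illustration and are not part of GATK, and the sketch assumes the INFO column carries a `DP=` tag as in the records above. The two sample records are the ones quoted earlier in this post.

```python
import re

# Matches DP=1 exactly as an INFO tag, so DP=10 or DP=11 are not picked up.
DP1 = re.compile(r"(?:^|;)DP=1(?:;|$)")

def parse_dp1_sites(lines):
    """Yield (kind, qual, filt) for every record whose INFO field has DP=1.

    kind is 'invariant' when ALT is '.', otherwise 'variant'.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.split()  # tolerates tabs or spaces between columns
        if len(fields) < 8:
            continue  # not a complete VCF record
        alt, qual, filt, info = fields[4], fields[5], fields[6], fields[7]
        if qual == "." or not DP1.search(info):
            continue
        kind = "invariant" if alt == "." else "variant"
        yield kind, float(qual), filt

def summarize(lines):
    """Return {kind: {'n': count, 'pass': PASS count, 'quals': [QUAL, ...]}}."""
    summary = {}
    for kind, qual, filt in parse_dp1_sites(lines):
        entry = summary.setdefault(kind, {"n": 0, "pass": 0, "quals": []})
        entry["n"] += 1
        entry["pass"] += int(filt == "PASS")
        entry["quals"].append(qual)
    return summary

if __name__ == "__main__":
    # The two records quoted earlier in this post.
    records = [
        "Pf3D7_01_v3 7045 . T . 32.95 PASS "
        "AC=0;AF=0.00;AN=2;DP=1;MQ=17.00;MQ0=0 GT:DP 0/0:1",
        "Pf3D7_01_v3 7046 . G . 8.91 LowConfidence;LowQual "
        "AC=0;AF=0.00;AN=2;DP=1;MQ=17.00;MQ0=0 GT:DP 0/0:1",
    ]
    for kind, stats in summarize(records).items():
        print(kind, stats["n"], stats["pass"], stats["quals"])
```

To run it over a compressed VCF instead of the inline records, the lines can be read with `gzip.open(path, "rt")` and passed straight to `summarize`.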
These were generated with the latest GATK.
Thanks for reading! This isn't a pressing issue to me now that I've identified it, but it has in the past led to some erroneous metrics of genome coverage and sequencing quality.