Higher concordance for reference-homozygous genotypes than for other genotypes
I sequenced 66 newt individuals twice, at a few hundred loci, to moderate coverage (45 on average). Now I want to check genotype concordance in various depth classes, to find out what coverage is enough to call genotypes properly. Using all genotypes at variant sites, I get a genotype concordance of 0.994 in the coverage-8 class. When I exclude genotypes in which both calls are Hom_REF, i.e. calculate 1 - Non-Reference Discrepancy, the result in the same coverage class is 0.966. The difference also holds for higher coverage classes.
So it seems that there is higher concordance between genotypes representing reference homozygotes.
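To make the two numbers concrete, here is a minimal sketch of how I understand the two metrics, assuming that overall concordance is matching pairs over all pairs, and that 1 - Non-Reference Discrepancy is the same ratio after dropping pairs where both calls are Hom_REF. The function name `concordance_metrics` and all counts are made up for illustration; they are not my real data and not GATK code.

```python
from collections import Counter

def concordance_metrics(pairs):
    """Return (overall concordance, 1 - NRD) for (call_a, call_b) genotype pairs.

    Assumed definitions (not taken from GATK source code):
      overall = matching pairs / all pairs
      1 - NRD = matching pairs excluding HOM_REF/HOM_REF
                / all pairs excluding HOM_REF/HOM_REF
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    concordant = sum(n for (a, b), n in counts.items() if a == b)
    hom_ref_both = counts[("HOM_REF", "HOM_REF")]

    overall = concordant / total
    # Concordant HOM_REF pairs are removed from numerator and denominator alike.
    one_minus_nrd = (concordant - hom_ref_both) / (total - hom_ref_both)
    return overall, one_minus_nrd

# Hypothetical counts for one coverage class (illustration only).
pairs = (
    [("HOM_REF", "HOM_REF")] * 800
    + [("HET", "HET")] * 120
    + [("HOM_VAR", "HOM_VAR")] * 60
    + [("HOM_REF", "HET")] * 12
    + [("HET", "HOM_VAR")] * 8
)

overall, one_minus_nrd = concordance_metrics(pairs)
print(f"overall concordance: {overall:.3f}")
print(f"1 - NRD:             {one_minus_nrd:.3f}")
```

With the same number of discordant pairs, removing the concordant Hom_REF pairs shrinks the denominator, so under these definitions 1 - NRD cannot be higher than the overall concordance. Whether that fully accounts for the gap in my data is what I am unsure about.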
Why is that?
BTW: I’m using the GATK UnifiedGenotyper with standard settings, except that mbq is set to 20 and pcr_error_rate to 1.0E-3, plus further filters: GQ < 20.0, MQRankSum < -12.5, QD < 2.0.