Higher concordance between genotypes representing reference homozygotes than between other genotypes
I sequenced 66 newt individuals twice at a few hundred loci to moderate coverage (45 on average). Now I want to check genotype concordance across depth classes to find out what coverage is sufficient to call genotypes reliably. Using all genotypes at variant sites, I get a genotype concordance of 0.994 at coverage 8. When I exclude genotype pairs in which both replicates of an individual are Hom_REF, i.e. calculate 1 - Non-Reference Discrepancy, the result for the same coverage class is 0.966. The difference also holds for higher coverage classes.
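To make the two numbers concrete, here is a minimal Python sketch of how I understand the two metrics. The function name `concordance_metrics`, the genotype labels, and the toy counts are made up for illustration; they are not GATK output and not my real data. It assumes each compared genotype pair is reduced to one of three classes (HOM_REF, HET, HOM_VAR).

```python
from collections import Counter

HOM_REF, HET, HOM_VAR = "HOM_REF", "HET", "HOM_VAR"


def concordance_metrics(pairs):
    """Return (overall concordance, 1 - NRD) for (rep1, rep2) genotype pairs."""
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotype pairs given")

    # Pairs where both replicates got the same genotype class.
    matches = sum(n for (a, b), n in counts.items() if a == b)
    # Pairs where both replicates are reference homozygotes.
    ref_ref = counts[(HOM_REF, HOM_REF)]

    overall = matches / total

    # 1 - NRD drops the HOM_REF/HOM_REF pairs from numerator and denominator.
    non_ref_total = total - ref_ref
    if non_ref_total == 0:
        raise ValueError("all pairs are HOM_REF/HOM_REF; NRD is undefined")
    one_minus_nrd = (matches - ref_ref) / non_ref_total

    return overall, one_minus_nrd


# Toy counts (hypothetical): 10 discordant pairs out of 1000.
toy_pairs = (
    [(HOM_REF, HOM_REF)] * 800
    + [(HET, HET)] * 120
    + [(HOM_VAR, HOM_VAR)] * 70
    + [(HET, HOM_REF)] * 6
    + [(HET, HOM_VAR)] * 4
)

overall, one_minus_nrd = concordance_metrics(toy_pairs)
print(f"{overall:.3f} {one_minus_nrd:.3f}")  # prints "0.990 0.950"
```

In this toy case the same 10 discordant pairs are divided by 1000 pairs for the overall concordance but by only 200 pairs for 1 - NRD, so the second number is lower even though nothing else changed.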
So it seems that there is higher concordance between genotypes representing reference homozygotes.
Why is that?
BTW: I’m using the GATK UnifiedGenotyper with standard settings, except that mbq is set to 20 and pcr_error_rate to 1.0E-3, followed by the filters GQ < 20.0, MQRankSum < -12.5, and QD < 2.0.