Higher concordance between reference-homozygote genotypes than between other genotypes

Hi,
I sequenced 66 newt individuals twice at a few hundred loci to moderate coverage (45x on average). Now I want to check genotype concordance in various depth classes to find out what coverage is enough to call genotypes reliably. Using all genotypes at variant sites gives a genotype concordance of 0.994 at coverage 8, whereas excluding genotype pairs in which both replicates are HOM_REF, i.e. calculating 1 - NRD (Non-Reference Discrepancy), gives 0.966 in the same coverage class. The difference also holds for higher coverage classes.
So it seems that there is higher concordance between genotypes representing reference homozygotes.
Why is it so?
BTW: I’m using the GATK UnifiedGenotyper with standard settings, except mbq set to 20 and pcr_error_rate set to 1.0E-3, followed by the filters GQ < 20.0, MQRankSum < -12.5 and QD < 2.0.

Thanks!
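
The two metrics compared in the question can be sketched as follows. This is a minimal illustration, assuming a 3x3 table of genotype-pair counts between the two replicates; the names `pair_counts`, `overall_concordance` and `non_reference_discrepancy` are hypothetical and the counts are invented, not taken from the newt data.

```python
# Genotype classes: 0 = HOM_REF, 1 = HET, 2 = HOM_VAR.
# pair_counts[a][b] = number of genotype pairs called `a` in replicate 1
# and `b` in replicate 2. These counts are made up for illustration.
pair_counts = [
    [9000, 20, 2],
    [25, 600, 5],
    [3, 6, 339],
]


def overall_concordance(counts):
    """Fraction of all genotype pairs in which both replicates agree."""
    total = sum(sum(row) for row in counts)
    matches = sum(counts[g][g] for g in range(3))
    return matches / total


def non_reference_discrepancy(counts):
    """Discordant pairs divided by all pairs except HOM_REF/HOM_REF."""
    total_without_ref_ref = sum(sum(row) for row in counts) - counts[0][0]
    mismatches = sum(
        counts[a][b] for a in range(3) for b in range(3) if a != b
    )
    return mismatches / total_without_ref_ref


print("overall concordance:", overall_concordance(pair_counts))
print("1 - NRD:", 1 - non_reference_discrepancy(pair_counts))
```

The number of discordant pairs is identical in both metrics; only the denominator changes. Dropping the HOM_REF/HOM_REF cell removes agreeing pairs only, so 1 - NRD can never exceed the overall concordance, and the gap grows with the share of pairs that are HOM_REF in both replicates.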

Answers

  • Piotr (Member)

    Hi Valentin,
    Thank you for the fast answer! Regarding the prior: since the genotype fields contain PL rather than GP, I thought that no prior was applied when calling genotypes. I wonder whether this prior might also influence the probability of calling HET when calling SNPs in a hybrid zone, where I have a lot of HOM_REF and HOM_VAR?
    Yes, the effect is less pronounced in the whole data set (NRD=0.012). I haven’t checked the concordance for HET and HOM_VAR separately, but I’m also curious, so I will check it soon and let you know.
    Thanks!

  • Piotr (Member)

    Hi Valentin,
    I checked the concordance in my data set starting from DP=8, for biallelic positions only. The concordance is HET=0.985, HOM_VAR=0.996 and HOM_REF=0.999. So it seems that heterozygotes are the hardest to call concordantly, and some prior effect is still visible. Nevertheless, the concordance seems to be pretty high.
    Cheers!
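
A per-class breakdown like the one in the last reply could be computed along these lines. The thread does not say how the per-class values were defined, so this sketch makes an assumption: concordance for a class is the number of pairs where both replicates call that class, divided by the number of pairs where at least one replicate does. The helper name `per_class_concordance` is hypothetical and the counts are invented.

```python
CLASSES = ["HOM_REF", "HET", "HOM_VAR"]

# pair_counts[a][b] = pairs called CLASSES[a] in replicate 1 and
# CLASSES[b] in replicate 2. Invented counts, for illustration only.
pair_counts = [
    [9000, 20, 2],
    [25, 600, 5],
    [3, 6, 339],
]


def per_class_concordance(counts):
    """Assumed definition: both replicates call the class / either one does."""
    result = {}
    for g, name in enumerate(CLASSES):
        both = counts[g][g]
        in_replicate_1 = sum(counts[g])                        # row total
        in_replicate_2 = sum(counts[r][g] for r in range(3))   # column total
        either = in_replicate_1 + in_replicate_2 - both
        result[name] = both / either
    return result


for name, value in per_class_concordance(pair_counts).items():
    print(f"{name}: {value:.3f}")
```

Other definitions are possible, for example conditioning on the call in one replicate only, and they can give different values for the same table, so per-class numbers are only comparable when the definition is the same.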
