Inbreeding coefficient calculation documentation available?

Is there a description available for the inbreeding coefficient calculation used for variant recalibration?

An overview is found here:
https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_annotator_InbreedingCoeff.php

and that page points to here for the details for the method:
document on statistical tests

but there is no description of the inbreeding coefficient (just the rank sum test).

Issue · Github (by Sheila): issue number 931, state closed, closed by chandrans

Answers

  • Sheila, Broad Institute (Member, Broadie admin)

    @jfarrell
    Hi,

    I did not get the chance to finish up the statistics documentation for the annotations. I will look into inbreeding coefficient and get back to you.

    -Sheila

  • Sheila, Broad Institute (Member, Broadie admin)

    @jfarrell
    Hi,

    From @gauthier:
    The InbreedingCoeff is 1 - (# observed hets)/(# expected hets), where we estimate the population allele frequency from the sample genotypes. The number of expected hets comes from the random-mating assumption and the proportions of ref and alt alleles in the population (p and q): the expected het frequency is 2 × Prob(ref from one parent) × Prob(alt from the other parent) = 2pq, and the expected number of hets is that frequency multiplied by the number of samples. (The factor of two covers the two outcomes: alt from mom or alt from dad.) Negative values of InbreedingCoeff mean there are too many hets and suggest a site with bad mapping, which is why we filter out variants with the most negative InbreedingCoeffs for ExAC. (Positive values of InbreedingCoeff could arise from admixture of different ethnic populations as in ExAC, e.g. if Finns are all hom var but Taiwanese are all hom ref.)

    Also, you can refer to this paper another developer pointed out: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1199378/

    I will fix the documentation asap.

    -Sheila

  • jfarrell (Member)

    Thanks! How robust is this filter to the assumption that the samples are unrelated? Would it still work for a large number of family samples? Or would it be best to calculate it based only on the unrelated individuals in the sequenced sample?
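
For readers who want to see the arithmetic, the formula quoted from @gauthier above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it works from hard genotype counts at a single biallelic site, and the function name `inbreeding_coeff` and its parameters are made up for this example. It is not GATK's implementation, which may differ in details such as how genotype uncertainty is handled.

```python
def inbreeding_coeff(n_hom_ref, n_het, n_hom_var):
    """Return 1 - (observed hets / expected hets) for one biallelic site.

    Hypothetical helper for illustration only; inputs are counts of
    samples with each hard-called genotype.
    """
    n_samples = n_hom_ref + n_het + n_hom_var
    if n_samples == 0:
        raise ValueError("no genotyped samples at this site")

    # Estimate the ref allele frequency p from the sample genotypes:
    # each hom-ref sample carries two ref alleles, each het carries one.
    p = (2 * n_hom_ref + n_het) / (2 * n_samples)
    q = 1.0 - p

    # Under random mating the expected het frequency is 2pq.
    expected_hets = 2 * p * q * n_samples
    if expected_hets == 0:
        return None  # monomorphic site: the ratio is undefined

    return 1.0 - n_het / expected_hets


# Genotype counts in Hardy-Weinberg proportions give 0.
print(inbreeding_coeff(25, 50, 25))
# Every sample het (too many hets) gives a negative value.
print(inbreeding_coeff(0, 100, 0))
# Two subpopulations fixed for different alleles give a positive value.
print(inbreeding_coeff(50, 0, 50))
```

The three calls mirror the cases in the answer: counts matching random mating, an excess of hets as might be seen at a badly mapped site, and a deficit of hets as might arise from admixture.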
