Inbreeding coefficient calculation documentation available?

Is there a description available for the inbreeding coefficient calculation used for variant recalibration?

An overview is found here:
https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_annotator_InbreedingCoeff.php

and that page points to here for the details for the method:
document on statistical tests

but there is no description of the inbreeding coefficient (just the rank sum test).

Issue · Github
by Sheila

Issue Number: 931
State: closed
Closed By: chandrans

Answers

  • Sheila, Broad Institute. Member, Broadie, Moderator admin

    @jfarrell
    Hi,

    I did not get the chance to finish up the statistics documentation for the annotations. I will look into inbreeding coefficient and get back to you.

    -Sheila

  • Sheila, Broad Institute. Member, Broadie, Moderator admin

    @jfarrell
    Hi,

    From @gauthier:
    The InbreedingCoeff is 1 - (# observed hets)/(# expected hets), where we estimate the population allele frequency from the sample genotypes. The number of expected hets comes from the random mating assumption and the proportion of ref and alt alleles in the population, so the expected het frequency is just 2 * Prob(ref from parent1) * Prob(alt from parent2) = 2pq. (The two is for the two outcomes: alt from mom or alt from dad.) Negative values of InbreedingCoeff mean we have too many hets and suggest a site with bad mapping, which is why we filter out variants with the most negative InbreedingCoeffs for ExAC. (Positive values of InbreedingCoeff could arise from admixture of different ethnic populations as in ExAC, e.g. Finns are all hom var but Taiwanese are all hom ref.)

    Also, you can refer to this paper another developer pointed out: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1199378/

    I will fix the documentation asap.

    -Sheila

  • jfarrell Member ✭✭

    Thanks! How robust is this filter to the assumption that the samples are unrelated? Would it still work for a large number of family samples, or would it be best calculated from only the unrelated individuals in the sequenced sample?
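
As a rough illustration of the formula quoted from @gauthier in the answer above, here is a minimal Python sketch that computes 1 - (observed hets)/(expected hets) from hard genotype counts at one biallelic site. The function name `inbreeding_coeff` and its arguments are hypothetical and chosen for illustration; this is not GATK's code, and GATK's own annotation may differ in details such as how it handles genotype uncertainty.

```python
def inbreeding_coeff(n_hom_ref, n_het, n_hom_var):
    """Return 1 - observed_hets / expected_hets for one biallelic site.

    Hypothetical helper for illustration only; inputs are counts of
    hard-called genotypes, which is a simplifying assumption.
    Returns None when the site is monomorphic (no hets expected).
    """
    n_samples = n_hom_ref + n_het + n_hom_var
    if n_samples == 0:
        raise ValueError("need at least one called genotype")

    # Estimate allele frequencies from the sample genotypes.
    p = (2 * n_hom_ref + n_het) / (2 * n_samples)  # ref allele frequency
    q = 1.0 - p                                    # alt allele frequency

    # Under random mating the expected het frequency is 2pq.
    expected_hets = 2 * p * q * n_samples
    if expected_hets == 0:
        return None

    return 1.0 - n_het / expected_hets


if __name__ == "__main__":
    # Genotype counts matching random-mating expectations.
    print(inbreeding_coeff(25, 50, 25))
    # Every sample is het: more hets than expected, so the value is negative.
    print(inbreeding_coeff(0, 100, 0))
    # Two groups fixed for different alleles (the admixture case): no hets.
    print(inbreeding_coeff(50, 0, 50))
```

With these inputs, a site at random-mating proportions gives a value near zero, an all-het site gives a negative value, and a mix of two groups fixed for opposite alleles gives a positive value, matching the interpretation in the quoted answer.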
