Inbreeding coefficient calculation documentation available?

Is there a description available for the inbreeding coefficient calculation used for variant recalibration?

and that page points to here for the details for the method:
document on statistical tests

but there is no description of the inbreeding coefficient (just the rank sum test).

GitHub issue #931 (April 2015, by Sheila). State: closed. Closed by: chandrans.

@jfarrell
Hi,

I did not get the chance to finish up the statistics documentation for the annotations. I will look into the inbreeding coefficient and get back to you.

-Sheila

@jfarrell
Hi,

From @gauthier:
The InbreedingCoeff is 1 - (# observed hets)/(# expected hets), where we estimate the population allele frequency from the sample genotypes. The number of expected hets comes from the random mating assumption and the proportion of ref and alt alleles in the population, so the expected het frequency is just 2 * Prob(ref from one parent) * Prob(alt from the other parent) = 2pq, and the expected count is 2pq times the number of samples. (The two is for the two outcomes: alt from mom or alt from dad.) Negative values of InbreedingCoeff mean we have too many hets and suggest a site with bad mapping, which is why we filter out variants with the most negative InbreedingCoeffs for ExAC. (Positive values of IC could arise from admixture of different ethnic populations as in ExAC, e.g. if Finns are all hom var but Taiwanese are all hom ref.)
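
To make the formula quoted above concrete, here is a minimal Python sketch that computes 1 - (observed hets)/(expected hets) from hard genotype counts at one biallelic site. The function name `inbreeding_coeff` and its arguments are made up for illustration, and this is not the GATK implementation, which may differ in detail (for example in how uncertain genotypes are handled). It only assumes random mating and allele frequencies estimated from the samples themselves.

```python
import math


def inbreeding_coeff(n_hom_ref, n_het, n_hom_var):
    """Illustrative helper (not GATK code): 1 - observed hets / expected hets."""
    n = n_hom_ref + n_het + n_hom_var
    if n == 0:
        raise ValueError("no genotyped samples")
    # Estimate allele frequencies from the sample genotypes:
    # each hom-ref sample carries two ref alleles, each het carries one.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    # Under random mating the expected het fraction is 2pq.
    expected_hets = 2 * p * q * n
    if expected_hets == 0:
        return math.nan  # monomorphic site: the ratio is undefined
    return 1.0 - n_het / expected_hets


# Genotypes in Hardy-Weinberg proportions give a value of 0.
print(inbreeding_coeff(25, 50, 25))
# Every sample het (too many hets) gives the most negative value.
print(inbreeding_coeff(0, 100, 0))
# Two pooled groups, one all hom ref and one all hom var, give a positive value.
print(inbreeding_coeff(50, 0, 50))
```

The last call mirrors the admixture case described above: pooling a group that is entirely hom ref with one that is entirely hom var leaves no observed hets even though 2pq predicts many, so the coefficient comes out positive.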

Also, you can refer to this paper another developer pointed out: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1199378/

I will fix the documentation asap.

-Sheila

Thanks! How robust is this filter to the assumption that the samples are unrelated? Would it still work for a large number of family samples? Or is it best calculated using only the unrelated individuals in the sequenced sample?