Inbreeding coefficient calculation documentation available?

Is there a description available for the inbreeding coefficient calculation used for variant recalibration?

An overview is found here:

and that page points here for the details of the method:
document on statistical tests

but there is no description of the inbreeding coefficient (just the rank sum test).

  Sheila Broad Institute


    I did not get the chance to finish up the statistics documentation for the annotations. I will look into inbreeding coefficient and get back to you.


  Sheila Broad Institute


    From @gauthier:
    The InbreedingCoeff is 1 - (# observed hets)/(# expected hets), where we estimate the population allele frequency from the sample genotypes. The number of expected hets comes from the random mating assumption and the proportion of ref and alt alleles in the population: the expected het fraction is 2 * Prob(ref from one parent) * Prob(alt from the other parent) = 2pq, and multiplying that by the number of samples gives the expected count. (The factor of two is for the two outcomes: alt from mom or alt from dad.) Negative values of InbreedingCoeff mean we have too many hets and suggest a site with bad mapping, which is why we filter out variants with the most negative InbreedingCoeffs for ExAC. (Positive values of InbreedingCoeff could arise from admixture of different ethnic populations as in ExAC, e.g. Finns are all hom var but Taiwanese are all hom ref.)
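For illustration, here is a minimal Python sketch of the formula above, computed from hard genotype counts at a single biallelic site. The function name `inbreeding_coeff` and its arguments are hypothetical, and this is not the GATK implementation, which may differ in details such as how uncertain genotypes are handled.

```python
def inbreeding_coeff(n_hom_ref, n_het, n_hom_var):
    """Return 1 - (observed hets)/(expected hets) for one biallelic site.

    Hypothetical helper for illustration only; it assumes hard genotype
    calls and unrelated samples.
    """
    n_samples = n_hom_ref + n_het + n_hom_var
    if n_samples == 0:
        raise ValueError("at least one called genotype is required")

    # Estimate the alt allele frequency q from the sample genotypes:
    # each het carries one alt allele, each hom var carries two.
    q = (n_het + 2 * n_hom_var) / (2 * n_samples)
    p = 1.0 - q

    # Under random mating the expected het fraction is 2pq.
    expected_hets = 2 * p * q * n_samples
    if expected_hets == 0:
        return None  # monomorphic site: the ratio is undefined

    return 1.0 - n_het / expected_hets


if __name__ == "__main__":
    print(inbreeding_coeff(25, 50, 25))   # genotypes in Hardy-Weinberg proportions
    print(inbreeding_coeff(0, 100, 0))    # every sample het: excess of hets
    print(inbreeding_coeff(50, 0, 50))    # two fixed subpopulations: no hets
```

The three calls correspond to the cases described above: genotype counts that match the random mating expectation, an excess of hets (negative value), and a mixture of two populations fixed for different alleles (positive value).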

    Also, you can refer to this paper another developer pointed out:

    I will fix the documentation asap.


  jfarrell

    Thanks! How robust is this filter to the assumption that the samples are unrelated? Would it still work for a large number of family samples? Or would it be best to calculate it based only on the unrelated individuals in the sequenced sample?

