Inbreeding coefficient

Hi GATK team,
I have a question about VQSR. For "VariantRecalibrator" I used the following annotations and settings: -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an InbreedingCoeff -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0. For "ApplyRecalibration" I used --ts_filter_level 99.0. My exome dataset has more than 200 samples, but the samples belong to at least 15 families. Was it wrong to use InbreedingCoeff as an annotation, and how much could it affect the final number of PASS variants?
Thanks

Answers

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @Sheila_25
    Hi,

    Nice username :smile:

    Have a look at this thread and the documentation.

    There is also a new annotation called ExcessHet that you may find useful. But please note that this will still violate some assumptions of Hardy-Weinberg equilibrium (HWE).

    -Sheila

  • Sheila_25, Member
    edited November 2016

    Hi Sheila, thanks for your answer. With the annotations above I obtained more than 700,000 PASS variants for 280 samples (840,000 variants in total, roughly 120,000 variants per sample). Looking at the SNP plots (InbreedingCoeff vs MQ), I noticed that InbreedingCoeff is around 0 for many variants, but they are filtered out because of MQ < 50. Is there a way to quantify how many variants have been filtered because of InbreedingCoeff?

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @Sheila_25
    Hi,

    There is an annotation called "culprit" in the output of VQSR. The culprit annotation is present for all sites and tells you the annotation that had the worst score for the site. To find how many sites failed because of InbreedingCoeff, you can select the failing sites from your VCF that have culprit=InbreedingCoeff.

    -Sheila
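
For anyone wanting to try the culprit approach described above, here is a minimal sketch in plain Python (no GATK tooling involved). It assumes an uncompressed, tab-delimited VCF in which failing sites carry a FILTER value other than PASS or "." and the INFO column contains a `culprit=` entry. The function name `culprit_counts` and the example records are hypothetical and made up purely for illustration.

```python
from collections import Counter


def culprit_counts(vcf_lines):
    """Tally the 'culprit' INFO value over records whose FILTER is not PASS."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        filt, info = fields[6], fields[7]
        # Keep only failing sites: FILTER is neither PASS nor unset (".").
        if filt in ("PASS", "."):
            continue
        for entry in info.split(";"):
            key, _, value = entry.partition("=")
            if key == "culprit":
                counts[value] += 1
                break
    return counts


# Hypothetical records for illustration only (not real data).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\tVQSLOD=5.1;culprit=MQ",
    "1\t200\t.\tC\tT\t30\tVQSRTrancheSNP99.00to99.90\tVQSLOD=-1.2;culprit=InbreedingCoeff",
    "1\t300\t.\tG\tA\t20\tVQSRTrancheSNP99.90to100.00\tVQSLOD=-8.0;culprit=MQ",
]
counts = culprit_counts(example)
print(counts["InbreedingCoeff"])
```

To run it on a real file, pass an open file handle instead of the list, for example `culprit_counts(open("recalibrated.vcf"))`, where the file name is a placeholder; a gzipped VCF would need `gzip.open(path, "rt")` instead.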
