Inbreeding coefficient

Hi GATK team,
I have a question about VQSR. For VariantRecalibrator I used the following annotations and settings: -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an InbreedingCoeff -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0; for ApplyRecalibration I used --ts_filter_level 99.0. I have an exome dataset of more than 200 samples, but the samples belong to at least 15 families. I wonder whether I was wrong to use InbreedingCoeff as an annotation, and how much it could affect the final number of PASS variants.
Thanks

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Sheila_25
    Hi,

    Nice username :smile:

    Have a look at this thread and the documentation.

    There is also a new annotation called ExcessHet that you may find useful. But please note that this will still violate some assumptions of HWE (Hardy-Weinberg equilibrium).

    -Sheila

  • Sheila_25 (Member)
    edited November 2016

    Hi Sheila, thanks for your answer. By applying the above annotations I obtained more than 700,000 PASS variants for 280 samples (total number of variants 840,000, with roughly 120,000 variants per sample). Checking the SNP plots (InbreedingCoeff vs MQ), I noticed that InbreedingCoeff is around 0 for many variants, but they are filtered out because of MQ < 50. Is there a way to quantify how many variants have been filtered because of InbreedingCoeff?

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Sheila_25
    Hi,

    There is an annotation called "culprit" in the output of VQSR. The culprit annotation is present for all sites and tells you the annotation that had the worst score for the site. To find how many sites failed because of InbreedingCoeff, you can select the failing sites from your VCF that have culprit=InbreedingCoeff.

    -Sheila
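
The counting step described in the answer above can be sketched in plain Python. This is a minimal illustration, not a GATK tool: the function name `count_culprits` is hypothetical, and it assumes an uncompressed, tab-delimited VCF in which failing sites carry a FILTER value other than PASS or "." and the INFO column contains a `culprit=<annotation>` entry, as in VQSR output.

```python
from collections import Counter


def count_culprits(vcf_lines, failing_only=True):
    """Tally the VQSR `culprit` INFO value across VCF records.

    vcf_lines: any iterable of VCF text lines (for example an open file).
    failing_only: if True, skip records whose FILTER is PASS or "."
    """
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        filter_value, info = fields[6], fields[7]
        if failing_only and filter_value in ("PASS", "."):
            continue
        # INFO is a semicolon-separated list of KEY=VALUE entries and flags.
        for entry in info.split(";"):
            key, _, value = entry.partition("=")
            if key == "culprit":
                counts[value] += 1
                break
    return counts
```

Passing an open file handle for the recalibrated VCF to `count_culprits` returns a `Counter`, so indexing it with `"InbreedingCoeff"` gives the number of failing sites whose worst-scoring annotation was InbreedingCoeff. For a gzipped VCF, opening the file with `gzip.open(path, "rt")` should work the same way.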
