gnomAD in VQSR

Hi,
I am not sure why no one asked this before but I need help please as I couldn't find sufficient info:
Is it recommended to use the gnomAD variant database as a known, training and truth set with the VariantRecalibrator tool?
If yes, what is the proper prior value? And is it OK to use the same value in INDEL mode?
Also, in our setup the sequencing extends just beyond the exons. Is it appropriate to use the gnomAD exome variant database, or is it better to use the genome variant database?
I appreciate your help
Best regards
Nawar

Best Answer

  • bhanuGandham, Cambridge MA (admin)
    Accepted Answer

    Hi @NawarDalila

    1) VQSR was around before gnomAD, so it hasn't been updated to include it (yet).
    2) It's not clear how much using gnomAD would improve the results. So far, gnomAD itself has been filtered with VQSR, so using it as a VQSR resource would introduce circular logic into the creation of the gnomAD call sets (though this was not the case for the most recent gnomAD release). Overall, it just hasn't been shown that using gnomAD would improve the VQSR output enough to justify working it into the process.
    3) I get the impression that at some point it may be included, but we're not there just yet and have not tested this to make any recommendations.

Answers

  • NawarDalila (Member)

    Thank you very much, @bhanuGandham. That was exactly what I needed to know.
    Along the same lines but for a different tool, and just to be sure:
    I am using the BaseRecalibrator tool with --known-sites dbSNP as well as --known-sites gnomAD. Would that be wrong?
    Best/Nawar

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator, admin)

    @NawarDalila

    Again, if that is not recommended, it is only because we have not tested it on our end. You could try it yourself and see how the results change. I would love to hear what those results look like.
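
One way to set up the comparison suggested above is to build two BaseRecalibrator command lines that differ only in their --known-sites arguments. The following is a minimal sketch, not a tested recommendation: the helper name `base_recalibrator_cmd` and all file names are hypothetical placeholders, and the commands are only printed, not executed.

```python
from typing import List, Sequence


def base_recalibrator_cmd(
    bam: str,
    reference: str,
    known_sites: Sequence[str],
    output_table: str,
) -> List[str]:
    """Build a `gatk BaseRecalibrator` argument list (nothing is executed here)."""
    if not known_sites:
        raise ValueError("BaseRecalibrator needs at least one --known-sites file")
    cmd = ["gatk", "BaseRecalibrator", "-I", bam, "-R", reference]
    # --known-sites may be given several times, once per resource file.
    for vcf in known_sites:
        cmd += ["--known-sites", vcf]
    cmd += ["-O", output_table]
    return cmd


# Hypothetical file names; replace them with your own resources.
DBSNP = "dbsnp.vcf.gz"
GNOMAD = "gnomad.vcf.gz"

# Run 1: dbSNP only. Run 2: dbSNP plus gnomAD.
baseline = base_recalibrator_cmd(
    "sample.bam", "ref.fasta", [DBSNP], "recal_dbsnp.table"
)
with_gnomad = base_recalibrator_cmd(
    "sample.bam", "ref.fasta", [DBSNP, GNOMAD], "recal_dbsnp_gnomad.table"
)

print(" ".join(baseline))
print(" ".join(with_gnomad))
```

The two resulting recalibration tables could then be compared, for example with GATK's AnalyzeCovariates tool, to see whether adding gnomAD changes the recalibration noticeably.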
