Using a SNP database of known variants from RAD-seq for BQSR on whole-genome data

Is it a bad idea to use a set of known variants called from RAD-seq data as the known-sites input to BQSR for whole-genome resequencing data? From the tool description, my understanding is that any variant in the whole-genome data that is not in the known-sites set is treated as a sequencing error when BQSR models the association between errors and covariates such as genomic context and machine cycle, and that BQSR then adjusts quality scores based on those models. So, if these novel variants are real rather than sequencing errors, and they are not associated with any of the putative error covariates, will their quality scores be left essentially unchanged? Or will novel variants have their quality scores downgraded simply because they are assumed to be errors, even when they show no association with the covariates?
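
To make my mental model concrete, here is a toy Python sketch of how I understand the recalibration step. It is not GATK's actual code: the function names, the single covariate bin, the read counts, and the +1/+2 smoothing are all my own assumptions for illustration. The idea is that mismatches at positions outside the known-sites set are counted as errors per covariate bin, and the empirical quality is then computed for the bin as a whole rather than for individual bases.

```python
import math
from collections import defaultdict


def phred(error_rate):
    """Convert an error probability to a Phred-scaled quality."""
    return -10.0 * math.log10(error_rate)


def empirical_qualities(observations, known_sites):
    """Toy per-bin empirical quality (hypothetical helper, not a GATK API).

    observations: iterable of (position, covariate_bin, is_mismatch).
    known_sites: set of positions to skip (masked as known variants).
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [mismatches, total bases]
    for position, covariate_bin, is_mismatch in observations:
        if position in known_sites:
            continue  # bases at known variant sites are not counted at all
        counts[covariate_bin][1] += 1
        if is_mismatch:
            counts[covariate_bin][0] += 1
    # Assumed smoothing (+1 / +2) so that empty or perfect bins stay finite.
    return {b: phred((m + 1) / (n + 2)) for b, (m, n) in counts.items()}


BIN = ("reported_Q30", "cycle_10")  # a single made-up covariate bin

# 10,000 ordinary bases in the bin, 10 of which are true sequencing errors.
background = [(i, BIN, i < 10) for i in range(10_000)]

# One real heterozygous variant at position 50,000 covered by 30 reads,
# 15 of which carry the non-reference allele.
variant = [(50_000, BIN, r < 15) for r in range(30)]

observations = background + variant

q_masked = empirical_qualities(observations, known_sites={50_000})[BIN]
q_unmasked = empirical_qualities(observations, known_sites=set())[BIN]

print(f"variant in known sites:     Q{q_masked:.1f}")
print(f"variant missing from known: Q{q_unmasked:.1f}")
```

In this toy case the bin's empirical quality is about Q29.6 when the variant is masked and about Q25.9 when it is not. If the sketch reflects what the tool does, the effect of a missing known site would be spread over every base that shares the bin rather than applied only to the variant bases, and it would shrink as the bin collects more observations. Is that the right way to think about it?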

Thanks in advance for your advice.

