BQSR Bootstrap: Combine hard-filtered SNP and Indel VCFs?


I'm trying to run BQSR on mouse WES tumor-normal data following [1-6]. I'm on the first round of bootstrapping a set of known sites to use for BQSR. I have just separated the raw SNP and indel variants from the rawVariants.vcf produced by HaplotypeCaller and hard-filtered each set according to [6], giving knownSNPSites1.vcf and knownIndelSites1.vcf. My question is this: do I run BaseRecalibrator with just one of these files (SNP or indel)? Do I combine them into a single VCF first? Or do I run BaseRecalibrator once and pass both files, i.e. --known-sites="/path/knownSNPSites1.vcf" --known-sites="/path/knownIndelSites1.vcf"?

I tried searching but couldn't find an exact answer. Thanks in advance.
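To make the last option concrete, here is a minimal sketch of what "two instances of --known-sites" would look like as a single GATK4 BaseRecalibrator call. It only assembles and prints the command; it does not run GATK. The helper name `build_base_recalibrator_cmd`, the reference and BAM paths, and the output table name are all hypothetical placeholders; only the two known-sites file names come from the question. The flags `-R`, `-I`, `--known-sites` and `-O` are the GATK4 spellings (GATK 3.x, as documented in [3], uses `-knownSites` instead), so treat this as an illustration under those assumptions rather than a verified recipe.

```python
import shlex


def build_base_recalibrator_cmd(reference, bam, known_sites, output_table):
    """Assemble a GATK4 BaseRecalibrator command line as a list of arguments.

    Hypothetical helper for illustration; it builds the command but never runs it.
    """
    if not known_sites:
        raise ValueError("BaseRecalibrator needs at least one --known-sites file")
    cmd = ["gatk", "BaseRecalibrator", "-R", reference, "-I", bam]
    # Repeat --known-sites once per VCF, so the SNP and indel files are
    # passed side by side in one invocation.
    for vcf in known_sites:
        cmd += ["--known-sites", vcf]
    cmd += ["-O", output_table]
    return cmd


cmd = build_base_recalibrator_cmd(
    reference="/path/reference.fasta",        # hypothetical path
    bam="/path/sample.dedup.bam",             # hypothetical path
    known_sites=[
        "/path/knownSNPSites1.vcf",           # from the question
        "/path/knownIndelSites1.vcf",         # from the question
    ],
    output_table="/path/recal_round1.table",  # hypothetical path
)

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each path a separate argument, which avoids quoting problems if it is later handed to `subprocess.run`.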

[1] Data pre-processing for variant discovery. Available at: https://software.broadinstitute.org/gatk/best-practices/workflow?id=11165
[2] (howto) Recalibrate base quality scores = run BQSR. Available at: https://software.broadinstitute.org/gatk/documentation/article?id=2801
[3] BaseRecalibrator. Available at: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_bqsr_BaseRecalibrator.php
[4] Base Quality Score Recalibration (BQSR). Available at: https://gatkforums.broadinstitute.org/gatk/discussion/44/base-quality-score-recalibration-bqsr
[5] Confused about Bootstrapping a set of known sites for Base Recalibration. Available at: https://gatkforums.broadinstitute.org/gatk/discussion/8281/confused-about-bootstrapping-a-set-of-known-sites-for-base-recalibration
[6] (howto) Apply hard filters to a call set. Available at: https://gatkforums.broadinstitute.org/gatk/discussion/2806/howto-apply-hard-filters-to-a-call-set
