BQSR Bootstrap: Combine hard-filtered SNP and Indel VCFs?


I'm trying to run BQSR on mouse WES tumor-normal data, following [1-6]. I'm on the first round of bootstrapping a set of known sites to use for BQSR. I have just separated the raw SNP and indel variants from the rawVariants.vcf produced by HaplotypeCaller and hard-filtered each set according to [6], giving knownSNPSites1.vcf and knownIndelSites1.vcf. My question is this: do I run BaseRecalibrator with just one of the known-sites files (SNP or indel)? Do I combine them into a single VCF? Or do I run BaseRecalibrator once and pass both files, i.e. --known-sites="/path/knownSNPSites1.vcf" --known-sites="/path/knownIndelSites1.vcf"?

I tried searching but couldn't find an exact answer. Thanks in advance.

[1] Data pre-processing for variant discovery. Available at: https://software.broadinstitute.org/gatk/best-practices/workflow?id=11165
[2] (howto) Recalibrate base quality scores = run BQSR. Available at: https://software.broadinstitute.org/gatk/documentation/article?id=2801
[3] BaseRecalibrator. Available at: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_bqsr_BaseRecalibrator.php
[4] Base Quality Score Recalibration (BQSR). Available at: https://gatkforums.broadinstitute.org/gatk/discussion/44/base-quality-score-recalibration-bqsr
[5] Confused about Bootstrapping a set of known sites for Base Recalibration. Available at: https://gatkforums.broadinstitute.org/gatk/discussion/8281/confused-about-bootstrapping-a-set-of-known-sites-for-base-recalibration
[6] (howto) Apply hard filters to a call set. Available at: https://gatkforums.broadinstitute.org/gatk/discussion/2806/howto-apply-hard-filters-to-a-call-set
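
For concreteness, the last option in the question (one BaseRecalibrator run with --known-sites given twice) might look like the sketch below. It assumes the GATK4 command-line style, where --known-sites may be specified more than once. Only the two known-sites paths come from the post; the BAM, reference, and output table paths, and the helper name build_bqsr_command, are hypothetical placeholders. The script only builds and prints the command; it does not run GATK.

```python
import shlex


def build_bqsr_command(bam, reference, known_sites, output_table):
    """Build a GATK4-style BaseRecalibrator command line.

    One --known-sites argument is emitted per VCF in known_sites.
    """
    cmd = ["gatk", "BaseRecalibrator", "-I", bam, "-R", reference]
    for vcf in known_sites:
        cmd += ["--known-sites", vcf]
    cmd += ["-O", output_table]
    return cmd


cmd = build_bqsr_command(
    bam="/path/sample.bam",          # hypothetical input BAM
    reference="/path/reference.fa",  # hypothetical reference FASTA
    known_sites=[
        "/path/knownSNPSites1.vcf",    # from the post
        "/path/knownIndelSites1.vcf",  # from the post
    ],
    output_table="/path/recal1.table",  # hypothetical output table
)

# Print a shell-quoted command that could be pasted into a terminal.
print(shlex.join(cmd))
```

To execute it from Python rather than copy the printed line, the list could be handed to subprocess.run(cmd, check=True), assuming gatk is on the PATH.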
