BQSR with MuTect2: use it or not?

Hello,
I've been reading some threads on the forum about using BQSR with MuTect2. I know it is recommended in the Best Practices. However, the comments are mixed, and I can't find a clear conclusion on whether to use BQSR with MuTect2, since MuTect2 already takes base quality scores into account and base quality scores are exactly what BQSR adjusts. I am working on 18 human samples, matched normal and tumor. The samples were exome-sequenced. I am using MuTect2 from the GATK 3.7 stable release. I generated results using the pipeline proposed here, with the following inputs:
- Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
- dbsnp_138.hg19.vcf
- hg19_ref_genome.fa
Based on this thread here, for example, I am worried that recalibration could alter potential true variants.
I also have another doubt about the BQSR thread: I just want to make sure that BQSR does NOT change the base call of the variant itself, and only assigns it a lower base quality score if it gets recalibrated.
I have looked at the commands run by The Cancer Genome Atlas, and they do use BQSR in their workflow. So finally, I would like to know: is it safe to use BQSR with MuTect2? And is it better to supply multiple dbSNP files to avoid counting potential variants as mismatches (for example, I have downloaded all known human SNPs from NCBI, a ~57 GB VCF file)?
Thank you in advance!
Best Answers
Geraldine_VdAuwera Cambridge, MA admin
We do recommend running BQSR for cancer samples, yes. The BaseRecalibrator only re-evaluates base quality scores, and never changes the base calls themselves.
If you're worried about high mutation rates you can include Cosmic as a known sites resource.
Answers
@Geraldine_VdAuwera thank you for your reply. I have one more question: do you mean including the Cosmic file as a known sites resource in both the first and second steps of the BQSR recalibration? Thank you!
Yes that's what I meant -- add them in addition to dbsnp. You're welcome!
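For anyone wondering what "in addition to dbsnp" could look like in practice, here is a minimal sketch that only assembles the two GATK 3.x BaseRecalibrator command lines and prints them, without running anything. The flags (`-T`, `-R`, `-I`, `-knownSites`, `-BQSR`, `-o`) are the GATK 3.x ones; the jar path, the BAM name, the table names and especially `cosmic.hg19.vcf` are hypothetical placeholders, and the helper function itself is just an illustration, not part of GATK.

```python
from typing import List, Optional


def build_base_recalibrator_cmd(
    reference: str,
    bam: str,
    known_sites: List[str],
    out_table: str,
    jar: str = "GenomeAnalysisTK.jar",  # hypothetical path to the GATK 3.x jar
    prior_table: Optional[str] = None,
) -> List[str]:
    """Assemble (but do not run) a GATK 3.x BaseRecalibrator command line."""
    if not known_sites:
        raise ValueError("BaseRecalibrator needs at least one known sites resource")
    cmd = ["java", "-jar", jar, "-T", "BaseRecalibrator", "-R", reference, "-I", bam]
    # Every resource gets its own -knownSites flag; positions listed in any
    # of them are skipped when mismatches are counted.
    for vcf in known_sites:
        cmd += ["-knownSites", vcf]
    # Second pass: feed the first table back in with -BQSR to get the
    # post-recalibration table.
    if prior_table is not None:
        cmd += ["-BQSR", prior_table]
    cmd += ["-o", out_table]
    return cmd


known = [
    "dbsnp_138.hg19.vcf",
    "Mills_and_1000G_gold_standard.indels.hg19.sites.vcf",
    "cosmic.hg19.vcf",  # hypothetical file name for the Cosmic resource
]

first_pass = build_base_recalibrator_cmd(
    "hg19_ref_genome.fa", "tumor.bam", known, "recal.table"
)
second_pass = build_base_recalibrator_cmd(
    "hg19_ref_genome.fa", "tumor.bam", known, "post_recal.table",
    prior_table="recal.table",
)

print(" ".join(first_pass))
print(" ".join(second_pass))
```

The same list of known sites is passed to both passes, which is the point of the reply above: Cosmic is added next to dbSNP and the indel resource, not swapped in for them.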
Hello. Unlike germline variants, somatic variants are sporadic across the genome, and rarely recur at the same position. dbSNP is a database of positions where we are most likely to find germline events, which BQSR therefore ignores. But Cosmic is not a database of "positions where we are most likely to find somatic events". Recurrently mutated somatic "hotspots" in cancer are important, but they are the exception: most somatic mutations are spread out randomly across the genome, and BQSR would treat them as artifacts and re-evaluate their base quality scores. I do not think BQSR should be in the Best Practices for a somatic variant calling pipeline. Let us know if you think otherwise.
Maybe MuTect2 is smart enough to use the uncalibrated BQ scores that BQSR preserves in the BAM file, but most somatic variant callers will use the recalibrated scores and suffer from a (hopefully minor) loss in sensitivity.
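To make the masking argument concrete, here is a toy sketch of how mismatches at unmasked positions feed into an empirical quality estimate. Everything in it is an assumption made for illustration: the positions and counts are invented, each position is observed once, and the `+1`/`+2` smoothing is a simple stand-in rather than the estimator GATK actually uses. It shows the mechanism only and says nothing about real data.

```python
import math
from typing import Iterable, Set, Tuple


def tally(observations: Iterable[Tuple[int, bool]], known_sites: Set[int]) -> Tuple[int, int]:
    """Count (mismatches, total bases), skipping positions in known_sites."""
    mismatches = 0
    total = 0
    for position, is_mismatch in observations:
        if position in known_sites:
            continue  # masked: contributes neither a base nor a mismatch
        total += 1
        mismatches += int(is_mismatch)
    return mismatches, total


def empirical_quality(mismatches: int, total: int) -> float:
    """Phred-scaled mismatch rate with simple +1/+2 smoothing (toy estimator)."""
    return -10.0 * math.log10((mismatches + 1) / (total + 2))


# Invented toy data: 100,000 positions, one base observed at each.
germline_sites = {10, 20, 30}  # real variants, listed in dbSNP
somatic_sites = {15, 25}       # real variants, not in any resource
observations = [
    # A mismatch is either a sequencing error (every 1000th position here)
    # or a true variant that differs from the reference.
    (pos, pos % 1000 == 0 or pos in germline_sites or pos in somatic_sites)
    for pos in range(100_000)
]

# Only dbSNP sites masked: the somatic mismatches are counted as errors.
q_dbsnp_only = empirical_quality(*tally(observations, germline_sites))
# Ideal case: every true variant masked, only sequencing errors remain.
q_all_masked = empirical_quality(*tally(observations, germline_sites | somatic_sites))

print(f"dbSNP masked only:   Q{q_dbsnp_only:.2f}")
print(f"all variants masked: Q{q_all_masked:.2f}")
```

In this toy setup the unmasked somatic sites can only pull the estimate down, and by how much depends entirely on how many of them there are relative to the ordinary sequencing errors in the same bin.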
Hey @cyriac
You made a very good point, so I reached out to our dev team, and this is what they had to say:
Thanks @bhanuGandham - the devs appear to acknowledge that the recall rate in hypermutated tumors may be affected, but they also make a good case that the effect is minimal. For reference, this concern arose ~5 years ago on Biostars, and this should finally put it to rest.