I've been reading some forum threads about using BQSR with MuTect2. I know it is recommended in the Best Practices. However, the comments were mixed, and I can't find a clear conclusion on whether to use BQSR with MuTect2, since MuTect2 already takes base quality scores into account and those scores are exactly what BQSR adjusts. I am working on 18 human samples (matched normal and tumor), all exome-sequenced. I am using MuTect2 from the GATK 3.7 stable release. I generated results using the pipeline proposed here, with the following inputs:

  • Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
  • dbsnp_138.hg19.vcf
  • hg19_ref_genome.fa

Following this thread here, for example, I am worried that potential true variants could be altered by recalibration.

I also have another doubt from the BQSR thread: I just want to make sure that BQSR does NOT change the base of the variant itself, and that it only assigns a lower base quality score if the base gets recalibrated.

I have looked at the commands run by The Cancer Genome Atlas, and they do use BQSR in their workflow. So finally, I would like to know: is it safe to use BQSR with MuTect2? And is it better to supply multiple dbSNP files to avoid treating potential variants as mismatches (for example, I have downloaded from NCBI all known human SNPs, a ~57 GB VCF file)?

    @Geraldine_VdAuwera said:
    We do recommend running BQSR for cancer samples, yes. The BaseRecalibrator only re-evaluates base quality scores, and does not ever change the base calls themselves.

    If you're worried about high mutation rates you can include Cosmic as a known sites resource.

    @Geraldine_VdAuwera thank you for your reply. I have one more question: do you mean including the Cosmic file as a known sites resource during both the first and second steps of the BQSR recalibration? Thank you!

    Yes that's what I meant -- add them in addition to dbsnp. You're welcome!
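
For anyone following along, here is a rough sketch of what "add Cosmic in addition to dbsnp" could look like for the two BaseRecalibrator passes in GATK 3.x. It only assembles and prints the command lines; it does not run GATK. The helper name `base_recalibrator_cmd`, the BAM and table names, the jar path, and the Cosmic filename `cosmic.hg19.vcf` are all placeholders assumed for illustration. The dbSNP, Mills and reference filenames are the ones listed in the question.

```python
# Sketch only: builds GATK 3.x BaseRecalibrator command lines with several
# known-sites resources. Nothing is executed; the commands are just printed.
# Placeholder (hypothetical) names: tumor.bam, recal.table, post_recal.table,
# GenomeAnalysisTK.jar path, cosmic.hg19.vcf.

def base_recalibrator_cmd(reference, bam, known_sites, out_table,
                          prior_table=None, gatk_jar="GenomeAnalysisTK.jar"):
    """Return the argument list for one BaseRecalibrator pass."""
    cmd = ["java", "-jar", gatk_jar, "-T", "BaseRecalibrator",
           "-R", reference, "-I", bam]
    # One -knownSites argument per resource; the same list is used in both passes.
    for vcf in known_sites:
        cmd += ["-knownSites", vcf]
    # The second pass applies the first-pass table on the fly via -BQSR.
    if prior_table is not None:
        cmd += ["-BQSR", prior_table]
    cmd += ["-o", out_table]
    return cmd


known_sites = [
    "dbsnp_138.hg19.vcf",
    "Mills_and_1000G_gold_standard.indels.hg19.sites.vcf",
    "cosmic.hg19.vcf",  # hypothetical filename for the Cosmic resource
]

first_pass = base_recalibrator_cmd(
    "hg19_ref_genome.fa", "tumor.bam", known_sites, "recal.table")
second_pass = base_recalibrator_cmd(
    "hg19_ref_genome.fa", "tumor.bam", known_sites, "post_recal.table",
    prior_table="recal.table")

print(" ".join(first_pass))
print(" ".join(second_pass))
```

The point of the sketch is just that the same set of known-sites files goes into both passes, with the second pass additionally taking the first-pass table through `-BQSR`.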

