Recommendations for speeding up MuTect2 on deep capture data?
Hi GATK Team,
I'm trying to move a pipeline over from MuTect1 and Indelocator to MuTect2. The difference in runtime I'm seeing is much larger than I was expecting. I did see this post, and Sheila's answer, but I'd like to ask again with some more specifics.
I'm running on a pair of BAMs aligned with bwa-mem to hs38DH, and calling over a small set of target regions. The target regions are ~375 exons totaling about 57kb - a fairly tiny set of regions. The BAM files have a median coverage of around 500X over these target regions.
Running with MuTect1, this takes less than a minute and a half. Running with MuTect2, it takes nearly 30 minutes! I appreciate that MuTect2 is doing local assembly and indel calling, but a 20-fold increase in runtime doesn't seem right. For comparison, running the HaplotypeCaller on these same BAMs takes between 30 and 60 seconds. Again, not an apples-to-apples comparison, just a point of reference.
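To make the arithmetic behind those figures explicit, here is a quick sketch that uses only the rounded numbers quoted above. The variable names are just for illustration, and the timings are approximate upper bounds rather than exact measurements.

```python
# Rounded figures from this post; approximate, not exact measurements.
mutect1_s = 90           # "less than a minute and a half"
mutect2_s = 30 * 60      # "nearly 30 minutes"
hc_range_s = (30, 60)    # HaplotypeCaller: "between 30 and 60 seconds"

target_bp = 57_000       # ~57 kb of target sequence
n_exons = 375            # ~375 exons

# Fold difference in runtime relative to MuTect1.
fold_vs_mutect1 = mutect2_s / mutect1_s

# Fold difference relative to HaplotypeCaller (slowest run, fastest run).
fold_vs_hc = (mutect2_s / hc_range_s[1], mutect2_s / hc_range_s[0])

# Average target size and MuTect2 throughput over the targets.
mean_exon_bp = target_bp / n_exons
mutect2_bp_per_s = target_bp / mutect2_s

print(f"MuTect2 vs MuTect1: ~{fold_vs_mutect1:.0f}x slower")
print(f"MuTect2 vs HaplotypeCaller: ~{fold_vs_hc[0]:.0f}x to ~{fold_vs_hc[1]:.0f}x slower")
print(f"Mean target length: ~{mean_exon_bp:.0f} bp")
print(f"MuTect2 throughput: ~{mutect2_bp_per_s:.1f} bp/s")
```

So on these rough numbers, MuTect2 is working through the targets at only a few tens of bases per second.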
Any suggestions for why it's taking ~20 times longer to run, and how I might decrease that without tanking sensitivity? Thanks,