Optimizing Mutect2 runs on whole genomes?

khadi_NYGC (New York Genome Center, Member)

Dear GATK,

Given the current best way to run Mutect2 on whole genomes at about 40-60X coverage (~300 G), how long should I expect a run on one whole genome to take? In particular, what parameters or practices do you recommend for generating a panel of normals (PON) with Mutect2?

I am running Mutect2 on several whole genomes to generate a PON, using the multi-threading option -nct 3 for each BAM. Seven days after starting the job, the run has completed calls only on chromosomes 1 and 2 of one whole-genome BAM.

I plan to restart the run using a scatter-and-gather approach, splitting the Mutect2 job for one whole genome into some number of intervals. From my search of the forums, this seems to be the consensus on how best to run Mutect2. However, I would really appreciate any other recommendations.

Thanks!
Kevin

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Hi Kevin, scatter-gather is indeed our method of choice for this. Note that while the implementation of MuTect2 in GATK 3.* is very slow, the development team is currently making a big push to accelerate MuTect2 in the GATK4 framework, which will be released in beta in about a month, with general release probably in late June.
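
For readers who want to try the scatter step discussed in this thread, here is a minimal sketch in Python of splitting contigs into fixed-size intervals and building one MuTect2 command per interval. It is an illustration under stated assumptions, not the GATK team's pipeline: the helper names (`split_into_intervals`, `mutect2_command`), the toy contig lengths, the chunk size, and all file names are hypothetical. The command-line flags follow the GATK 3 MuTect2 usage for panel-of-normals calling as I understand it, so check them against the documentation for your GATK version before running anything.

```python
from typing import Dict, List


def split_into_intervals(contig_lengths: Dict[str, int], chunk_size: int) -> List[str]:
    """Split each contig into consecutive 1-based, inclusive intervals.

    Returns strings of the form "contig:start-end", the format accepted
    by the GATK -L argument.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    intervals = []
    for contig, length in contig_lengths.items():
        start = 1
        while start <= length:
            end = min(start + chunk_size - 1, length)
            intervals.append(f"{contig}:{start}-{end}")
            start = end + 1
    return intervals


def mutect2_command(bam: str, reference: str, interval: str, out_vcf: str) -> List[str]:
    """Build (but do not run) a GATK 3 MuTect2 command for one interval.

    Assumption: flags as documented for GATK 3.x PON creation; the jar
    path and file names are placeholders.
    """
    return [
        "java", "-jar", "GenomeAnalysisTK.jar",
        "-T", "MuTect2",
        "-R", reference,
        "-I:tumor", bam,
        "--artifact_detection_mode",
        "-L", interval,
        "-o", out_vcf,
    ]


# Toy contig lengths, for illustration only (not a real genome).
toy_contigs = {"chrA": 250, "chrB": 100}
intervals = split_into_intervals(toy_contigs, chunk_size=100)

# One command per interval; each can be submitted as an independent job.
commands = [
    mutect2_command("normal1.bam", "reference.fasta", iv, f"normal1.part{i}.vcf")
    for i, iv in enumerate(intervals)
]
for cmd in commands:
    print(" ".join(cmd))
```

Each printed command can run as a separate cluster job, and the per-interval VCFs are then concatenated in interval order for the gather step. Cutting at arbitrary coordinates is a simplification: a variant near a chunk boundary may be affected by the split, so in practice it is often safer to split at contig boundaries or at known gaps in the reference.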
