Excessive memory usage with MuTect2
I am trying to use MuTect2 for somatic variant discovery. I am running GATK v22.214.171.124 with Java HotSpot(TM) 64-Bit Server VM v1.8.0_112-b15.
Whether I run a single sample (tumor-only mode, to generate a panel of normals) or a tumor sample with a normal sample, memory usage becomes very high (> 400 GB RAM). Memory usage is not high initially, but it climbs gradually during the run. The data are not whole-genome sequence data but RADseq/GBS data. This means much of the genome is not covered by reads; where there are reads, they start and stop in similar places and cover a ~85 bp region with moderate coverage (around 10X on average). Here is an example of the command I am running (note that I have modified the standard command to allow more memory and to obtain additional information for debugging):
java -Xmx384g -XX:-UseGCOverheadLimit -jar ~/bin/gatk-package-126.96.36.199-local.jar Mutect2 -R /uufs/chpc.utah.edu/common/home/u6000989/data/aspen/genome/Potrs01-genome.fa -I aln_mem_mod_003-S.uniqe.bam -I aln_mem_mod_013-S.uniqe.bam -normal potr-mod_013-S --independent-mates --max-mnp-distance 0 -debug --dont-increase-kmer-sizes-for-cycles -O somatic.vcf.gz
The run generates a VCF file with no obvious errors for the regions of the genome it reaches, but it runs out of memory before finishing.
I have tried the identical command on a different data set with whole genome sequences and do not see the same memory issue. Thus, I think the problem with memory usage stems from the RADseq/GBS data. With that said, I don't know what about RADseq/GBS data would cause such a problem. Additionally, the reference genome I am using for aligning the RADseq/GBS data is highly fragmented (most contigs ~10 kb). Are there any modifications I might be able to make to the command I am running that could solve this problem?
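One possible workaround, offered only as a sketch: because memory climbs gradually over the run and the reference consists of many small contigs, the job could be split into batches of contigs, with Mutect2 run once per batch using -L so that each Java process starts with a fresh heap. Whether this addresses the underlying cause is not established here. The script below assumes a samtools-style .fai index for the reference (contig name and length in the first two tab-separated columns); the function names, the batch size, and the output prefix are all hypothetical.

```python
from typing import Iterable, List


def batch_contigs(fai_lines: Iterable[str], max_bp: int) -> List[List[str]]:
    """Group contig names from .fai lines into batches of at most max_bp bases.

    A contig longer than max_bp is placed in a batch by itself.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_bp = 0
    for line in fai_lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        # Start a new batch if adding this contig would exceed the limit.
        if current and current_bp + length > max_bp:
            batches.append(current)
            current, current_bp = [], 0
        current.append(name)
        current_bp += length
    if current:
        batches.append(current)
    return batches


def write_interval_lists(batches: List[List[str]], prefix: str) -> List[str]:
    """Write one GATK-style .list file per batch (one contig name per line)."""
    paths = []
    for i, batch in enumerate(batches):
        path = f"{prefix}_{i:04d}.list"
        with open(path, "w") as fh:
            fh.write("\n".join(batch) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical input path and batch size; adjust to the actual reference.
    with open("Potrs01-genome.fa.fai") as fai:
        contig_batches = batch_contigs(fai, max_bp=50_000_000)
    for list_path in write_interval_lists(contig_batches, "batch"):
        print(list_path)
```

Each resulting file could then be passed to a separate Mutect2 run (for example, -L batch_0000.list), and the per-batch VCFs would need to be merged afterward.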