Java heap space

I ran GATK to call variants, but got this error:
[April 13, 2019 5:24:50 AM CST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 567.92 minutes.
Runtime.totalMemory()=157205135360
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)

However, I set -Xmx240G for the Java process running GATK Mutect2, which is a very large amount of memory. I also set a tmp directory, but it still reports "java.lang.OutOfMemoryError".
The inputs are two BAM files, bam1 (~35G) and bam2 (~7.2G), and the reference is about 6.5G with 560 thousand scaffolds, because it is not well assembled.
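
With roughly 560 thousand scaffolds, the reference's sequence dictionary (.dict) holds one @SQ record per scaffold, so it can be worth measuring that file before tuning memory any further. A minimal sketch, assuming the dictionary is a plain-text SAM header; the helper name dict_summary and the example path are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a sequence dictionary (.dict) file.
# Prints the number of @SQ records and the file size in bytes.
dict_summary() {
    local dict="$1"
    local n_sq size
    # grep -c exits non-zero when nothing matches but still prints 0.
    n_sq=$(grep -c '^@SQ' "$dict" || true)
    # tr strips the padding some wc implementations add.
    size=$(wc -c < "$dict" | tr -d ' ')
    echo "sequences=${n_sq} bytes=${size}"
}

# Example call. The path is an assumption: the .dict usually sits next to
# the FASTA, with the FASTA extension replaced by .dict.
# dict_summary "${genome%.*}.dict"
```

The two numbers only describe the dictionary; they do not by themselves show what caused the error.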

Answers

  • liuxiang Member
    Does GATK Mutect2 have any specific requirement for the number of reference scaffolds?
  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    @liuxiang

    Please post the exact command you are using, the version of gatk and the entire error log.

  • liuxiang Member
    I used the new GATK 4.1 version, and my script is:

    java -Xmx300g -jar $GATK Mutect2 \
    --dont-use-soft-clipped-bases true \
    --tmp-dir $cw/$i/tmp \
    --input $DNAbam/ADAR16-DNA-2_NKD180600323/ADAR16-DNA-2_NKD180600323.best.uniq.pair.sort.markdup.bam \
    --input $RNAbam/$i/$i.merge.markdup.reheader.bam \
    --reference $genome \
    --output $cw/$i/$i.dna.rna.vcf \
    --normal-sample ADAR16-DNA-2_NKD180600323 \
    --tumor-sample $i \
    -bamout $cw/$i/$i.support.bam

    and the tail of the error log is:

    12:05:06.287 INFO ProgressMeter - scaffold23905:111448 948.1 636040 670.9
    12:05:30.519 INFO ProgressMeter - scaffold23905:133852 948.5 636120 670.7
    12:05:57.277 INFO ProgressMeter - scaffold23905:147186 949.0 636170 670.4
    12:24:34.669 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 31261.455155273
    12:24:34.670 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 14618.28 sec
    INFO 2019-04-13 12:45:11 SortingCollection Creating merging iterator from 2 files
    13:30:49.708 INFO Mutect2 - Shutting down engine
    [April 13, 2019 1:30:49 PM CST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 1,035.35 minutes.
    Runtime.totalMemory()=238653800448
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at htsjdk.samtools.SAMTextHeaderCodec.advanceLine(SAMTextHeaderCodec.java:142)
    at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:97)
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.loadDictionary(ReferenceSequenceFileFactory.java:235)
    at htsjdk.samtools.reference.AbstractFastaSequenceFile.<init>(AbstractFastaSequenceFile.java:68)
    at htsjdk.samtools.reference.AbstractIndexedFastaSequenceFile.<init>(AbstractIndexedFastaSequenceFile.java:60)
    at htsjdk.samtools.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:80)
    at htsjdk.samtools.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:98)
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.getReferenceSequenceFile(ReferenceSequenceFileFactory.java:138)
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:134)
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:111)
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:96)
    at org.broadinstitute.hellbender.engine.ReferenceFileSource.<init>(ReferenceFileSource.java:35)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.makeStandardMutect2PostFilterReadTransformer(Mutect2Engine.java:171)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.makePostReadFilterTransformer(Mutect2.java:267)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:270)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:984)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)

    Issue · Github, by bhanuGandham. Issue Number: 5900. State: open.
  • xiucz Member

    @bhanuGandham

    I also got a similar error with my WGS data, even though I increased the Java memory. The command and error were:

    ~/gatktools/gatk-4.1.0.0/gatk \
    --java-options "-XX:+PrintFlagsFinal -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:gc_log.log  -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx60G -Xms60G" FilterByOrientationBias \
    -V mutect2_oncefilter_sample1.vcf \
    -AM G/T \
    -P sample1.tumor_artifact.pre_adapter_detail_metrics.txt \
    -O mutect2_twicefilter_sample1.vcf
    
    Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    
  • liuxiang Member
    Actually, I tried the "gatk launcher script" instead of "java -jar directly"; however, it still reports the error:
    "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space"

    Here is my "gatk launcher script" command:
    ~/gatktools/gatk-4.1.0.0/gatk --java-options "-Xmx80G" Mutect2 \
    --tmp-dir $cw/$i/tmp \
    --input $DNAbam/WT8-DNA-2_NKD180600321/cut/group.1.bam \
    --input $RNAbam/WT8-nor_TKR180601295/cut/group.1.bam \
    --reference $genome \
    --output $cw/$i/$i.dna.rna.vcf \
    --normal-sample WT8-DNA-2_NKD180600321 \
    --tumor-sample WT8-nor_TKR180601295 \
    -bamout $cw/$i/$i.support.bam
  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator
    edited October 2019

    @xiucz and @liuxiang

    I have created a bug report for our dev team for this error. You can track the progress of this issue here: https://github.com/broadinstitute/gatk/issues/5900

    Thank you for bringing this to our notice.

    Note: You shouldn't be running GATK4 using java -jar directly. You should use the included gatk launcher script, which sets a lot of important configuration settings, some of which have a major effect on tool performance.

  • cczulh (Changzhou, China) Member
    edited October 2019
    I also ran into this error. My command is below:

    gatk --java-options '-Xms681m -Xmx9600m -XX:+UseSerialGC -Djava.io.tmpdir=/ngs/final/chenmf' -R /bio/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa -I chenmf-ready.bam -max-mnp-distance 0 -O normal1.vcf.gz

    I have looked at the GitHub issue https://github.com/broadinstitute/gatk/issues/5900 and found that it is still unresolved. Is there any other way to tackle this problem?
  • Tiffany_at_Broad (Cambridge, MA) Member, Administrator, Broadie, Moderator

    Can you copy and paste the full stack trace @cczulh ? Thank you!

  • Tiffany_at_Broad (Cambridge, MA) Member, Administrator, Broadie, Moderator

    @liuxiang what's the size of the .dict file that accompanies the fasta ref?
    @cczulh if you provide more info, we can make a recommendation.

  • xingaulag (Vietnam) Member
    edited January 2
    I also got this error. My command: gatk HaplotypeCaller -R refgen/resources-broad-hg38-v0-Homo_sapiens_assembly38.fasta -I sort38/VT1_sorted.bam -O mark38VT1.g.vcf.gz -ERC GVCF

    I use the hg38 reference genome from the GATK resource bundle.
    Error:
    Using GATK jar /home/thanh/Desktop/gatk/work/gatk/gatk-package-4.1.4.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/thanh/Desktop/gatk/work/gatk/gatk-package-4.1.4.1-local.jar HaplotypeCaller -R refgen/resources-broad-hg38-v0-Homo_sapiens_assembly38.fasta -I sort38/VT1_sorted.bam -O mark38VT1.g.vcf.gz -ERC GVCF
    10:34:08.586 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/thanh/Desktop/gatk/work/gatk/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jan 02, 2020 10:34:08 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    10:34:08.669 INFO HaplotypeCaller - ------------------------------------------------------------
    10:34:08.669 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.1.4.1
    10:34:08.669 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
    10:34:08.669 INFO HaplotypeCaller - Executing as [email protected] on Linux v5.0.0-37-generic amd64
    10:34:08.669 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v11.0.5+10-post-Ubuntu-0ubuntu1.118.04
    10:34:08.669 INFO HaplotypeCaller - Start Date/Time: January 2, 2020 at 10:34:08 AM ICT
    10:34:08.669 INFO HaplotypeCaller - ------------------------------------------------------------
    10:34:08.669 INFO HaplotypeCaller - ------------------------------------------------------------
    10:34:08.670 INFO HaplotypeCaller - HTSJDK Version: 2.21.0
    10:34:08.670 INFO HaplotypeCaller - Picard Version: 2.21.2
    10:34:08.670 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    10:34:08.670 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    10:34:08.670 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    10:34:08.670 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    10:34:08.670 INFO HaplotypeCaller - Deflater: IntelDeflater
    10:34:08.670 INFO HaplotypeCaller - Inflater: IntelInflater
    10:34:08.670 INFO HaplotypeCaller - GCS max retries/reopens: 20
    10:34:08.670 INFO HaplotypeCaller - Requester pays: disabled
    10:34:08.670 INFO HaplotypeCaller - Initializing engine
    10:34:08.965 INFO HaplotypeCaller - Done initializing engine
    10:34:08.966 INFO HaplotypeCallerEngine - Tool is in reference confidence mode and the annotation, the following changes will be made to any specified annotations: 'StrandBiasBySample' will be enabled. 'ChromosomeCounts', 'FisherStrand', 'StrandOddsRatio' and 'QualByDepth' annotations have been disabled
    10:34:09.001 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
    10:34:09.001 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
    10:34:09.007 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/thanh/Desktop/gatk/work/gatk/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_utils.so
    10:34:09.009 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/thanh/Desktop/gatk/work/gatk/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
    10:34:09.033 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
    10:34:09.033 INFO IntelPairHmm - Available threads: 14
    10:34:09.033 INFO IntelPairHmm - Requested threads: 4
    10:34:09.034 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
    10:34:09.074 INFO ProgressMeter - Starting traversal
    10:34:09.074 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
    10:34:19.075 INFO ProgressMeter - chr1:23942701 0.2 79810 478860.0
    10:34:29.074 INFO ProgressMeter - chr1:50984701 0.3 169950 509850.0
    10:34:39.074 INFO ProgressMeter - chr1:78212701 0.5 260710 521420.0
    10:34:49.074 INFO ProgressMeter - chr1:105479701 0.7 351600 527400.0
    10:34:59.074 INFO ProgressMeter - chr1:133559701 0.8 445200 534240.0
    10:35:09.074 INFO ProgressMeter - chr1:160853701 1.0 536180 536180.0
    10:35:19.074 INFO ProgressMeter - chr1:188786701 1.2 629290 539391.4
    10:35:29.075 INFO ProgressMeter - chr1:216935701 1.3 723120 542333.2
    10:35:39.075 INFO ProgressMeter - chr1:243032701 1.5 810110 540067.3
    10:35:49.075 INFO ProgressMeter - chr2:15709201 1.7 882220 529326.7
    10:35:59.075 INFO ProgressMeter - chr2:44161201 1.8 977060 532937.0
    10:36:09.075 INFO ProgressMeter - chr2:72010201 2.0 1069890 534940.5
    10:36:19.075 INFO ProgressMeter - chr2:100390201 2.2 1164490 537452.8
    10:36:29.075 INFO ProgressMeter - chr2:128218201 2.3 1257250 538817.6
    10:36:39.075 INFO ProgressMeter - chr2:156379201 2.5 1351120 540444.4
    10:36:49.075 INFO ProgressMeter - chr2:183568201 2.7 1441750 540652.9
    10:36:59.075 INFO ProgressMeter - chr2:211648201 2.8 1535350 541885.0
    10:37:09.075 INFO ProgressMeter - chr2:238438201 3.0 1624650 541547.0
    10:37:14.483 INFO HaplotypeCaller - Shutting down engine
    [January 2, 2020 at 10:37:14 AM ICT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 3.10 minutes.
    Runtime.totalMemory()=6169821184
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.commons.math3.util.MathArrays.copyOfRange(MathArrays.java:827)
    at org.apache.commons.math3.stat.descriptive.rank.Percentile.copyOf(Percentile.java:479)
    at org.apache.commons.math3.stat.descriptive.rank.Percentile.removeAndSlice(Percentile.java:527)
    at org.apache.commons.math3.stat.descriptive.rank.Percentile.getWorkArray(Percentile.java:455)
    at org.apache.commons.math3.stat.descriptive.rank.Percentile.evaluate(Percentile.java:351)
    at org.apache.commons.math3.stat.descriptive.rank.Percentile.evaluate(Percentile.java:302)
    at org.apache.commons.math3.stat.descriptive.AbstractUnivariateStatistic.evaluate(AbstractUnivariateStatistic.java:121)
    at org.broadinstitute.hellbender.utils.MathUtils.median(MathUtils.java:839)
    at org.broadinstitute.hellbender.utils.variant.writers.GVCFBlock.getMedianDP(GVCFBlock.java:75)
    at org.broadinstitute.hellbender.utils.variant.writers.HomRefBlock.createHomRefGenotype(HomRefBlock.java:73)
    at org.broadinstitute.hellbender.utils.variant.writers.GVCFBlock.toVariantContext(GVCFBlock.java:49)
    at org.broadinstitute.hellbender.utils.variant.writers.GVCFBlockCombiner.emitCurrentBlock(GVCFBlockCombiner.java:177)
    at org.broadinstitute.hellbender.utils.variant.writers.GVCFBlockCombiner.signalEndOfInput(GVCFBlockCombiner.java:227)
    at org.broadinstitute.hellbender.utils.variant.writers.GVCFWriter.close(GVCFWriter.java:70)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.closeTool(HaplotypeCaller.java:246)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1052)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
    at org.broadinstitute.hellbender.Main.main(Main.java:292)

    I also tried to increase the heap space with a vim command, but it didn't work.
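
The "Running:" line in the log above shows the java command the launcher built, and it contains no -Xmx setting, so the JVM ran with its default heap ceiling. A minimal sketch for checking that line and for reading the Runtime.totalMemory() value in GiB; the helper names has_xmx and bytes_to_gib are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading the launcher's log output.

# Succeeds if the given java command line sets a maximum heap size (-Xmx).
has_xmx() {
    case "$1" in
        *-Xmx*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Converts a Runtime.totalMemory() byte count to GiB, one decimal place.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024 * 1024) }'
}

# Usage sketch, with values copied from the log above:
# has_xmx "java -Dsamjdk.compression_level=2 -jar gatk-package-4.1.4.1-local.jar HaplotypeCaller" \
#     || echo "no -Xmx set"
# bytes_to_gib 6169821184
```

For the log above, bytes_to_gib 6169821184 prints 5.7. A heap ceiling can be passed through the launcher in the same form used earlier in this thread, for example --java-options "-Xmx16G"; the value here is only an assumption and has to fit the machine's memory. Whether a larger heap resolves this particular error is not established in the thread.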