
Java heap space

I ran GATK to call variants, but got this error:
[April 13, 2019 5:24:50 AM CST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 567.92 minutes.
Runtime.totalMemory()=157205135360
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)

However, I set -Xmx240G for the java GATK Mutect2 run, which is a very large amount of memory. I also set a tmp directory, but it still reports "java.lang.OutOfMemoryError".
The inputs are two BAMs (bam1 ~35G and bam2 ~7.2G) and a reference of about 6.5G with 560 thousand scaffolds, because it is not well assembled.
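
For anyone comparing the log against the -Xmx setting: the Runtime.totalMemory() value is reported in bytes. Below is a minimal sketch of the conversion to GiB, assuming awk is available. The function name bytes_to_gib is made up for illustration and is not part of GATK.

```shell
# Convert a byte count (e.g. from "Runtime.totalMemory()=...") to GiB.
# bytes_to_gib is a hypothetical helper name, not a GATK command.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Value taken from the log above.
bytes_to_gib 157205135360   # prints 146.41
```

Runtime.totalMemory() is the heap the JVM currently has allocated, which can be lower than the -Xmx ceiling, so the two numbers are not expected to match exactly.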

Answers

  • liuxiangliuxiang Member
    Does GATK Mutect2 have any specific limit on the number of reference scaffolds?
  • bhanuGandhambhanuGandham Cambridge MAMember, Administrator, Broadie, Moderator admin

    @liuxiang

    Please post the exact command you are using, the GATK version, and the entire error log.

  • liuxiangliuxiang Member
    I used the new GATK 4.1 version, and my script is:

    java -Xmx300g -jar $GATK Mutect2 \
    --dont-use-soft-clipped-bases true \
    --tmp-dir $cw/$i/tmp \
    --input $DNAbam/ADAR16-DNA-2_NKD180600323/ADAR16-DNA-2_NKD180600323.best.uniq.pair.sort.markdup.bam \
    --input $RNAbam/$i/$i.merge.markdup.reheader.bam \
    --reference $genome \
    --output $cw/$i/$i.dna.rna.vcf \
    --normal-sample ADAR16-DNA-2_NKD180600323 \
    --tumor-sample $i \
    -bamout $cw/$i/$i.support.bam

    and the tail of the error log is:

    12:05:06.287 INFO ProgressMeter - scaffold23905:111448 948.1 636040 670.9
    12:05:30.519 INFO ProgressMeter - scaffold23905:133852 948.5 636120 670.7
    12:05:57.277 INFO ProgressMeter - scaffold23905:147186 949.0 636170 670.4
    12:24:34.669 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 31261.455155273
    12:24:34.670 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 14618.28 sec
    INFO 2019-04-13 12:45:11 SortingCollection Creating merging iterator from 2 files
    13:30:49.708 INFO Mutect2 - Shutting down engine
    [April 13, 2019 1:30:49 PM CST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 1,035.35 minutes.
    Runtime.totalMemory()=238653800448
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at htsjdk.samtools.SAMTextHeaderCodec.advanceLine(SAMTextHeaderCodec.java:142)
    at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:97)
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.loadDictionary(ReferenceSequenceFileFactory.java:235)
    at htsjdk.samtools.reference.AbstractFastaSequenceFile.<init>(AbstractFastaSequenceFile.java:68)
    at htsjdk.samtools.reference.AbstractIndexedFastaSequenceFile.<init>(AbstractIndexedFastaSequenceFile.java:60)
    at htsjdk.samtools.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:80)
    at htsjdk.samtools.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:98)
    at htsjdk.samtools.reference.ReferenceSequenceFileFactory.getReferenceSequenceFile(ReferenceSequenceFileFactory.java:138)
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:134)
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:111)
    at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:96)
    at org.broadinstitute.hellbender.engine.ReferenceFileSource.<init>(ReferenceFileSource.java:35)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.makeStandardMutect2PostFilterReadTransformer(Mutect2Engine.java:171)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2.makePostReadFilterTransformer(Mutect2.java:267)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:270)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:984)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)

  • xiuczxiucz Member

    @bhanuGandham

    I got a similar error with my WGS data, even though I increased the Java memory. The command and error were:

    ~/gatktools/gatk-4.1.0.0/gatk \
    --java-options "-XX:+PrintFlagsFinal -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:gc_log.log  -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx60G -Xms60G" FilterByOrientationBias \
    -V mutect2_oncefilter_sample1.vcf \
    -AM G/T \
    -P sample1.tumor_artifact.pre_adapter_detail_metrics.txt \
    -O mutect2_twicefilter_sample1.vcf
    
    Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    
  • liuxiangliuxiang Member
    Actually, I tried the "gatk launcher script" instead of running "java -jar" directly, but it still reports the error:
    "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space"

    Here is my "gatk launcher script" command:
    ~/gatktools/gatk-4.1.0.0/gatk --java-options "-Xmx80G" Mutect2 \
    --tmp-dir $cw/$i/tmp \
    --input $DNAbam/WT8-DNA-2_NKD180600321/cut/group.1.bam \
    --input $RNAbam/WT8-nor_TKR180601295/cut/group.1.bam \
    --reference $genome \
    --output $cw/$i/$i.dna.rna.vcf \
    --normal-sample WT8-DNA-2_NKD180600321 \
    --tumor-sample WT8-nor_TKR180601295 \
    -bamout $cw/$i/$i.support.bam
  • bhanuGandhambhanuGandham Cambridge MAMember, Administrator, Broadie, Moderator admin
    edited October 13

    @xiucz and @liuxiang

    I have created a bug report for our dev team for this error. You can track the progress of this issue here: https://github.com/broadinstitute/gatk/issues/5900

    Thank you for bringing this to our notice.

    Note: You shouldn't be running GATK4 using java -jar directly. You should use the included gatk launcher script, which sets a lot of important configuration settings, some of which have a major effect on tool performance.

  • cczulhcczulh Changzhou, ChinaMember
    edited October 15
    I also ran into this error. My command is below:

    gatk --java-options '-Xms681m -Xmx9600m -XX:+UseSerialGC -Djava.io.tmpdir=/ngs/final/chenmf' -R /bio/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa -I chenmf-ready.bam -max-mnp-distance 0 -O normal1.vcf.gz

    I looked at the GitHub issue https://github.com/broadinstitute/gatk/issues/5900 and found that it is still unresolved. Is there any other way to work around this problem?
  • Tiffany_at_BroadTiffany_at_Broad Cambridge, MAMember, Administrator, Broadie, Moderator admin

    Can you copy and paste the full stack trace @cczulh ? Thank you!

  • Tiffany_at_BroadTiffany_at_Broad Cambridge, MAMember, Administrator, Broadie, Moderator admin

    @liuxiang what's the size of the .dict file that accompanies the fasta ref?
    @cczulh if you provide more info, we can make a recommendation.
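
To answer the .dict question, something along these lines could be used. It is only a sketch under assumptions: the helper name dict_summary and the example path are hypothetical, and it relies on standard wc, grep and awk. Besides the file size it reports the number of @SQ records and the longest line, since the stack trace above fails while a header line is being read into a StringBuilder. That last check is a guess at what might be worth looking at, not a diagnosis.

```shell
# Summarise a sequence dictionary (.dict) file: size in bytes,
# number of @SQ records, and length of the longest line.
# dict_summary is a hypothetical helper name, not a GATK command.
dict_summary() {
    dict="$1"
    # wc -c may pad its output with spaces on some systems; strip them.
    size_bytes=$(wc -c < "$dict" | tr -d ' ')
    # Each reference sequence (scaffold) has one @SQ line.
    n_sq=$(grep -c '^@SQ' "$dict")
    # Track the maximum line length seen while scanning the file.
    longest=$(awk '{ if (length($0) > m) m = length($0) } END { print m + 0 }' "$dict")
    echo "bytes=$size_bytes sq_records=$n_sq longest_line=$longest"
}

# Example call (the path is a placeholder):
#   dict_summary /path/to/reference.dict
```

For a reference with about 560 thousand scaffolds, sq_records would be expected to be in that range. A longest_line value far larger than a typical @SQ line might point to a malformed dictionary.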
