RealignerTargetCreator heap error on Ubuntu 14.04.3 but works on RedHat 6.6 and Mac 10.10.4

cgillies (Ann Arbor, Member)

Hi,

I have a GATK pipeline that I am testing across multiple platforms, and I am getting an error on my Ubuntu 14.04.3 test virtual machine. Strangely, the same command with the same files works on both my RedHat 6.6 machine and my Mac 10.10.4 machine. There was over 5 GB of free memory on the Ubuntu machine when I ran this command (see below). I originally thought it might be due to the Ubuntu machine having OpenJDK, but the command still fails with Oracle Java. The BAM is tiny, only 23 MB, and the realignment covers only the exons of 4 genes, so a memory error does not make sense. I am using the latest version of GATK (v3.4-46) with Oracle Java 1.7.0_79-b15.

Do you have any idea what could be causing this?

Thanks,

Chris

java -Xmx2048m -jar /home/cgillies/sequencing_programs/GenomeAnalysisTK.jar -T RealignerTargetCreator -R /home/cgillies/sequencing_reference_files/hs37d5.fa -L /home/cgillies/FluidigmTestData/align/genes.intervals -I /home/cgillies/FluidigmTestData/align/27118.bam --known /home/cgillies/sequencing_reference_files/Mills_and_1000G_gold_standard.indels.hg19.sites.relabel.vcf.gz --known /home/cgillies/sequencing_reference_files/1000G_phase1.indels.hg19.sites.relabel.vcf.gz -o /home/cgillies/FluidigmTestData/align/27118.interval_list

INFO  13:48:48,756 HelpFormatter - --------------------------------------------------------------------------------- 
INFO  13:48:48,758 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-46-gbc02625, Compiled 2015/07/09 17:38:12 
INFO  13:48:48,758 HelpFormatter - Copyright (c) 2010 The Broad Institute 
INFO  13:48:48,758 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
INFO  13:48:48,761 HelpFormatter - Program Args: -T RealignerTargetCreator -R /home/cgillies/sequencing_reference_files/hs37d5.fa -L /home/cgillies/FluidigmTestData/align/genes.intervals -I /home/cgillies/FluidigmTestData/align/27118.bam --known /home/cgillies/sequencing_reference_files/Mills_and_1000G_gold_standard.indels.hg19.sites.relabel.vcf.gz --known /home/cgillies/sequencing_reference_files/1000G_phase1.indels.hg19.sites.relabel.vcf.gz -o /home/cgillies/FluidigmTestData/align/27118.interval_list 
INFO  13:48:48,764 HelpFormatter - Executing as [email protected] on Linux 3.19.0-25-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_79-b15. 
INFO  13:48:48,764 HelpFormatter - Date/Time: 2015/09/16 13:48:48 
INFO  13:48:48,764 HelpFormatter - --------------------------------------------------------------------------------- 
INFO  13:48:48,764 HelpFormatter - --------------------------------------------------------------------------------- 
INFO  13:48:49,193 GenomeAnalysisEngine - Strictness is SILENT 
INFO  13:48:49,278 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
INFO  13:48:49,284 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
INFO  13:48:49,325 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.4-46-gbc02625): 
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program.  See the -Xmx JVM argument to adjust the maximum heap size provided to Java
##### ERROR ------------------------------------------------------------------------------------------

Answers

  • cgillies (Ann Arbor, Member)

    I fixed the issue by reindexing the known VCFs:

    tabix -f -pvcf /home/cgillies/sequencing_reference_files/Mills_and_1000G_gold_standard.indels.hg19.sites.relabel.vcf.gz
    tabix -f -pvcf /home/cgillies/sequencing_reference_files/1000G_phase1.indels.hg19.sites.relabel.vcf.gz
    

    I hope this post helps someone else who gets the error:

    ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program.  See the -Xmx JVM argument to adjust the maximum heap size provided to Java
    

    I think the error is related to the tabix indexes being out of date. If you copy the VCFs and .tbi files from another machine and a .tbi file is copied before its VCF, the index ends up with an older modification time than the VCF, so tabix will think the index is out of date. It is not clear to me why GATK complains about memory size, but at least it works now.

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @cgillies
    Hi Chris,

    Thanks for letting us know about your solution. I will keep this in mind next time someone gets this error :smile:

    -Sheila
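
For anyone who wants to test Chris's hypothesis before reindexing, here is a minimal sketch that reports VCFs whose .tbi index is missing or has an older modification time than the VCF itself. The function name `index_is_stale` is made up for this example, and the assumption that the index sits next to the VCF as `<vcf>.tbi` comes from the tabix commands above. It only compares timestamps; it does not inspect the index contents, and it has not been confirmed that this is the exact check GATK performs.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK or tabix): succeeds (exit status 0)
# when the .tbi index next to a VCF is missing or older than the VCF.
index_is_stale() {
    vcf=$1
    tbi="$vcf.tbi"
    # -ot compares modification times: true when $tbi is older than $vcf.
    [ ! -e "$tbi" ] || [ "$tbi" -ot "$vcf" ]
}

# Check every VCF passed on the command line and suggest the reindex command.
for vcf in "$@"; do
    if index_is_stale "$vcf"; then
        echo "stale or missing index: $vcf (rebuild with: tabix -f -p vcf $vcf)"
    fi
done
```

Saved under a name of your choice (for example `check_tbi.sh`), it could be run as `sh check_tbi.sh /path/to/known_indels.vcf.gz`. When moving reference files between machines, copying with `cp -p` or `rsync -t` preserves modification times, which should keep the order of the copy from mattering.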
