# RealignerTargetCreator issue: not enough memory provided to run this program

Member

Hi,

I want to run RealignerTargetCreator with this command line:

qsub -b Y -N RTC -q bigmem.q "/usr/local/java/latest/bin/java -Xmx36g -jar /home/sabotf/sources/GenomeAnalysisTK/GenomeAnalysisTK.jar -T RealignerTargetCreator -R /data/projects/assembling-glab/PacBio_test/XL/filtered_subreads_XL.fasta -o /data/projects/assembling-glab/mappingResults/Tog5681Clean_vs_CG14_XL/output.intervals -I /data/projects/assembling-glab/mappingResults/Tog5681Clean_vs_CG14_XL/rmdup.bam" 
but it returns this error:

##### ERROR ------------------------------------------------------------------------------------------

##### ERROR ------------------------------------------------------------------------------------------

I tried with -Xmx4g, then -Xmx12g, then -Xmx48g, and always got the same error.
I don't know what to do...
Any ideas?
Thanks

Hmm, this should run without needing quite that much memory. Can you try with the latest version?

Also, what is the size of your dataset? Is it very large?

• Member

Yeah, that's what I think too, because I have already used this tool on bigger data and it worked with only -Xmx4g! Sorry, how can I find out which version I am using, and where can I get the latest one? Thanks

The version is stated in the console at the beginning of each run and then again in the error message if there is one, as here:

ERROR A USER ERROR has occurred (version 2.3-9-ge5ebf34)


You can get the newest version (currently 2.4-9 but 2.5 is coming out in a few days) from the Downloads page of our website (link in the top menu bar).
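
If the console output has been saved to a log file, the version can also be pulled out of that error line with a short helper. This is only a minimal sketch: it assumes the line contains `(version X)` in the form quoted above, and `gatk_version_from_line` is a hypothetical name, not something GATK provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): print the version string found in a
# GATK error line of the form "... has occurred (version X)".
gatk_version_from_line() {
    # sed -n plus the p flag prints only when the substitution matched,
    # so a line without "(version ...)" produces no output.
    printf '%s\n' "$1" | sed -n 's/.*(version \([^)]*\)).*/\1/p'
}

# Example with the error line quoted in this thread.
gatk_version_from_line "ERROR A USER ERROR has occurred (version 2.3-9-ge5ebf34)"
```

The same `sed` expression can be applied to a whole log file if the error line is somewhere inside it.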

• Member

Ok thanks, I'll try it.

• Member

OK, I tried the latest version with -Xmx4g, -Xmx16g and -Xmx32g, and always got the same error... Any ideas? Thanks

• Member

OK, after a lot of checking, I think the problem came from an earlier step, during my mapping. Thanks.

• Member

Hi Cecmonat,

I am also getting the same memory problem. I tried with -Xmx4g and -Xmx16g, still no solution.

Please let me know how to solve it.

Thanks

• EuropeMember

Hi all,

Thanks for GATK. It has been an extremely useful tool and we use it on a daily basis.
But recently I have repeatedly run into a problem, which is the same one cecmonat describes.

I have to align whole-genome samples. For that I use the GEM aligner (http://www.nature.com/nmeth/journal/v9/n12/abs/nmeth.2221.html).

As suggested in your Best Practices, I do the duplicate marking on single lanes and afterwards merge and sort the files using novosort.

The TargetCreator works fine:
Program Args: -nt 8 -T RealignerTargetCreator -R hg19.fasta -I ###.sort.bam -o ###.intervals -known dbindel137_121217.vcf --minReadsAtLocus 6 --maxIntervalSize 200 --downsampling_type NONE 

As soon as I run the indel realignment, I get the error message that I provided too little memory. I tried -Xmx20g, -Xmx35g, up to -Xmx95g, and still get the same error. As cecmonat said, there might be errors from the mapping. But I have already used essentially the same settings successfully for >500 exome sequences, so I don't have a clue what could be wrong.

    java -Xmx35g -jar $GATK -T IndelRealigner -R $REF -I $TMPDIR/$NAME.sort.bam -targetIntervals $TMPDIR/$NAME.intervals -o $TMPDIR/$NAME.realigned.bam -known $DBINDEL --maxReadsForRealignment 10000 --consensusDeterminationModel USE_SW --downsampling_type NONE

    Program Args: -T IndelRealigner -R hg19.fasta -I ###.sort.bam -targetIntervals ###.intervals -o ###.realigned.bam -known dbindel137_121217.vcf --maxReadsForRealignment 10000 --consensusDeterminationModel USE_SW --downsampling_type NONE
    ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    ReadShardBalancer$1 - Loading BAM index data for next contig
    ReadShardBalancer$1 - Done loading BAM index data for next contig
    ProgressMeter - Starting 0.00e+00 30.0 s 49.6 w 100.0% 30.0 s 0.0 s
    ProgressMeter - Starting 0.00e+00 74.0 s 123.6 w 100.0% 74.0 s 0.0 s
    ##### ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program. See the -Xmx JVM argument to adjust the maximum heap size provided to Java

I'd be glad if you could point me to my error.

thanks,
Oliver

PS: GATK versions tried: 2.3-9-ge5ebf34, 2.4-9-g532efad, 2.6-4-g3e5ff60

I think the problem in your case is that you're running with USE_SW, which is very memory-intensive, especially at areas of deep coverage. If you really want to go with the SW realignment, you might want to try reducing the max reads for realignment argument.
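
As a rough illustration of that suggestion, the sketch below only prints an IndelRealigner command line with a lower read cap, it does not run anything. The file names are stand-ins based on the command posted above, `GenomeAnalysisTK.jar` is assumed to be in the working directory, the value 2000 is an arbitrary example rather than a recommended setting, and `build_indelrealigner_cmd` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: prints an IndelRealigner command line instead of running it.
# File names are illustrative stand-ins; the read cap passed in is an
# assumption to be tuned for your own data.
build_indelrealigner_cmd() {
    local max_reads="$1"   # value for --maxReadsForRealignment
    local model="$2"       # e.g. USE_SW or KNOWNS_ONLY
    # printf repeats the '%s ' format for every argument, joining them with spaces.
    printf '%s ' \
        java -Xmx35g -jar GenomeAnalysisTK.jar \
        -T IndelRealigner \
        -R hg19.fasta \
        -I sample.sort.bam \
        -targetIntervals sample.intervals \
        -o sample.realigned.bam \
        -known dbindel137_121217.vcf \
        --maxReadsForRealignment "$max_reads" \
        --consensusDeterminationModel "$model"
    printf '\n'
}

# Keep Smith-Waterman realignment but lower the cap from 10000 to 2000 reads.
build_indelrealigner_cmd 2000 USE_SW
```

Printing the command first makes it easy to compare a few cap values before submitting the real job.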

• EuropeMember

Hi Geraldine,

Thanks for the immediate answer, and sorry for my delayed response.

I switched USE_SW to KNOWNS_ONLY and subsequently reduced --maxReadsForRealignment to 500. I still get the error message.

If I run the IndelRealigner on a single lane, it works fine.
Are there any adverse effects I should expect if I proceed with single-lane realignment and recalibration and merge the lanes just before variant calling? I know it's against the "best practice", but since I am not making any progress I would already be OK with "good practice".

thanks a lot,
Oliver

Hi Oliver,

That's actually completely fine -- these pre-processing steps should typically be done per lane. Then you merge the lane data per sample, and optionally repeat the dedup & realign steps (but that's not required) before finally calling variants. I'm working on rewriting the best practices doc to make that clearer.
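
The per-lane order described here can be sketched as a dry run that only prints the commands. Everything concrete in it is an assumption for illustration: the lane and sample names, the `.dedup.bam` naming, `hg19.fasta`, and the use of Picard MergeSamFiles for the merge step (any BAM merging tool, such as novosort, would fill the same role). `plan_per_lane_realignment` is a hypothetical helper name, and the recalibration step is left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch only: prints the per-lane commands rather than executing them.
# Lane names, file naming, the reference and the merge tool are assumptions.
plan_per_lane_realignment() {
    local sample="$1"; shift   # remaining arguments are lane names
    local merge_inputs=""
    local lane
    for lane in "$@"; do
        # Step 1 per lane: find intervals that need realignment.
        echo "java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R hg19.fasta -I ${lane}.dedup.bam -o ${lane}.intervals"
        # Step 2 per lane: realign around those intervals.
        echo "java -jar GenomeAnalysisTK.jar -T IndelRealigner -R hg19.fasta -I ${lane}.dedup.bam -targetIntervals ${lane}.intervals -o ${lane}.realigned.bam"
        merge_inputs="${merge_inputs} I=${lane}.realigned.bam"
    done
    # Final step: merge the realigned lanes into one BAM per sample.
    echo "java -jar picard.jar MergeSamFiles${merge_inputs} O=${sample}.merged.bam"
}

plan_per_lane_realignment sampleA lane1 lane2
```

The optional second round of dedup and realignment on the merged file would follow the merge step in the same way.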

• EuropeMember

Hi Geraldine,

Thanks a lot.
So I proceed lane-wise, and if the whole-sample data still fail I keep going towards SNP and indel calling. Identical read group tags would then be merged into one observation.

Sorry for the misunderstanding.

cheers,
Oliver