The current GATK version is 3.3-0

# RealignerTargetCreator issue: "did not provide enough memory to run this program"

Posts: 5Member

Hi,

I want to run RealignerTargetCreator with this command line:

qsub -b Y -N RTC -q bigmem.q "/usr/local/java/latest/bin/java -Xmx36g -jar /home/sabotf/sources/GenomeAnalysisTK/GenomeAnalysisTK.jar -T RealignerTargetCreator -R /data/projects/assembling-glab/PacBio_test/XL/filtered_subreads_XL.fasta -o /data/projects/assembling-glab/mappingResults/Tog5681Clean_vs_CG14_XL/output.intervals -I /data/projects/assembling-glab/mappingResults/Tog5681Clean_vs_CG14_XL/rmdup.bam"
but it returns this error:

##### ERROR ------------------------------------------------------------------------------------------

##### ERROR ------------------------------------------------------------------------------------------

I tried -Xmx4g, then -Xmx12g, then -Xmx48g, and always got the same error.
I don't know what to do...
Any ideas?
Thanks

Hmm, this should run without needing quite that much memory. Can you try with the latest version?

Also, what is the size of your dataset? Is it very large?

Geraldine Van der Auwera, PhD

• Posts: 5Member

Yes, that's what I thought too, because I have already used this tool on bigger data sets, and it worked with only -Xmx4g! Sorry, but how can I find out which version I am using, and where can I get the latest one? Thanks

The version is stated in the console at the beginning of each run and then again in the error message if there is one, as here:

ERROR A USER ERROR has occurred (version 2.3-9-ge5ebf34)


You can get the newest version (currently 2.4-9 but 2.5 is coming out in a few days) from the Downloads page of our website (link in the top menu bar).

Geraldine Van der Auwera, PhD
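If the console output of a run was saved to a file, the version can be pulled out of it rather than read by eye. The following is a minimal sketch, assuming the log contains a line in the format quoted above; `gatk_version_from_log` is a hypothetical helper name, not part of GATK.

```shell
# Hypothetical helper (not part of GATK): print the first version string
# found in a saved GATK console log. Assumes the log has a line such as
#   ERROR A USER ERROR has occurred (version 2.3-9-ge5ebf34)
gatk_version_from_log() {
    # Keep only the text between "(version " and the closing ")".
    sed -n 's/.*(version \([^)]*\)).*/\1/p' "$1" | head -n 1
}
```

For a job submitted through a scheduler, the argument would be whatever file the scheduler wrote the job's output to.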

• Posts: 5Member

Ok thanks, I'll try it.

• Posts: 5Member

OK, I tried the latest version with -Xmx4g, -Xmx16g and -Xmx32g, and always got the same error... Any ideas? Thanks

• Posts: 5Member

OK, after a lot of checking, I think the problem arose earlier, during my mapping. Thanks.

• Posts: 10Member

Hi Cecmonat,

I am also getting the same memory problem. I tried -Xmx4g and -Xmx16g, but still no solution.

Please let me know how to solve it.

Thanks

• Posts: 6Member

Hi all,

Thanks for GATK. It's been an extremely useful tool and we use it on a daily basis.
But recently I have repeatedly run into a problem, which is the same one cecmonat describes.

I have to align whole-genome samples. For that I use the GEM aligner (http://www.nature.com/nmeth/journal/v9/n12/abs/nmeth.2221.html).

As suggested in your Best Practices, I do the duplicate marking on single lanes and afterwards merge and sort the files using novosort.

The TargetCreator works fine:
Program Args: -nt 8 -T RealignerTargetCreator -R hg19.fasta -I ###.sort.bam -o ###.intervals -known dbindel137_121217.vcf --minReadsAtLocus 6 --maxIntervalSize 200 --downsampling_type NONE

As soon as I run the indel realignment, I get the error message that I provided too little memory. I tried -Xmx20g, -Xmx35g and up to -Xmx95g, and I still get the same error. As cecmonat said, there might be errors from the mapping. I have already used basically the same settings successfully for >500 exome sequences, so I don't have a clue what could be wrong.

java -Xmx35g -jar GATK -T IndelRealigner -RREF -I $TMPDIR/$NAME.sort.bam -targetIntervals $TMPDIR/$NAME.intervals -o $TMPDIR/$NAME.realigned.bam -known DBINDEL --maxReadsForRealignment 10000 --consensusDeterminationModel USE_SW --downsampling_type NONE

Program Args: -T IndelRealigner -R hg19.fasta -I ###.sort.bam -targetIntervals ###.intervals -o ###.realigned.bam -known dbindel137_121217.vcf --maxReadsForRealignment 10000 --consensusDeterminationModel USE_SW --downsampling_type NONE

ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
ReadShardBalancer$1 - Loading BAM index data for next contig
ReadShardBalancer$1 - Done loading BAM index data for next contig
ProgressMeter - Starting 0.00e+00 30.0 s 49.6 w 100.0% 30.0 s 0.0 s
ProgressMeter - Starting 0.00e+00 74.0 s 123.6 w 100.0% 74.0 s 0.0 s

#### ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program. See the -Xmx JVM argument to adjust the maximum heap size provided to Java

I'd be glad if you could point me to my error.

thanks,
Oliver

PS: GATK versions tried: 2.3-9-ge5ebf34, 2.4-9-g532efad, 2.6-4-g3e5ff60

I think the problem in your case is that you're running with USE_SW, which is very memory-intensive, especially in areas of deep coverage. If you really want to go with the SW realignment, you might want to try reducing the max reads for realignment argument (--maxReadsForRealignment).

Geraldine Van der Auwera, PhD
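As a rough illustration of that suggestion, the sketch below prints the same IndelRealigner command with different values for `--maxReadsForRealignment`. The file names, the `realigner_cmd` helper and the value 2000 are assumptions chosen for illustration, not recommendations; only the tool name and the argument names come from the command posted above. Whether a lower cap is enough will depend on the data.

```shell
# Placeholder paths (assumptions): replace with your own files.
GATK_JAR=GenomeAnalysisTK.jar
REF=hg19.fasta
BAM=sample.sort.bam
INTERVALS=sample.intervals
OUT=sample.realigned.bam
KNOWN=known_indels.vcf

# Hypothetical helper: print (not run) an IndelRealigner command line.
#   $1 = value for --maxReadsForRealignment
#   $2 = value for --consensusDeterminationModel (e.g. USE_SW, KNOWNS_ONLY)
realigner_cmd() {
    echo "java -Xmx35g -jar $GATK_JAR -T IndelRealigner -R $REF -I $BAM" \
         "-targetIntervals $INTERVALS -o $OUT -known $KNOWN" \
         "--maxReadsForRealignment $1 --consensusDeterminationModel $2"
}

realigner_cmd 10000 USE_SW   # the setting from the post above
realigner_cmd 2000 USE_SW    # same model, lower read cap (illustrative value)
```

Printing the command first makes it easy to check the arguments before submitting the job.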

• Posts: 6Member

Hi Geraldine,

Thanks for the immediate answer, and sorry for my delayed response.

I switched USE_SW to KNOWNS_ONLY and subsequently reduced maxReadsForRealignment to 500. I still get the error message.

If I run the IndelRealigner on a single lane, it works fine.
Are there any adverse effects I should expect if I proceed with single-lane realignment and recalibration and merge the lanes just before variant calling? I know it's against the "best practice", but since I am not making any progress, I would already be OK with "good practice".

thanks a lot,
Oliver

Hi Oliver,

That's actually completely fine -- these pre-processing steps should typically be done per lane. Then you merge the lane data per sample, and optionally repeat the dedup & realign steps (but that's not required) before finally calling variants. I'm working on rewriting the best practices doc to make that clearer.

Geraldine Van der Auwera, PhD
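The per-lane order described above can be sketched as a dry run that only prints the commands. Everything concrete here is an assumption for illustration: the `per_lane_plan` helper, the file naming scheme, the -Xmx4g setting, and the use of `samtools merge` for the merge step. Per-lane recalibration and the optional post-merge dedup and realignment are left out for brevity.

```shell
# Dry-run sketch: prints the commands instead of running them.
# Hypothetical helper; file names follow an assumed <lane>.dedup.bam scheme.
#   $1 = reference FASTA, $2 = known indels VCF, $3 = sample name,
#   remaining arguments = lane names
per_lane_plan() {
    ref="$1"; known="$2"; sample="$3"
    shift 3
    merged_inputs=""
    for lane in "$@"; do
        # Step 1 (per lane): find the intervals that need realignment.
        echo "java -Xmx4g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R $ref -I $lane.dedup.bam -known $known -o $lane.intervals"
        # Step 2 (per lane): realign reads over those intervals.
        echo "java -Xmx4g -jar GenomeAnalysisTK.jar -T IndelRealigner -R $ref -I $lane.dedup.bam -targetIntervals $lane.intervals -known $known -o $lane.realigned.bam"
        merged_inputs="$merged_inputs $lane.realigned.bam"
    done
    # Step 3 (per sample): merge the processed lanes before variant calling.
    echo "samtools merge $sample.merged.bam$merged_inputs"
}

per_lane_plan hg19.fasta known_indels.vcf sampleA lane1 lane2
```

Because each lane is processed on its own, the memory needed at any one step is bounded by the largest lane rather than by the whole sample.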

• Posts: 6Member

Hi Geraldine,

Thanks a lot.
So I will proceed lane-wise, and if the whole-sample data still fails, I will carry on to SNP and indel calling. Identical read group tags would then be merged into one observation.

Sorry for the misunderstanding.

cheers,
Oliver