
Haplotype Caller: how to determine java memory requirements?

Katie (United States, Member)

Hi, I am interested in calling SNPs for a set of 150 bacterial genomes (genome size ~1 Mb). I'm attempting to use HaplotypeCaller and am running into a Java memory error: "There was a failure because you did not provide enough memory to run this program. See the -Xmx JVM argument to adjust the maximum heap size provided to Java."

The estimated run time is ~11 days. I have increased the maximum heap size to 20g (-Xmx20g) and am also limiting max_alternate_alleles, as shown below:

java -d64 -Xmx20g -jar $EXECGATK \
-T HaplotypeCaller \
-R $REF \
-stand_call_conf 20 \
-stand_emit_conf 20 \
--sample_ploidy 1 \
--maxNumHaplotypesInPopulation 198 \
--max_alternate_alleles 3 \
-L "gi|15594346|ref|NC_001318.1|" \
-o ${OUTPATH}${BASE}.chr.snps.indels.vcf

Is there a way to call only SNPs? My understanding is that indel calling is memory intensive, and I am going to focus on SNPs for this part of my analysis. Alternatively, is there another way to make this analysis more efficient?

Thank you!


