# GATK 3.0 HaplotypeCaller RUNTIME ERROR

NJ
edited March 19

I am running HaplotypeCaller on ~600 BAMs to produce gVCF calls on a cluster (SGE qsub system). Each node has 32 cores and 256 GB RAM. I am running 8 tasks per node, so each task has 4 cores and 32 GB of memory.

It's been running for ~30 hours now (a single BAM needs ~5 hours). Of the ~500 finished tasks, 24 quit with the error shown below. I am not sure whether the error is caused by my command or by something else. Can anyone offer suggestions?

## The command:

java -Xmx32G GATK3.0 \
-T HaplotypeCaller \
-ERC gVCF -L EZ_Exome_v2.bed \
-variant_index_type LINEAR \
-variant_index_parameter 128000 \
-R ucsc.hg19.fasta \
-nct 4 \
--dbsnp dbsnp_138.hg19.vcf \
-I 1.recalibrated.bam \
-o 1.recalibrated.vcf

##### ERROR A GATK RUNTIME ERROR has occurred (version 2014.2-3.1.7-10-g867c2fb):
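For anyone triaging a batch like this, here is a minimal sketch for listing which task logs contain the runtime error banner, so only those samples need to be resubmitted. It assumes one log file per task, named `*.log`, in a single directory; that layout and the function name `list_failed_logs` are hypothetical and not taken from the original post.

```shell
#!/usr/bin/env bash
# Assumption: every task writes its GATK output to its own file, e.g. logs/1.log.
# list_failed_logs DIR prints the paths of logs that contain the error banner.
list_failed_logs() {
    local log_dir="$1"
    # -l prints only the names of matching files;
    # -F treats the banner as a fixed string rather than a regex.
    grep -l -F "ERROR A GATK RUNTIME ERROR" "$log_dir"/*.log 2>/dev/null
}
```

Calling `list_failed_logs logs` prints one path per failed task. The function returns a non-zero status when nothing matches, which is worth remembering if the calling script uses `set -e`.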



A thread safety error is a bit of a worry!

Yes it is a concern, that is why we are favoring alternative ways to speed up HC without using multithreading.

Geraldine Van der Auwera, PhD

Cambridge, MA

Just to chime in, I've run into this problem, too, on GATK 3.2-2 calling WGS samples with -nct 2. Some of my samples have run happily without any incident for nearly one day before failing.

Geraldine, is there a current recommendation or best-practice for speeding up HC on large BAMs?

Grace

Hi Grace,

Right now the only recommendation we have is to use Queue to parallelize your HC jobs. And of course, make sure you're using the new workflow to run HC per sample in GVCF mode, followed by joint genotyping.

Geraldine Van der Auwera, PhD
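For readers without a Queue setup, the same idea (scatter over intervals and run each piece single-threaded instead of relying on `-nct`) can be sketched by hand. This is only an illustration and not what Queue does internally: the function names, the chunk file naming and the `-Xmx8G` value are assumptions, `GATK3.0` is the original poster's shorthand for the GATK invocation, and the remaining arguments are copied from the command at the top of the thread. Combining the per-chunk outputs afterwards is not covered.

```shell
#!/usr/bin/env bash
# split_bed BED N PREFIX: distribute the intervals of BED round-robin into
# N files named PREFIX_00.bed, PREFIX_01.bed, ... (hypothetical naming).
# Intended for modest N; some awk builds limit the number of open files.
split_bed() {
    local bed="$1" n="$2" prefix="$3"
    awk -v n="$n" -v p="$prefix" '
        # Skip header lines that some BED files carry.
        /^(track|browser|#)/ { next }
        # i counts data lines only, so chunks stay balanced.
        { out = sprintf("%s_%02d.bed", p, i++ % n); print > out }
    ' "$bed"
}

# print_hc_commands PREFIX SAMPLE: echo one HaplotypeCaller command per chunk,
# without -nct. This is a dry run; nothing is executed.
print_hc_commands() {
    local prefix="$1" sample="$2" chunk
    for chunk in "$prefix"_*.bed; do
        # -Xmx8G is a placeholder; size it to your own per-task memory.
        echo "java -Xmx8G GATK3.0 -T HaplotypeCaller -ERC gVCF" \
             "-L $chunk -variant_index_type LINEAR" \
             "-variant_index_parameter 128000 -R ucsc.hg19.fasta" \
             "--dbsnp dbsnp_138.hg19.vcf -I $sample.recalibrated.bam" \
             "-o $sample.$(basename "$chunk" .bed).vcf"
    done
}
```

For example, `split_bed EZ_Exome_v2.bed 8 chunks/ez` followed by `print_hc_commands chunks/ez 1` would print eight commands that could each be submitted as a separate single-core task (the `chunks` directory must already exist). Round-robin assignment is used here so that each chunk gets a similar number of intervals while staying in the original sort order.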

Garvan Institute of Medical Research

Hi Geraldine, we've seen this bug a few times now as well. We're already using Queue, so the jobs do eventually run to completion, so I guess +1 vote for fixing this bug, please!

I've just noticed that we're using the latest HaplotypeCaller via Queue with -nt 1 and -nct 4 for exome analysis. Would you say this is overkill?

cheers, Mark


To add to Mark's question, we're using scatterCount = 400 with -nt 1 and -nct 4.