HaplotypeCaller (version 3.1-1-g07a4bf8) java.lang.NullPointerException

Dear GATK team,
I can find similar but not quite identical errors in the forum. Your help would be appreciated.
Cheers,
Mark

Details follow:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeDiploidHaplotypeLikelihoods(PairHMMLikelihoodCalculationEngine.java:443)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeDiploidHaplotypeLikelihoods(PairHMMLikelihoodCalculationEngine.java:417)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.calculateGLsForThisEvent(GenotypingEngine.java:385)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.assignGenotypeLikelihoods(GenotypingEngine.java:222)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:880)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:141)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

Program version:
version 3.1-1-g07a4bf8

Command line:
java -Xmx12g -Djava.io.tmpdir=$tmpDir -jar $GATKPATH/GenomeAnalysisTK.jar \
  -I $WORKDIR/$OUTPREFIX.realigned.recal.sorted.bwa.$BUILD.bam \
  -R $GATKREFPATH/$GATKINDEX \
  -T HaplotypeCaller \
  --dbsnp $GATKREFPATH/dbsnp_137.hg19.vcf \
  --min_base_quality_score 20 \
  --emitRefConfidence GVCF \
  --variant_index_type LINEAR \
  --variant_index_parameter 128000 \
  -o $gVcfFolder/$OUTPREFIX.snps.gvcf \
  -nct 8 >> $WORKDIR/$OUTPREFIX.pipeline.log 2>&1

INFO 00:45:18,861 HelpFormatter - --------------------------------------------------------------------------------
INFO 00:45:18,864 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.1-1-g07a4bf8, Compiled 2014/03/18 06:09:21
INFO 00:45:18,865 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 00:45:18,865 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 00:45:18,872 HelpFormatter - Program Args: -I /scratch/$USER/BWA-GATK-20140327/23350.realigned.recal.sorted.bwa.hg19_1stM_unmask_ran_all.bam -R /home/users/$USER/RefSeq/GATK/hg19_1stM_unmask_ran_all.fa -T HaplotypeCaller --dbsnp /home/users/$USER/RefSeq/GATK/dbsnp_137.hg19.vcf --min_base_quality_score 20 --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -o /home/users/$USER/DATA/gVcfFileLibrary/23350.snps.gvcf -nct 8
INFO 00:45:18,877 HelpFormatter - Executing as [email protected]$MACHINE on Linux 2.6.32-279.2.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_09-b05.
INFO 00:45:18,878 HelpFormatter - Date/Time: 2014/03/29 00:45:18
INFO 00:45:18,879 HelpFormatter - --------------------------------------------------------------------------------
INFO 00:45:18,879 HelpFormatter - --------------------------------------------------------------------------------

Prior to the error:
INFO 07:47:17,104 ProgressMeter - chr10:42388309 1.68e+09 7.0 h 15.0 s 55.5% 12.7 h 5.6 h
INFO 07:47:29,371 GATKRunReport - Uploaded run statistics report to AWS S3

Answers

  • Dear Geraldine,
    Thanks for your helpful suggestions, and apologies for the slow turnaround (I was waiting in line for access to our high-performance computing (HPC) cluster).

    ValidateSamFile: no major errors.

    I had a few other "not enough memory" errors when running similar jobs concurrently with this one, which made me suspicious. I reran this job by itself, increased the memory allocation from 12g to 16g, and decreased the number of threads from 8 to 4. The error disappeared. I suspect this is not a specific problem with GATK and has more to do with how I'm running it on our local HPC, possibly with where I'm directing the temporary files (unfortunately, my ignorance of the inner workings of HPCs prevents a better explanation).

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    No problem -- glad to hear the problem goes away by itself when more memory is available!
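
For reference, the rerun described in the first answer (16g heap, 4 threads, run on its own) might look like the sketch below. It reuses only the flags from the original command line. The JAVA_HEAP and NCT variables and the run_hc wrapper are hypothetical names introduced here for illustration, and the environment variables ($tmpDir, $GATKPATH, $WORKDIR and so on) are assumed to be set as in the original pipeline script.

```shell
#!/bin/sh
# Sketch of the rerun described above: the original HaplotypeCaller arguments,
# with the Java heap raised from 12g to 16g and -nct lowered from 8 to 4.
# JAVA_HEAP, NCT and run_hc are hypothetical names introduced for illustration.
JAVA_HEAP="${JAVA_HEAP:-16g}"
NCT="${NCT:-4}"

# run_hc [prefix...]: runs the command, or hands it to a prefix such as `echo`.
run_hc() {
  "$@" java "-Xmx${JAVA_HEAP}" "-Djava.io.tmpdir=${tmpDir}" \
    -jar "${GATKPATH}/GenomeAnalysisTK.jar" \
    -I "${WORKDIR}/${OUTPREFIX}.realigned.recal.sorted.bwa.${BUILD}.bam" \
    -R "${GATKREFPATH}/${GATKINDEX}" \
    -T HaplotypeCaller \
    --dbsnp "${GATKREFPATH}/dbsnp_137.hg19.vcf" \
    --min_base_quality_score 20 \
    --emitRefConfidence GVCF \
    --variant_index_type LINEAR \
    --variant_index_parameter 128000 \
    -o "${gVcfFolder}/${OUTPREFIX}.snps.gvcf" \
    -nct "${NCT}"
}

# Dry run: print the command for review instead of launching GATK.
run_hc echo

# Real run, appending to the pipeline log as in the original command:
# run_hc >> "${WORKDIR}/${OUTPREFIX}.pipeline.log" 2>&1
```

Keeping the heap size and thread count in variables makes it easy to try other combinations if the scheduler's per-job memory limit differs; whether 16g and 4 threads are the right values for a given cluster is not something this thread establishes.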
