
HaplotypeCaller runtime error with --emitRefConfidence GVCF

I am trying to run HaplotypeCaller on a BAM file that contains multiple samples. It runs successfully without the --emitRefConfidence (ERC) GVCF option, e.g.:

java -jar /home/unix/csmillie/bin/GenomeAnalysisTK.jar -T HaplotypeCaller -R ref.fasta -I test.bam

But when I try running it with the ERC GVCF option, I get an error:

java -jar /home/unix/csmillie/bin/GenomeAnalysisTK.jar -T HaplotypeCaller -R ref.fasta -I test.bam --emitRefConfidence GVCF --sample_name TCGGCTGAGAAC

I am using Java 1.7. I have validated the bam file with Picard. The bam file has the appropriate header, with tab-separated read groups that look like this:

The stack trace is below. If anyone can help I would really appreciate it! I am running this on an interactive node on the Broad cluster, in case it helps with debugging. Thanks!

hw-uger-1001:~/data/csmillie/test $ java -jar /home/unix/csmillie/bin/GenomeAnalysisTK.jar -T HaplotypeCaller -R ref.fasta -I test.bam --emitRefConfidence GVCF --sample_name TCGGCTGAGAAC
INFO 09:56:10,853 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:56:10,855 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 09:56:10,855 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 09:56:10,855 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 09:56:10,859 HelpFormatter - Program Args: -T HaplotypeCaller -R ref.fasta -I test.bam --emitRefConfidence GVCF --sample_name TCGGCTGAGAAC
INFO 09:56:10,877 HelpFormatter - Executing as [email protected] on Linux 2.6.32-573.12.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_71-b14.
INFO 09:56:10,877 HelpFormatter - Date/Time: 2016/02/08 09:56:10
INFO 09:56:10,878 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:56:10,878 HelpFormatter - --------------------------------------------------------------------------------
INFO 09:56:11,500 GenomeAnalysisEngine - Strictness is SILENT
INFO 09:56:12,598 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 09:56:12,606 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 09:56:12,760 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.15
INFO 09:56:12,972 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 09:56:13,128 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 09:56:13,732 GenomeAnalysisEngine - Done preparing for traversal
INFO 09:56:13,733 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 09:56:13,734 ProgressMeter - Location | active regions | elapsed | active regions | completed | runtime | runtime
INFO 09:56:13,734 HaplotypeCaller - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
INFO 09:56:13,735 HaplotypeCaller - All sites annotated with PLs forced to true for reference-model confidence output
INFO 09:56:14,806 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.isGVCF(HaplotypeCaller.java:1251)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.initializeReferenceConfidenceModel(HaplotypeCaller.java:728)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.initialize(HaplotypeCaller.java:659)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

Best Answer


  • Sheila, Broad Institute (Member, Broadie)


    It looks like -sn (--sample_name) might not be working. Can you confirm this by running HaplotypeCaller on a single-sample BAM file? You can use PrintReads to extract a single-sample BAM file from the one you are working with.
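
A minimal sketch of that extraction step, assuming GATK 3.x PrintReads and its `-sn` (`--sample_name`) argument. The jar path, reference, input BAM, and sample name are taken from the question; the output name `single_sample.bam` is a placeholder. The script only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Dry-run sketch: prints the PrintReads command instead of executing it.
GATK_JAR=/home/unix/csmillie/bin/GenomeAnalysisTK.jar
REF=ref.fasta
INPUT_BAM=test.bam
SAMPLE=TCGGCTGAGAAC
SINGLE_BAM=single_sample.bam   # hypothetical output name, not from the thread

# Build the command that keeps only the reads belonging to one sample.
printreads_cmd() {
    echo "java -jar $GATK_JAR -T PrintReads -R $REF -I $INPUT_BAM -sn $SAMPLE -o $SINGLE_BAM"
}

printreads_cmd
```

The GVCF command from the question can then be rerun with `-I` pointing at the extracted file to see whether the error persists.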


  • csmillie, broad (Member)

    Wow, that fixed it! Thanks, Geraldine. I spent forever trying to debug, but didn't think about the output option.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    The simplest things are sometimes the hardest to catch :)
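
The fix itself is not quoted in the thread, but csmillie's reply points at the output option. Assuming that means giving HaplotypeCaller an explicit output file with `-o`, the failing command from the question might become something like the sketch below. The file name `output.g.vcf` is a placeholder, and the script prints the command rather than running it.

```shell
#!/bin/sh
# Dry-run sketch: prints the HaplotypeCaller command instead of executing it.
GATK_JAR=/home/unix/csmillie/bin/GenomeAnalysisTK.jar
OUT=output.g.vcf   # hypothetical output name, not from the thread

# Same arguments as the failing command in the question, plus an explicit -o.
# That -o was the missing piece is an assumption based on the reply above.
gvcf_cmd() {
    echo "java -jar $GATK_JAR -T HaplotypeCaller -R ref.fasta -I test.bam --emitRefConfidence GVCF --sample_name TCGGCTGAGAAC -o $OUT"
}

gvcf_cmd
```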
