IndexOutOfBoundsException in HaplotypeCaller (GATK 3.2)

Hi,

I get an error when running HaplotypeCaller (GATK version 3.2 and the latest nightly build) on a specific BAM file:

java.lang.IndexOutOfBoundsException: Index: 28, Size: 6
at java.util.LinkedList.checkElementIndex(Unknown Source)
at java.util.LinkedList.get(Unknown Source)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.mergeDanglingTail(DanglingChainMergingGraph.java:272)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.recoverDanglingTail(DanglingChainMergingGraph.java:184)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.recoverDanglingTails(DanglingChainMergingGraph.java:131)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:202)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:114)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.LocalAssemblyEngine.runLocalAssembly(LocalAssemblyEngine.java:164)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:1022)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:882)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:218)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:273)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:314)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107

It occurs in both normal VCF and GVCF mode:
java -XX:ParallelGCThreads=1 -Xmx4g -jar GenomeAnalysisTK.jar \
-R hg19.fa -I error.bam \
-L test2.bed \
-T HaplotypeCaller -o test.gatk.ontarget.vcf

java -XX:ParallelGCThreads=1 -Xmx4g -jar GenomeAnalysisTK.jar \
-R hg19.fa -I error.bam \
-L test2.bed \
-T HaplotypeCaller --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -o test.gatk.ontarget.gvcf

The BAM file was produced according to the best practice guide. I narrowed the error down to 15 reads (see attached files).

Best Regards,
Thomas

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Ok, we have a fix for this, and we're going to make a bug fix release for it since it's a bad one. I'll let you know when it's out; should be later today.

  • Hi,
    I get the same error in HaplotypeCaller with the GenomeAnalysisTK-3.2-0 build:

    INFO 14:21:43,751 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
    INFO 14:21:43,751 VectorLoglessPairHMM - Using vectorized implementation of PairHMM
    INFO 14:22:12,103 ProgressMeter - 4:42640524 0.0 30.0 s 49.6 w 2.4% 21.0 m 20.5 m
    INFO 14:22:42,104 ProgressMeter - 4:48897528 0.0 60.0 s 99.2 w 6.5% 15.4 m 14.4 m
    INFO 14:22:46,476 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IndexOutOfBoundsException: Index: 32, Size: 32
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.mergeDanglingTail(DanglingChainMergingGraph.java:272)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.recoverDanglingTail(DanglingChainMergingGraph.java:184)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.recoverDanglingTails(DanglingChainMergingGraph.java:131)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:202)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:114)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.LocalAssemblyEngine.runLocalAssembly(LocalAssemblyEngine.java:164)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:1022)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:882)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:218)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.2-0-g289df4b):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Index: 32, Size: 32
    ERROR ------------------------------------------------------------------------------------------

    I ran this command:

    java -jar /home/agatha/GenomeAnalysisTK-3.2-0/GenomeAnalysisTK.jar \
    -nct 8 \
    -T HaplotypeCaller \
    -R hs37d5.fa \
    -I merged_HSP27.1.sorted.bam.dedup_reads.realigned.recal_reads.bam \
    -I merged_HSP27.6.sorted.bam.dedup_reads.realigned.recal_reads.bam \
    -L 4:39024951-191154276 \
    --genotyping_mode DISCOVERY \
    -stand_emit_conf 10 \
    -stand_call_conf 30 \
    -o HSP27_variants.vcf

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    This bug should be fixed in the patched release from yesterday, version 3.2-2. Please try that and let me know if you still have issues.

  • Hi Geraldine,

    It works for me. Thanks for the great job.
    Best,

    Agatha

  • cbossu, Uppsala University (Member)

    Hi Geraldine, I'm getting the same error with HaplotypeCaller in version v3.2-2-gec30cee. It appears right after the warning that multiple VCF records were found at one site in my known alleles file:

    WARN 18:21:53,633 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site scaffold_78:4007, only considering the first record
    INFO 18:22:00,257 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
    etc...

    I was wondering whether you've encountered this error before, or whether it could be caused by having multiple entries per site, e.g.:
    scaffold_78 4007 . T C 6972.90 . set=DD-J

    scaffold_78 4007 . TC T,CC 414.07 . set=R

    Thanks!
    Christen

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Hi @cbossu,

    The program deals with the multiple records by just considering the first and ignoring the others, as stated in the warning, so that shouldn't be the problem.

    IndexOutOfBoundsException errors come in many shapes and sizes, so this is probably a different underlying issue than the others in this thread. Can you please post the rest of the stack trace so I can see which subroutine is throwing the error?
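
The two full stack traces posted earlier in this thread show different JDK frames at the top (LinkedList in one, ArrayList in the other) but reach the same first GATK frame, DanglingChainMergingGraph.mergeDanglingTail (DanglingChainMergingGraph.java:272). That first frame from the tool's own package is usually what identifies the subroutine at fault. As a rough sketch of how to pull it out of a pasted trace, the helper below scans for the first frame under a given package prefix. The class name `StackTraceTriage` and method `firstFrameIn` are hypothetical names made up for this illustration; they are not part of GATK.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of GATK.
class StackTraceTriage {

    // Returns the first stack frame whose class lives under packagePrefix,
    // with the leading "at " removed, or empty if no such frame is present.
    static Optional<String> firstFrameIn(String trace, String packagePrefix) {
        for (String line : trace.split("\\R")) {   // \R matches any line break
            String frame = line.trim();            // forum pastes are often indented
            if (frame.startsWith("at " + packagePrefix)) {
                return Optional.of(frame.substring(3));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // First lines of the second trace posted in this thread.
        String trace = String.join("\n",
            "java.lang.IndexOutOfBoundsException: Index: 32, Size: 32",
            "at java.util.ArrayList.rangeCheck(ArrayList.java:635)",
            "at java.util.ArrayList.get(ArrayList.java:411)",
            "at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.mergeDanglingTail(DanglingChainMergingGraph.java:272)");
        System.out.println(
            firstFrameIn(trace, "org.broadinstitute.").orElse("no GATK frame found"));
    }
}
```

Run against a trace that was cut off after the exception line (such as one ending in "etc..."), the helper returns an empty result, which is a sign that the rest of the trace is needed before the failure can be compared with the ones above.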
