HaplotypeCaller 2.4

I am getting the following error. What is the minimum read size for assembly? Are 50 base pairs too short?

ERROR stack trace

java.lang.IllegalStateException: Reads are too small for use in assembly.
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.DeBruijnAssembler.createDeBruijnGraphs(DeBruijnAssembler.java:139)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.DeBruijnAssembler.runLocalAssembly(DeBruijnAssembler.java:123)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:483)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:132)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:552)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:512)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:244)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.ja
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.j
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:1
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:24
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:15
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ---------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-3-g2a7af43):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Reads are too small for use in assembly.
ERROR ---------------------------------------------------------------------------------------

Answers

  • andrewseverin (Posts: 13, Member)

    Actually, the reads in this sample are 200 bases long. They are from a MiSeq run, and the same data works fine with version 2.3.

  • Geraldine_VdAuwera (Posts: 7,528; Administrator, GATK Developer)

    Ooh, first bug report for 2.4 already! That was fast.

    We'll take a look at it. Could you upload a snippet of your bam where the error occurs? We'll probably need to reproduce this locally to figure out what's going on here.

    Geraldine Van der Auwera, PhD

  • andrewseverin (Posts: 13, Member)

    Let me check a few other things and let this run finish. I will be interested to see whether the interval I am running this on returns any SNPs/indels or whether it exits.

    Speed has improved, though, by a factor of 4?

  • Geraldine_VdAuwera (Posts: 7,528; Administrator, GATK Developer)

    Sure, no problem. Ryan tells me it's probably a region where lots of reads are getting clipped and resulting in very short leftovers. Before 2.4 the HC would simply have skipped through the region silently, but now it's flipping out unnecessarily. I expect if you look at the interval in question that may confirm his supposition.

    Geraldine Van der Auwera, PhD
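
To illustrate the explanation above, here is a minimal sketch of how clipping can leave reads too short for assembly, and of skipping such reads rather than throwing. It is not GATK's code: the class `ClippedReadFilter`, its methods, and the k-mer size of 11 are all assumptions made for illustration, and the thread does not state the real threshold.

```java
import java.util.ArrayList;
import java.util.List;

public class ClippedReadFilter {
    // Hypothetical k-mer size; the assembler's real value is not given in this thread.
    static final int KMER_SIZE = 11;

    /** Length left after removing clipped bases from both ends (never negative). */
    static int lengthAfterClipping(int readLength, int clippedStart, int clippedEnd) {
        return Math.max(0, readLength - clippedStart - clippedEnd);
    }

    /**
     * Each row of reads is {readLength, clippedStart, clippedEnd}.
     * Returns the post-clipping lengths that can still yield at least one k-mer.
     */
    static List<Integer> usableLengths(int[][] reads, int kmerSize) {
        List<Integer> usable = new ArrayList<>();
        for (int[] r : reads) {
            int remaining = lengthAfterClipping(r[0], r[1], r[2]);
            if (remaining >= kmerSize) {
                usable.add(remaining);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // Three 200-base reads; two are clipped down to a few bases.
        int[][] reads = { {200, 0, 0}, {200, 120, 75}, {200, 100, 98} };
        List<Integer> usable = usableLengths(reads, KMER_SIZE);
        if (usable.isEmpty()) {
            // Skip the region quietly instead of raising an exception.
            System.out.println("No usable reads; skipping region");
        } else {
            System.out.println("Usable read lengths: " + usable);
        }
    }
}
```

The point of the sketch is only that a 200-base read can fall below the assembly threshold once clipped, so the original read length alone does not rule out this error.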

  • andrewseverin (Posts: 13, Member)

    It would be nice if the stack trace would give me some idea as to which interval it is flipping out about.

  • andrewseverin (Posts: 13, Member)

    Any idea what this one means?

    java.lang.ArrayIndexOutOfBoundsException: -1
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.generateVCsFromAlignment(GenotypingEngine.java:675)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.assignGenotypeLikelihoods(GenotypingEngine.java:140)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:500)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:132)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:552)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:512)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:244)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:69)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:100)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------

  • andrewseverin (Posts: 13, Member)

    How do I upload?

  • Geraldine_VdAuwera (Posts: 7,528; Administrator, GATK Developer)

    Off-by-one error while taking down the variant context info. Sounds like another edge case that's not being handled properly.

    To see which interval HC is choking on, use the --debug option; it will print out a lot of info, including the interval it's processing.

    See the instructions on how to file a bug report here:
    http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • getvictor (Posts: 1, Member)

    For the off-by-one error, the following change seems to bypass the issue.
    At GenotypingEngine.java:675, replace

    final byte refByte = ref[refPos-1];
    if( BaseUtils.isRegularBase(refByte) ) {
        insertionAlleles.add( Allele.create(refByte, true) );
    }

    with:

    // Only create a ref allele if it exists and is a regular base
    if (refPos > 0) {
        final byte refByte = ref[refPos-1];
        if( BaseUtils.isRegularBase(refByte) ) {
            insertionAlleles.add( Allele.create(refByte, true) );
        }
    }
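
The guard above can be tried in isolation with the following self-contained sketch. It is not GATK code: `RefBaseGuard` and `refAlleleBefore` are hypothetical names, `isRegularBase` is a simplified stand-in for `BaseUtils.isRegularBase`, and plain bytes are returned instead of `Allele` objects.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RefBaseGuard {
    /** Simplified stand-in for BaseUtils.isRegularBase: upper-case A, C, G, T only. */
    static boolean isRegularBase(byte b) {
        return b == 'A' || b == 'C' || b == 'G' || b == 'T';
    }

    /**
     * Returns the reference base immediately before refPos, if there is one
     * and it is a regular base; otherwise returns an empty list.
     */
    static List<Byte> refAlleleBefore(byte[] ref, int refPos) {
        List<Byte> alleles = new ArrayList<>();
        // Without this check, refPos == 0 would read ref[-1] and throw
        // ArrayIndexOutOfBoundsException, as in the stack trace above.
        if (refPos > 0) {
            final byte refByte = ref[refPos - 1];
            if (isRegularBase(refByte)) {
                alleles.add(refByte);
            }
        }
        return alleles;
    }

    public static void main(String[] args) {
        byte[] ref = "ACGT".getBytes(StandardCharsets.US_ASCII);
        System.out.println("refPos 0 yields " + refAlleleBefore(ref, 0).size() + " allele(s)");
        System.out.println("refPos 2 yields " + refAlleleBefore(ref, 2).size() + " allele(s)");
    }
}
```

Note that the guard only avoids the exception by skipping the reference allele when the insertion sits at the very start of the reference array. Whether the resulting calls are correct in that case is not established in this thread.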

  • andrewseverin (Posts: 13, Member)

    Is this file, GenotypingEngine.java, in the jar/zip file, or do I need to download a separate source version of GATK?
    Also, I was using the HaplotypeCaller; is it still this file?

  • Geraldine_VdAuwera (Posts: 7,528; Administrator, GATK Developer)

    This issue has been fixed in the latest version of GATK (2.4-7).

    Geraldine Van der Auwera, PhD

  • andrewseverin (Posts: 13, Member)

    Hmm. I am using version 2.4-7-g5e89f01 and still get the error:

    ERROR MESSAGE: -1
    ERROR

    Is this the most recent version?

    I will work on getting an example for you guys. I did not have time on my last attempt, unless you tell me this is an old version of 2.4-7.

  • andrewseverin (Posts: 13, Member)

    I have uploaded the snippet of the problem with associated scripts that I was using.
    -rw-r--r-- 1 depristo wga 287606741 Mar 15 17:37 upload-minusoneError.zip

  • andrewseverin (Posts: 13, Member)

    Geraldine,
    Did you see I posted the snippet for the error above?
    Thanks
