Bug Bulletin: The GenomeLocPArser error in SplitNCigarReads has been fixed; if you encounter it, use the latest nightly build.

PairHMM Log Probability cannot be greater than 0 Error with HaplotypeCaller

LaurentLaurent Posts: 33Member, GSA Collaborator

Dear All,

I've encountered the following error while processing one of the regions from an interval file that I want to re-discover/genotype with the HC. Note that I've processed the other 4.2mln regions without any problems. A quick search on the forum did not lead to any results. Let me know if you'd like more information!

Command:

~/tools/jdk1.7.0_25/bin/java -Xmx8g \
-jar ~/tools/GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-L ~/gonl/projects/SV/ug/gonl.union_pindel_ug_clever.sites.2.vcf.gz \
-L 1:237759920-238000001 \
-nct 6 \
-isr INTERSECTION \
-o ~/results/trio-analysis/hc/1_237759920-238000001.vcf \
-R /target/gpfs2/gcc/resources/hg19/indices/human_g1k_v37.fa \
-I ~/gonl/projects/trio-analysis/resources/bqsr2.bams.list \
-XL  ~/gonl/projects/accessibleGenome/results/ALL.accessible.out.mask.intervals \
-minPruning 5 \
2>&1 | tee /target/gpfs2/gcc/home/lfrancioli/logs/trio-analysis/hc/1_237759920-238000001.out

Error:

`##### ERROR ------------------------------------------------------------------------------------------

ERROR stack trace

java.lang.IllegalStateException: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.002992 at org.broadinstitute.sting.utils.pairhmm.PairHMM.computeReadLikelihoodGivenHaplotypeLog10(PairHMM.java:131) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LikelihoodCalculationEngine.computeReadLikelihoods(LikelihoodCalculationEngine.java:262) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LikelihoodCalculationEngine.computeReadLikelihoods(LikelihoodCalculationEngine.java:205) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:770) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:140) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:273) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78) at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99) at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313) at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152) at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.002992`
ERROR ------------------------------------------------------------------------------------------

Answers

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Hi Laurent,

    Would you mind sending us a bam snippet of the problem interval so we can debug this locally?

    Geraldine Van der Auwera, PhD

  • qingfengzealotqingfengzealot Posts: 7Member

    Hi Geraldine, I also met the similar problem like Laurent mentioned. I tried both HaplotypeCaller and UnifiedGenotyper, but both reported the "PairHMM" related error for some regions (have tried both versions: 2.6-4-g3e5ff60 and 2.7-4-g6f46d11). I tried post a separate question last week, but my question didn't get approved. I don't know why. Hope you could help me. Thanks.

    Best regards,

    Qiongyi

  • qingfengzealotqingfengzealot Posts: 7Member

    Followings are error message for one example:

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IllegalStateException: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.054450 at org.broadinstitute.sting.utils.pairhmm.PairHMM.computeReadLikelihoodGivenHaplotypeLog10(PairHMM.java:131) at org.broadinstitute.sting.gatk.walkers.indels.PairHMMIndelErrorModel.computeGeneralReadHaplotypeLikelihoods(PairHMMIndelErrorModel.java:401) at org.broadinstitute.sting.gatk.walkers.indels.PairHMMIndelErrorModel.computeDiploidReadHaplotypeLikelihoods(PairHMMIndelErrorModel.java:215) at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.getLikelihoods(IndelGenotypeLikelihoodsCalculationModel.java:150) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoods(UnifiedGenotyperEngine.java:331) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoodsAndGenotypes(UnifiedGenotyperEngine.java:232) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:367) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:143) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48) at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:98) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.7-4-g6f46d11):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.054450
    ERROR ------------------------------------------------------------------------------------------
  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Hi @qingfengzealot,

    It's possible our spam blocking software mistakenly classified your post. I've verified your account so this will no longer happen to you.

    What you're experiencing looks like a bug in the PairHMM. We'll need you to upload a bam snippet so that we can debug this locally. Instructions are here: http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • qingfengzealotqingfengzealot Posts: 7Member
    edited November 2013

    Hi @Geraldine_VdAuwera

    Thanks for your reply. I tested again for "UnifiedGenotyper" (GenomeAnalysisTK-2.7-4-g6f46d11) for a specific region and the same error could be reported, but "HaplotypeCaller" seems ok (I got error when using HaplotypeCaller in GATK version 2.6-4-g3e5ff60 but 2.7-4 didn't report this error. So far so good for HaplotypeCaller). So this error only happens in "UnifiedGenotyper" in the latest version of GATK.

    I've uploaded a bam snippet to your ftp. The file name is "GATK_bug_PairHMM_error.UploadByQiongyi.tar.gz", including 5 files in it:

    1) bed file (specific region in which GATK reported error ): PairHMM_error.chr13.bed;

    2) a bam snippet file: PairHMM_error.bam;

    3) bai file for the bam snippet: PairHMM_error.bai;

    4) error_message.txt;

    5) my command to reproduce this error: command_to_reproduce_error.txt;

    Hope this could help your team to debug. Thanks very much for your help.

    Cheers,

    Qiongyi

    Post edited by qingfengzealot on
  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Thanks, I'll have a look and see if we can fix this.

    Geraldine Van der Auwera, PhD

  • bulashabulasha GermanyPosts: 6Member

    Hi, Geraldine, i have encountered the same error with HaplotypeCaller GenomeAnalysisTK-2.7-4-g6f46d11 and would like to know whether this problem was fixed.

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    I believe it has been fixed in our development version. Try running with the latest nightly build (see downloads page) and let me know if it works for you.

    Geraldine Van der Auwera, PhD

  • bulashabulasha GermanyPosts: 6Member

    It was claimed here http://www.broadinstitute.org/gatk/guide/article?id=2572 for version 2.5: "Fixed problem where our internal PairHMM was generating positive likelihoods."

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    @bulasha, there can be different issues that cause the same symptom such as PairHMM generating positive likelihoods. One such issue appeared previously and was fixed in version 2.5, then a similar issue reappeared with a different cause, and we believe we have now fixed that as well. The best way to know if our fix applies to your issue is to try the latest nightly build and see if that works. In any case, we are planning to release the next version (2.8) fairly soon.

    Geraldine Van der Auwera, PhD

  • bulashabulasha GermanyPosts: 6Member

    Hi, Geraldine, i just tried HaplotypeCaller from the latest nightly build and received the same error ERROR MESSAGE: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 65, 84, 84, 65, 65, 71, 71, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 71, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.020777

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    I see, sorry to hear that -- can you please upload test files so we can debug locally? Instructions are here: http://www.broadinstitute.org/gatk/guide/article?id=1894

    Geraldine Van der Auwera, PhD

  • puvapuva Posts: 6Member
    edited December 2013

    Dear All, Following the GATK Best Practices I also had the same PairHMM Log Probability cannot be greater than 0 error when running HaplotypeCaller v2.8-1. By progressively narrowing the interval I found the 1bp (a deletion) triggering the error. Then I checked when the error appeared on my pipeline by running HC on the BAM generated at each step, and it appears only after the Base recalibration. Interestingly, running HC on the BAM before Base recalibration the deletion triggering the error is present in the output VCF. If you are interested I can upload the test files.

    Best regards, Paolo

    Post edited by puva on
  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Hi Paolo,

    Yes, please upload some tests files so we can debug this locally. Thanks!

    Geraldine Van der Auwera, PhD

  • puvapuva Posts: 6Member

    @Geraldine_VdAuwera I just uploaded pairHmm_bug_puva.tar.gz to your FTP server. It includes a README with code to reproduce the bug and all the input/output files.

    Please let me know if something is missing.

    Paolo

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Thanks Paolo, we'll have a look at it asap and I'll get back to you when we know a little more about what's going on here.

    Geraldine Van der Auwera, PhD

  • bulashabulasha GermanyPosts: 6Member

    Dear Geraldine, I have uploaded pairHmm_bug_bulasha.tar.gz to your FTP server. Hope you will be able to reproduce the bug and understand the reason of it. Thank you in advance.

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Thanks @bulasha. I think we have what we need in @puva's test data, but it's good to have a second set just in case.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Hi everyone,

    Sorry it's taken so long, but this was a tricky one. I'm happy to say however we finally have a fix for this; it will be available in the nightly build starting tomorrow, and in the next public release. That said, we're not sure when the next release will be yet, so we're discussing internally whether to do a special bug-fix release for this. Please let us know if this bug is a showstopper for your work and if you can't use the nightly build.

    Geraldine Van der Auwera, PhD

  • puvapuva Posts: 6Member
    edited February 17

    Thanks @Geraldine_VdAuwera, I used the nightly build for running HaplotypeCaller on several datasets and it works fine, great! Waiting for the next public release, I'll continue using the nightly build for HC, and the current release (2.8-1) for the rest of the pipeline.

    I would just let you know that UG from the nightly build still has the same problem when calling indels (--genotype_likelihoods_model INDEL or BOTH), with a slightly different error message:

    PairHMM Log Probability cannot be greater than 0: haplotype: CATCTTTTTTTTTTTTTTTTTTTAGACGGAG, read: TTTTTTTTTTTTTTTTTTAGAC, result: 0.226316, PairHMM: LoglessPairHMM

    Is this a way to suggest the use of HaplotypeCaller for calling indels? Thanks for the great job. Paolo

    Post edited by puva on
  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,407Administrator, GATK Developer admin

    Hi @puva,

    Glad to hear it's working! Be aware though that we don't support mixing and matching steps from different versions -- we recommend running all tools from the same version. In this case the version differences may be small enough that you won't experience any issues, but please keep a critical eye on your results for anything that looks off.

    Is this a way to suggest the use of HaplotypeCaller for calling indels?

    Hah, it's not done on purpose if that's what you mean :)

    But the fact is that we know UG does not perform well on indels, and we are not planning to invest any more resources to improve that because we consider that HC is the future of calling indels.

    Geraldine Van der Auwera, PhD

Sign In or Register to comment.