The current GATK version is 3.2-2

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Bug Bulletin: The recent 3.2 release fixes many issues. If you run into a problem, please try the latest version before posting a bug report, as your problem may already have been solved.

# PairHMM Log Probability cannot be greater than 0 Error with HaplotypeCaller

Posts: 33Member, GSA Collaborator

Dear All,

I've encountered the following error while processing one of the regions from an interval file that I want to re-discover/genotype with the HC. Note that I've processed the other 4.2mln regions without any problems. A quick search on the forum did not lead to any results. Let me know if you'd like more information!

Command:

~/tools/jdk1.7.0_25/bin/java -Xmx8g \
-jar ~/tools/GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-L ~/gonl/projects/SV/ug/gonl.union_pindel_ug_clever.sites.2.vcf.gz \
-L 1:237759920-238000001 \
-nct 6 \
-isr INTERSECTION \
-o ~/results/trio-analysis/hc/1_237759920-238000001.vcf \
-R /target/gpfs2/gcc/resources/hg19/indices/human_g1k_v37.fa \
-I ~/gonl/projects/trio-analysis/resources/bqsr2.bams.list \
-minPruning 5 \
2>&1 | tee /target/gpfs2/gcc/home/lfrancioli/logs/trio-analysis/hc/1_237759920-238000001.out

Error:

##### ERROR ------------------------------------------------------------------------------------------

##### ERROR stack trace

java.lang.IllegalStateException: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.002992 at org.broadinstitute.sting.utils.pairhmm.PairHMM.computeReadLikelihoodGivenHaplotypeLog10(PairHMM.java:131) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LikelihoodCalculationEngine.computeReadLikelihoods(LikelihoodCalculationEngine.java:262) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LikelihoodCalculationEngine.computeReadLikelihoods(LikelihoodCalculationEngine.java:205) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:770) at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:140) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:273) at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78) at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99) at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313) at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152) at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

##### ERROR ------------------------------------------------------------------------------------------
Tagged:

Hi Laurent,

Would you mind sending us a bam snippet of the problem interval so we can debug this locally?

Geraldine Van der Auwera, PhD

• Posts: 3Member

Hi Geraldine, I also met the similar problem like Laurent mentioned. I tried both HaplotypeCaller and UnifiedGenotyper, but both reported the "PairHMM" related error for some regions (have tried both versions: 2.6-4-g3e5ff60 and 2.7-4-g6f46d11). I tried post a separate question last week, but my question didn't get approved. I don't know why. Hope you could help me. Thanks.

Best regards,

Qiongyi

• Posts: 3Member

Followings are error message for one example:

##### ERROR stack trace

java.lang.IllegalStateException: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.054450 at org.broadinstitute.sting.utils.pairhmm.PairHMM.computeReadLikelihoodGivenHaplotypeLog10(PairHMM.java:131) at org.broadinstitute.sting.gatk.walkers.indels.PairHMMIndelErrorModel.computeGeneralReadHaplotypeLikelihoods(PairHMMIndelErrorModel.java:401) at org.broadinstitute.sting.gatk.walkers.indels.PairHMMIndelErrorModel.computeDiploidReadHaplotypeLikelihoods(PairHMMIndelErrorModel.java:215) at org.broadinstitute.sting.gatk.walkers.genotyper.IndelGenotypeLikelihoodsCalculationModel.getLikelihoods(IndelGenotypeLikelihoodsCalculationModel.java:150) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoods(UnifiedGenotyperEngine.java:331) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoodsAndGenotypes(UnifiedGenotyperEngine.java:232) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:367) at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:143) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274) at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92) at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48) at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:98) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722)

##### ERROR ------------------------------------------------------------------------------------------

It's possible our spam blocking software mistakenly classified your post. I've verified your account so this will no longer happen to you.

What you're experiencing looks like a bug in the PairHMM. We'll need you to upload a bam snippet so that we can debug this locally. Instructions are here: http://www.broadinstitute.org/gatk/guide/article?id=1894

Geraldine Van der Auwera, PhD

• Posts: 3Member
edited November 2013

Thanks for your reply. I tested again for "UnifiedGenotyper" (GenomeAnalysisTK-2.7-4-g6f46d11) for a specific region and the same error could be reported, but "HaplotypeCaller" seems ok (I got error when using HaplotypeCaller in GATK version 2.6-4-g3e5ff60 but 2.7-4 didn't report this error. So far so good for HaplotypeCaller). So this error only happens in "UnifiedGenotyper" in the latest version of GATK.

I've uploaded a bam snippet to your ftp. The file name is "GATK_bug_PairHMM_error.UploadByQiongyi.tar.gz", including 5 files in it:

1) bed file (specific region in which GATK reported error ): PairHMM_error.chr13.bed;

2) a bam snippet file: PairHMM_error.bam;

3) bai file for the bam snippet: PairHMM_error.bai;

4) error_message.txt;

5) my command to reproduce this error: command_to_reproduce_error.txt;

Hope this could help your team to debug. Thanks very much for your help.

Cheers,

Qiongyi

Post edited by qingfengzealot on

Thanks, I'll have a look and see if we can fix this.

Geraldine Van der Auwera, PhD

• GermanyPosts: 6Member

Hi, Geraldine, i have encountered the same error with HaplotypeCaller GenomeAnalysisTK-2.7-4-g6f46d11 and would like to know whether this problem was fixed.

I believe it has been fixed in our development version. Try running with the latest nightly build (see downloads page) and let me know if it works for you.

Geraldine Van der Auwera, PhD

• GermanyPosts: 6Member

It was claimed here http://www.broadinstitute.org/gatk/guide/article?id=2572 for version 2.5: "Fixed problem where our internal PairHMM was generating positive likelihoods."

@bulasha, there can be different issues that cause the same symptom such as PairHMM generating positive likelihoods. One such issue appeared previously and was fixed in version 2.5, then a similar issue reappeared with a different cause, and we believe we have now fixed that as well. The best way to know if our fix applies to your issue is to try the latest nightly build and see if that works. In any case, we are planning to release the next version (2.8) fairly soon.

Geraldine Van der Auwera, PhD

• GermanyPosts: 6Member

Hi, Geraldine, i just tried HaplotypeCaller from the latest nightly build and received the same error ERROR MESSAGE: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 65, 84, 84, 65, 65, 71, 71, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 71, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.020777

I see, sorry to hear that -- can you please upload test files so we can debug locally? Instructions are here: http://www.broadinstitute.org/gatk/guide/article?id=1894

Geraldine Van der Auwera, PhD

• Posts: 6Member
edited December 2013

Dear All, Following the GATK Best Practices I also had the same PairHMM Log Probability cannot be greater than 0 error when running HaplotypeCaller v2.8-1. By progressively narrowing the interval I found the 1bp (a deletion) triggering the error. Then I checked when the error appeared on my pipeline by running HC on the BAM generated at each step, and it appears only after the Base recalibration. Interestingly, running HC on the BAM before Base recalibration the deletion triggering the error is present in the output VCF. If you are interested I can upload the test files.

Best regards, Paolo

Post edited by puva on

Hi Paolo,

Yes, please upload some tests files so we can debug this locally. Thanks!

Geraldine Van der Auwera, PhD

• Posts: 6Member

@Geraldine_VdAuwera I just uploaded pairHmm_bug_puva.tar.gz to your FTP server. It includes a README with code to reproduce the bug and all the input/output files.

Please let me know if something is missing.

Paolo

Thanks Paolo, we'll have a look at it asap and I'll get back to you when we know a little more about what's going on here.

Geraldine Van der Auwera, PhD

• GermanyPosts: 6Member

Dear Geraldine, I have uploaded pairHmm_bug_bulasha.tar.gz to your FTP server. Hope you will be able to reproduce the bug and understand the reason of it. Thank you in advance.

Thanks @bulasha. I think we have what we need in @puva's test data, but it's good to have a second set just in case.

Geraldine Van der Auwera, PhD

Hi everyone,

Sorry it's taken so long, but this was a tricky one. I'm happy to say however we finally have a fix for this; it will be available in the nightly build starting tomorrow, and in the next public release. That said, we're not sure when the next release will be yet, so we're discussing internally whether to do a special bug-fix release for this. Please let us know if this bug is a showstopper for your work and if you can't use the nightly build.

Geraldine Van der Auwera, PhD

• Posts: 6Member
edited February 17

Thanks @Geraldine_VdAuwera, I used the nightly build for running HaplotypeCaller on several datasets and it works fine, great! Waiting for the next public release, I'll continue using the nightly build for HC, and the current release (2.8-1) for the rest of the pipeline.

I would just let you know that UG from the nightly build still has the same problem when calling indels (--genotype_likelihoods_model INDEL or BOTH), with a slightly different error message:

PairHMM Log Probability cannot be greater than 0: haplotype: CATCTTTTTTTTTTTTTTTTTTTAGACGGAG, read: TTTTTTTTTTTTTTTTTTAGAC, result: 0.226316, PairHMM: LoglessPairHMM

Is this a way to suggest the use of HaplotypeCaller for calling indels? Thanks for the great job. Paolo

Post edited by puva on

Hi @puva,

Glad to hear it's working! Be aware though that we don't support mixing and matching steps from different versions -- we recommend running all tools from the same version. In this case the version differences may be small enough that you won't experience any issues, but please keep a critical eye on your results for anything that looks off.

Is this a way to suggest the use of HaplotypeCaller for calling indels?

Hah, it's not done on purpose if that's what you mean :)

But the fact is that we know UG does not perform well on indels, and we are not planning to invest any more resources to improve that because we consider that HC is the future of calling indels.

Geraldine Van der Auwera, PhD