BQSRGatherer exception

Johan_Dahlberg Posts: 85 Member ✭✭✭

When using Queue for BQSR scatter/gather parallelism I've been seeing the following:

java.lang.IllegalArgumentException: Table1 188,3 not equal to 189,3
        at org.broadinstitute.sting.utils.recalibration.RecalUtils.combineTables(RecalUtils.java:808)
        at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.combine(RecalibrationReport.java:147)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BQSRGatherer.gather(BQSRGatherer.java:86)
        at org.broadinstitute.sting.queue.function.scattergather.GathererFunction.run(GathererFunction.scala:42)
        at org.broadinstitute.sting.queue.engine.InProcessRunner.start(InProcessRunner.scala:53)
        at org.broadinstitute.sting.queue.engine.FunctionEdge.start(FunctionEdge.scala:84)
        at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:434)
        at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:156)
        at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:171)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
        at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
        at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)

I'm using GATK version v2.4-7-g5e89f01 (I can't keep up the pace with you guys). I'm wondering if this is a known issue and, if so, whether there is a workaround or a fix in later GATK versions.

Cheers, Johan

Answers

  • amberlin Broad Institute Posts: 6 Member

    Are you sure this issue has been fixed? I am running The Genome Analysis Toolkit (GATK) v2.7-2-g6bda569, compiled 2013/10/03 11:00:28, with Queue v2.7.2, but I get a similar error when I try to run BQSR with scatter/gather. I get the same error for all four of my samples.

    Thanks, Aaron

    java.lang.IllegalArgumentException: Table1 12,3 not equal to 3,3
        at org.broadinstitute.sting.utils.recalibration.RecalUtils.combineTables(RecalUtils.java:1012)
        at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.combine(RecalibrationReport.java:147)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BQSRGatherer.gather(BQSRGatherer.java:88)
        at org.broadinstitute.sting.queue.function.scattergather.GathererFunction.run(GathererFunction.scala:42)
        at org.broadinstitute.sting.queue.engine.InProcessRunner.start(InProcessRunner.scala:53)
        at org.broadinstitute.sting.queue.engine.FunctionEdge.start(FunctionEdge.scala:84)
        at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:434)
        at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:156)
        at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:171)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
        at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
        at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)

  • amberlin Broad Institute Posts: 6 Member

    As a quick update, the error appears to depend on the number of ScatterJobs: I get the error when I run with 30 jobs but not when I run with 24 jobs.

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @amberlin,

    We're going to need to debug this locally. Could you please upload the scattered files from the 30-job run?

    Geraldine Van der Auwera, PhD

  • flescai Posts: 53 Member ✭✭

    Is there any update on this error? I am seeing the same error with version Queue-2.7-4-g6f46d11. Thanks!

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @flescai, according to my records the new bug case was also fixed. Can you tell me a little more about your issue?

    Geraldine Van der Auwera, PhD

  • flescai Posts: 53 Member ✭✭

    Hi @Geraldine_VdAuwera, I was running version Queue-2.7-4-g6f46d11 on an SGE cluster and ran into this error. I looked it up in the forum and found this thread. This is a transcript of the output (for clarity, I suppressed the list of files, all .pre_recal.table):

            ERROR 04:51:43,008 FunctionEdge - Error: BQSRGatherer: List([suppressed].pre_recal.table) 
            java.lang.IllegalArgumentException: Table1 2,3 not equal to 1,3
                at org.broadinstitute.sting.utils.recalibration.RecalUtils.combineTables(RecalUtils.java:1012)
                at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.combine(RecalibrationReport.java:147)
                at org.broadinstitute.sting.gatk.walkers.bqsr.BQSRGatherer.gather(BQSRGatherer.java:88)
                at org.broadinstitute.sting.queue.function.scattergather.GathererFunction.run(GathererFunction.scala:42)
                at org.broadinstitute.sting.queue.engine.InProcessRunner.start(InProcessRunner.scala:53)
                at org.broadinstitute.sting.queue.engine.FunctionEdge.start(FunctionEdge.scala:84)
                at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:434)
                at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:156)
                at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:171)
                at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
                at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
                at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
                at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
    

    The version I'm using is the standard download, not a locally compiled one. Following some of the discussion here, I reduced the scatter/gather count to 20 by adding -sg 20 to the Queue commands, and the problem disappeared. If you need additional information, please don't hesitate to ask.
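
When comparing runs at different scatter counts, it can help to pull the mismatch reports out of the Queue logs rather than reading them by eye. The following is a minimal sketch, assuming the message format seen in the stack traces in this thread ("Table1 A,B not equal to C,D"). The function name `find_table_mismatches` is hypothetical, and the four numbers are returned as reported, without interpreting what they mean inside GATK.

```python
import re
from typing import List, Tuple

# Message format as it appears in the stack traces quoted in this thread.
MISMATCH = re.compile(
    r"IllegalArgumentException: Table1 (\d+),(\d+) not equal to (\d+),(\d+)"
)


def find_table_mismatches(log_text: str) -> List[Tuple[int, int, int, int]]:
    """Return the four numbers from every mismatch message found in log_text.

    The numbers are returned in the order they appear in the message.
    """
    return [
        tuple(int(group) for group in match.groups())
        for match in MISMATCH.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = "java.lang.IllegalArgumentException: Table1 2,3 not equal to 1,3"
    print(find_table_mismatches(sample))  # [(2, 3, 1, 3)]
```

An empty list means the log contains no message in this format; it does not by itself show that the gather step succeeded.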

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hmm, it's good to know that high scatter counts are still leading to bugs. We'll look into it. Thanks for reporting this!

    Geraldine Van der Auwera, PhD

  • jpitt Posts: 7 Member

    Hi @Geraldine_VdAuwera, I was having the same issue as @flescai with Queue 2.7.4. We use the BQSRGatherer class to do scatter-gather in our own workflow management system rather than in Queue itself. As amberlin pointed out, the error does not occur when merging 20 recalibration tables at a time, but it does occur when merging recalibration tables for all contigs in the human genome at once.

    For the heck of it I tried out the latest Queue release (2.8). Merging of the grp tables now completes successfully without throwing errors similar to those above. Looking at the GATK release notes, I saw you made the comment "historically we haven't included changes to Queue in the GATK release notes, but we agreed today in group meeting that it would be a good idea to start doing so going forward."

    Because I want to be sure, was this bug explicitly fixed in the Queue 2.8 release? If so, that's great to hear!
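
The "20 tables at a time" observation above can be sketched as a batched merge for a custom workflow system. This is only an illustration under stated assumptions: `merge` is a hypothetical callable standing in for whatever invokes the gatherer on a list of table paths and returns the path of the merged table, and the batch size of 20 is simply the value reported in this thread. Whether merging in batches and then merging the intermediate results avoids the exception in an affected version is not established here.

```python
from typing import Callable, List, Sequence


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def merge_in_batches(
    tables: Sequence[str],
    merge: Callable[[List[str]], str],  # hypothetical: merges paths, returns merged path
    batch_size: int = 20,               # value reported to work in this thread
) -> str:
    """Merge tables in rounds so no single merge sees more than batch_size inputs."""
    if not tables:
        raise ValueError("no tables to merge")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    current = list(tables)
    # Each round shrinks the list; stop when a single table remains.
    while len(current) > 1:
        current = [merge(batch) for batch in chunk(current, batch_size)]
    return current[0]
```

With 86 scattered tables and the default batch size, this performs five merges in the first round (four of 20 tables and one of 6) and one final merge of the 5 intermediate results.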

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @jpitt,

    Yes, the bug was fixed in the 2.8 release. The comment I made about explicitly listing changes to Queue in the release notes will apply starting with the next release. Hopefully it will save users such as yourself some unnecessary trial-and-error :)

    Geraldine Van der Auwera, PhD

  • myourshaw Posts: 6 Member
    edited March 14

    Not fixed in 3.0-0? I'm still getting an error like this with scatterCount = 86 on a locally compiled version of GATK/Queue (3.0-0-g8fedaf5) from gatk-protected running on GridEngine. I don't have the issue with scatterCount 86 on any module but BQSR. And if I limit the BQSR scatterCount to 20, as suggested above, I do not get an error.

    I reported this previously at gatkforums.broadinstitute.org/discussion/comment/6694/#Comment_6694

    BQSRGatherer: List(/scratch1/tmp/myourshaw/fhf_20140217/sg_dir/.qlog/scratch1/tmp/myourshaw/fhf_20140217/bams/sample_bams/FHF101.sample.recal_data.first_pass.table.baseRecalibrator-sg/temp_01_of_86/FHF101.sample.recal_data.first_pass.table, [scatter files 02-85], /scratch1/tmp/myourshaw/fhf_20140217/sg_dir/.qlog/scratch1/tmp/myourshaw/fhf_20140217/bams/sample_bams/FHF101.sample.recal_data.first_pass.table.baseRecalibrator-sg/temp_86_of_86/FHF101.sample.recal_data.first_pass.table) > List(/scratch1/tmp/myourshaw/fhf_20140217/bams/sample_bams/FHF101.sample.recal_data.first_pass.table)
    java.lang.IllegalArgumentException: Table1 2,3 not equal to 1,3
        at org.broadinstitute.sting.utils.recalibration.RecalUtils.combineTables(RecalUtils.java:1010)
        at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.combine(RecalibrationReport.java:147)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BQSRGatherer.gather(BQSRGatherer.java:88)
        at org.broadinstitute.sting.queue.function.scattergather.GathererFunction.run(GathererFunction.scala:42)
        at org.broadinstitute.sting.queue.engine.InProcessRunner.start(InProcessRunner.scala:53)
        at org.broadinstitute.sting.queue.engine.FunctionEdge.start(FunctionEdge.scala:84)
        at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:434)
        at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:156)
        at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:171)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
        at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
        at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    @myourshaw, sorry for the late response. Can you please check whether this still happens with the latest Queue jar (downloaded, not locally compiled)?

    Geraldine Van der Auwera, PhD

  • userName Posts: 11 Member

    Hi Geraldine, I have the same problem:

        ERROR 00:51:15,166 FunctionEdge - Error: BQSRGatherer: List(/work/queue/scatterGather/...
       java.lang.IllegalArgumentException: Table1 288,3 not equal to 184,3
            at org.broadinstitute.sting.utils.recalibration.RecalUtils.combineTables(RecalUtils.java:1010)
            at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.combine(RecalibrationReport.java:147)
            at org.broadinstitute.sting.gatk.walkers.bqsr.BQSRGatherer.gather(BQSRGatherer.java:88)
            at org.broadinstitute.sting.queue.function.scattergather.GathererFunction.run(GathererFunction.scala:42)
            at org.broadinstitute.sting.queue.engine.InProcessRunner.start(InProcessRunner.scala:53)
            at org.broadinstitute.sting.queue.engine.FunctionEdge.start(FunctionEdge.scala:84)
            at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:434)
            at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:156)
            at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:171)
            at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
            at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
            at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
            at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
    

    I get the same error with both a locally compiled Queue and the downloaded Queue jar. I am using the latest version, 3.1-1, in both cases. I am working with around 50 GB of data. I start running BaseRecalibrator and after around 12 hours I get the error. Thanks!

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @userName,

    Can you tell me what value of scatter you are using?

    Geraldine Van der Auwera, PhD

  • userName Posts: 11 Member
    edited March 28

    Hi Geraldine,
    First I tested with 20 and then with 8; I get the error with both.
    When I test with 1, it seems to work.

  • userName Posts: 11 Member

    Sorry @Geraldine_VdAuwera, what do you think about it?
    Could it be a problem with my batch of data?
    Thank you

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    @userName, I'm not sure what's going on here. You can check whether your data is OK by running Picard ValidateSamFile on the BAM files. I will ask the dev team if they have any ideas about what might be wrong here.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    @userName, we've had some reports of this same issue happening internally as well, so it seems to be a bug and not an issue with your data.

    Geraldine Van der Auwera, PhD

  • jpitt Posts: 7 Member

    @Geraldine_VdAuwera Any update on whether this bug is fixed, or an estimate of when it will be fixed? I keep thinking the issue is resolved, and it comes back to haunt me... :( Like everyone else, I'm sure, I really want to keep BQSR in our workflow (or to avoid using a different grp for each contig), but I'm not sure what to do at this stage. I can provide some grps that failed to merge properly if that would be helpful.

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Ah, yes, sorry for failing to update this thread, @jpitt. We put in a fix for this on April 10 (ish) so you can use a recent nightly build to bypass this issue.

    Geraldine Van der Auwera, PhD

  • jpitt Posts: 7 Member

    Hi @Geraldine_VdAuwera,

    Cool, thanks. I'll give it a shot. One question: the exception came from the BQSRGatherer class in Queue, correct? When looking at and downloading the nightly builds, I'm only seeing GATK. In a thread from last summer, you mentioned that there are no nightly builds of Queue. Are nightly builds for Queue now available, and if so, could you direct me to them? If not, is there some other way I can get my hands on this bug fix? Sorry if I'm just not looking in the right place!

    Thanks!

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Oh darn, you're right of course -- the nightlies don't include Queue, so this won't help you. Let me see what I can do -- maybe we can generate a patch that you can apply to a clone of the repo, if that's something you'd be comfortable doing.

    Geraldine Van der Auwera, PhD

  • jpitt Posts: 7 Member

    @Geraldine_VdAuwera Ah ok, I thought so. Sure, I could try applying the patch if you can provide it. Also, I presume the fix will be in the next release? Any estimate on when that would be? I'm happy to try the patch in the meantime to keep our analyses moving, but for my own sanity I'd prefer to switch back to the stable build when it's released.

    Thanks!

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @jpitt, it looks like generating a patch may be a little tricky in this case, and it may be simpler if we just issue you a snapshot of the development build in which this problem is fixed. I'll generate the package and post a link here when it's available.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer
  • jpitt Posts: 7 Member

    Hi @Geraldine_VdAuwera,

    Thanks so much! I was able to get the snapshot, and things are working well. All three of my test cases that were failing with the current version of Queue (3.1-1 I believe?) completed without incident using this snapshot. I'm definitely aware you went out of your way to provide this, so I really appreciate it! I'll be sure to let your team know if I see issues.

    Thanks again!
