cromwell error: failed to find index Success on array

Hello,

I'm running a large workflow that processes around 500 files through a scatter operation, using JES as the backend on Cromwell 28. After a large portion of the workflow has completed, Cromwell errors out with:

[07/04/2017 18:23:48.036] [cromwell-system-akka.dispatchers.engine-dispatcher-79] [akka://cromwell-system/user/SingleWorkflowRunnerActor/WorkflowManagerActor] WorkflowManagerActor Workflow 368636fd-04d6-4f3b-b472-dc9a00c9f5d0 failed (during ExecutingWorkflowState): Failed to find index Success(WdlInteger(498)) on array: Success([files omitted for brevity....])

498
wdl4s.WdlExpressionException: Failed to find index Success(WdlInteger(498)) on array:

Success([same array omitted...])

498
at wdl4s.expression.ValueEvaluator.evaluate(ValueEvaluator.scala:173)
at wdl4s.WdlExpression$.evaluate(WdlExpression.scala:84)
at wdl4s.WdlExpression.evaluate(WdlExpression.scala:165)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$DynamicDeclarationKey.evaluate(WorkflowExecutionActor.scala:774)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.processRunnableDynamicDeclaration(WorkflowExecutionActor.scala:457)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$21$$anonfun$apply$4.apply(WorkflowExecutionActor.scala:379)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$21$$anonfun$apply$4.apply(WorkflowExecutionActor.scala:370)
at scala.util.Try$.apply(Try.scala:192)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$21.apply(WorkflowExecutionActor.scala:370)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$21.apply(WorkflowExecutionActor.scala:369)
at scala.collection.immutable.List.map(List.scala:277)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.startRunnableScopes(WorkflowExecutionActor.scala:369)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.cromwell$engine$workflow$lifecycle$execution$WorkflowExecutionActor$$handleCheckRunnable(WorkflowExecutionActor.scala:339)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$3.applyOrElse(WorkflowExecutionActor.scala:80)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$3.applyOrElse(WorkflowExecutionActor.scala:79)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at akka.actor.FSM$class.processEvent(FSM.scala:663)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.akka$actor$LoggingFSM$$super$processEvent(WorkflowExecutionActor.scala:33)
at akka.actor.LoggingFSM$class.processEvent(FSM.scala:799)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.processEvent(WorkflowExecutionActor.scala:33)
at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:657)
at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:651)
at akka.actor.Actor$class.aroundReceive(Actor.scala:496)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.aroundReceive(WorkflowExecutionActor.scala:33)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

I'm not quite sure what this means. I have examined several of the tasks that produce the outputs referred to in the Success array, and those jobs appear to have completed normally.

I'd also be remiss if I didn't briefly express some gratitude for Cromwell and the support forum here - thank you for releasing / developing / supporting Cromwell!

Thanks,
Josh

Answers

  • Ruchi (Member, Broadie, Moderator, Dev)

    Hey @jweinstk, would you mind posting the WDL source file? It seems like Cromwell can't find a value in an array. How is the array that you are scattering over created? Are the values in that array being populated by parsing a TSV, or something else?

  • Sure, the WDL script I'm using is very similar to this. The array that I'm scattering over is an array of file names, which are read in using read_lines. (You may notice this does not happen in the linked WDL file; please ignore that discrepancy. I'm using Array[File] aligned_crams = read_lines(crams_file).)

  • Ruchi (Member, Broadie, Moderator, Dev)

    Hey @jweinstk

    Based on the error, it seems like somewhere in the WDL, Cromwell is looking up index 498 of an array (the 499th element, since indices start at 0) and can't find it. In the WDL you shared, I can see one place that could produce such an error, since the array being indexed might not be the same length as the array being scattered over:
    https://github.com/hall-lab/sv-pipeline/blob/master/scripts/SV_Pipeline_Full.wdl#L95

    In your WDL, can you confirm that the various arrays you are indexing are all the same length?

  • ChrisL (Cambridge, MA; Member, Broadie, Moderator, Dev)

    Side note: as of Cromwell 27, you can remove the ugly sub calls for basename; we now have a basename function!

  • @Ruchi - I can confirm that the arrays I am indexing should all be the same length.

  • ChrisL (Cambridge, MA; Member, Broadie, Moderator, Dev)

    @jweinstk There's an easy way to check whether the index is too large or whether Cromwell's making a mistake... could you confirm whether the array printed in the error (the one you've omitted above) has at least 499 elements, or fewer than 499 elements?
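
A minimal sketch of that check in Python, for anyone who would rather not count by hand. It assumes the array in the error message is printed as comma-separated entries inside `Success([...])` and that no entry contains a comma; the real log format may differ, and the function name and sample message are made up for illustration.

```python
import re

def summarize_index_error(message: str):
    """Return (requested_index, element_count) from a 'Failed to find index' message."""
    index_match = re.search(r"WdlInteger\((\d+)\)", message)
    array_match = re.search(r"on array:\s*Success\(\[(.*?)\]\)", message, re.DOTALL)
    if index_match is None or array_match is None:
        raise ValueError("message does not look like a 'Failed to find index' error")
    body = array_match.group(1).strip()
    # Assumption: entries are comma-separated and contain no commas themselves.
    elements = [entry.strip() for entry in body.split(",") if entry.strip()]
    return int(index_match.group(1)), len(elements)

# Hypothetical, shortened message (the real array holds hundreds of paths).
sample = (
    "Failed to find index Success(WdlInteger(3)) on array: "
    'Success(["gs://bucket/a.root", "gs://bucket/b.root", "gs://bucket/c.root"])'
)

index, count = summarize_index_error(sample)
print(f"index {index} requested, array has {count} elements")
if index >= count:
    # WDL arrays are zero-indexed, so index N needs at least N + 1 elements.
    print("index is out of range for the printed array")
```

If the requested index is greater than or equal to the element count, the lookup really is out of range for the array Cromwell printed, and the question becomes why that array is shorter than expected.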

  • @ChrisL, I parsed out the Success array from the log, and it has 493 elements. I've examined a couple of the shards that are missing, and I don't quite see why they are not included here. For example, shard-14 is missing from the Success array, but all shard-14 tasks appear to have completed successfully. I examined four of the missing elements / shards and all appear to have succeeded after the maximum number of preemptible tries, for what that's worth.
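
Finding the missing shards can be automated along these lines. This is only a sketch: it assumes every output path in the array contains a `shard-<n>` directory component, and the bucket, call name, and file names below are invented.

```python
import re

def missing_shards(paths, expected_count):
    """Return the sorted shard indices in range(expected_count) that appear in no path."""
    seen = set()
    for path in paths:
        # Assumption: each path has a "/shard-<n>/" component identifying its shard.
        match = re.search(r"/shard-(\d+)/", path)
        if match:
            seen.add(int(match.group(1)))
    return sorted(set(range(expected_count)) - seen)

# Hypothetical paths for a 4-way scatter in which shard 2 is absent.
paths = [
    "gs://bucket/wf/call-Task/shard-0/out.root",
    "gs://bucket/wf/call-Task/shard-1/out.root",
    "gs://bucket/wf/call-Task/shard-3/out.root",
]
print(missing_shards(paths, 4))  # [2]
```

Comparing the resulting list against the execution history of those shards (for example, how many preemptible attempts each one needed) may show whether the missing entries share a pattern.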

  • ChrisL (Cambridge, MA; Member, Broadie, Moderator, Dev)

    I believe the Success array you're looking at is either the array of CNVnator_Histogram.output_cn_hist_root outputs or the array of Extract_Reads.output_cram_index outputs, from which it's trying to get index 498.

    Could you possibly look at the output arrays from those two tasks and tell me how many items there are in each output?
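
One way to get those counts, sketched in Python under some assumptions: that you have the workflow metadata available as JSON, and that it has a top-level `calls` map from call name to a list of attempts carrying `shardIndex` and `executionStatus` fields. The workflow name `wf` and the sample values are hypothetical; only the two task names come from this thread.

```python
from collections import defaultdict

def done_shards_per_call(metadata):
    """Map each call name to the sorted shard indices with at least one 'Done' attempt."""
    done = defaultdict(set)
    for call_name, attempts in metadata.get("calls", {}).items():
        shards = done[call_name]  # create the entry even if nothing finished
        for attempt in attempts:
            # Assumption: a finished attempt is reported with executionStatus "Done".
            if attempt.get("executionStatus") == "Done":
                shards.add(attempt.get("shardIndex", -1))
    return {name: sorted(shards) for name, shards in done.items()}

# Hypothetical metadata: shard 1 of Extract_Reads succeeds only on its second attempt.
metadata = {
    "calls": {
        "wf.Extract_Reads": [
            {"shardIndex": 0, "attempt": 1, "executionStatus": "Done"},
            {"shardIndex": 1, "attempt": 1, "executionStatus": "Preempted"},
            {"shardIndex": 1, "attempt": 2, "executionStatus": "Done"},
        ],
        "wf.CNVnator_Histogram": [
            {"shardIndex": 0, "attempt": 1, "executionStatus": "Done"},
            {"shardIndex": 1, "attempt": 1, "executionStatus": "Failed"},
        ],
    }
}

for name, shards in sorted(done_shards_per_call(metadata).items()):
    print(name, len(shards))
```

If the two calls report different numbers of finished shards, or fewer than the length of the scattered array, that narrows down which output array came up short.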

  • ChrisL (Cambridge, MA; Member, Broadie, Moderator, Dev)

    So, I just tried to recreate this with the following WDL (and varying the range as high as 1000):

    workflow arrays_are_good {
        Int range = 1000
    
        call mk_ab { input: range = range }
        call mk_c { input: range = range }
    
        scatter(i in range(length(mk_ab.a))) {
            call use_abc { input: a = mk_ab.a[i], b = mk_ab.b[i], c = mk_c.c[i]}
        }
    }
    
    task mk_ab {
        Int range
    
        command {
          for i in `seq 1 ${range}`
          do
            echo -n "a" > a_$i.txt
            echo -n "b" > b_$i.txt
          done
        }
        output {
            Array[File] a = glob("a*")
            Array[File] b = glob("b*")
        }
    }
    
    task mk_c {
        Int range
    
        command {
          for i in `seq 1 ${range}`
          do
            echo -n "c" > c_$i.txt
          done
        }
        output {
            Array[File] c = glob("c*")
        }
    }
    
    task use_abc {
        File a
        File b
        File c
        command {
            cat ${a} ${b} ${c}
        }
        output {
            String o = read_string(stdout())
        }
    }
    

    On Local and JES backends, I never saw the problem you mentioned.

    I did, however, manage to recreate your error message when I changed the mk_c task to use seq 2 ${range}, so that there were fewer c files than a or b files. I'd certainly double-check that the outputs I mentioned are the same size as you're expecting.
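
The arithmetic behind that reproduction can be shown with plain Python lists standing in for the WDL arrays. This is an illustration only, not how Cromwell evaluates a scatter; the helper name is made up, and the file names mirror the test WDL above.

```python
def first_bad_index(scatter_length, indexed_arrays):
    """Return (name, index) of the first lookup that would fail, or None if all succeed."""
    for i in range(scatter_length):
        for name, values in indexed_arrays.items():
            if i >= len(values):
                return name, i
    return None

a = [f"a_{i}.txt" for i in range(1, 1001)]  # seq 1 1000 -> 1000 files
b = [f"b_{i}.txt" for i in range(1, 1001)]  # seq 1 1000 -> 1000 files
c = [f"c_{i}.txt" for i in range(2, 1001)]  # seq 2 1000 -> 999 files

# The scatter runs over range(length(a)), so i goes from 0 to 999.
print(first_bad_index(len(a), {"mk_ab.b": b, "mk_c.c": c}))  # ('mk_c.c', 999)
```

With 1000 scatter indices and only 999 entries in `c`, every lookup succeeds until the last index, which is why the failure only shows up late in the scatter.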

  • ChrisL (Cambridge, MA; Member, Broadie, Moderator, Dev)

    Hey @jweinstk, I just wanted to give you a heads-up that I think I've managed to track down the root cause of your problem. I've raised this as a bug here: https://github.com/broadinstitute/cromwell/issues/2455 - I hope to get a fix for this out ASAP and will follow up here when I do.

    In the meantime, thank you so much for raising the issue!

  • @ChrisL Great - thank you (and the Cromwell team) so much for looking into it!! Looking forward to using the new 28 release.