
Is this error caused by a job submission failure?

mmah Member, Broadie

I am encountering an error with Cromwell v26 running on LSF and SLURM backends in standalone mode. This error is not consistently reproducible; I believe it may be related to starting too many jobs too quickly during a scatter operation and hitting job submission failures. I plan to address this with the concurrent job limit configuration (see the sketch below), but I am also looking for information on whether there are other possible causes.
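
For reference, this is the kind of change I have in mind: Cromwell's config-based (shared filesystem) backends accept a concurrent-job-limit setting that caps how many jobs the backend will run at once, which should keep a large scatter from flooding the scheduler with submissions. The snippet below is only a minimal sketch; the provider name, limit value, and LSF submit/kill/check-alive commands are illustrative placeholders, not my actual configuration.

    # Minimal sketch of a Cromwell backend stanza with a job cap.
    # Provider name, limit value, and LSF commands are illustrative.
    backend {
      default = "LSF"
      providers {
        LSF {
          actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
          config {
            # Cap the number of jobs Cromwell will start at once on this
            # backend, so a wide scatter does not overwhelm the scheduler.
            concurrent-job-limit = 50

            runtime-attributes = ""
            submit = "bsub -J ${job_name} -cwd ${cwd} -o ${out} -e ${err} /bin/bash ${script}"
            kill = "bkill ${job_id}"
            check-alive = "bjobs ${job_id}"
            job-id-regex = "Job <(\\d+)>.*"
          }
        }
      }
    }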

This is from Cromwell's standard output:

[ERROR] [05/12/2017 11:52:56.464] [cromwell-system-akka.dispatchers.backend-dispatcher-391] [akka://cromwell-system/user/SingleWorkflowRunnerActor/WorkflowManagerActor/WorkflowActor-6f436d30-39ab-454d-8a98-47f007976161/WorkflowExecutionActor-6f436d30-39ab-454d-8a98-47f007976161/6f436d30-39ab-454d-8a98-47f007976161-EngineJobExecutionActor-ancientDNA_screen.process_sample_hs37d5:58:1/6f436d30-39ab-454d-8a98-47f007976161-BackendJobExecutionActor-6f436d30:ancientDNA_screen.process_sample_hs37d5:58:1/DispatchedConfigAsyncJobExecutionActor] DispatchedConfigAsyncJobExecutionActor [UUID(6f436d30)ancientDNA_screen.process_sample_hs37d5:58:1]: Error attempting to Execute
java.lang.NullPointerException
    at cromwell.backend.standard.StandardAsyncExecutionActor$class.ec(StandardAsyncExecutionActor.scala:695)
    at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.ec(ConfigAsyncJobExecutionActor.scala:121)
    at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.ec(ConfigAsyncJobExecutionActor.scala:121)
    at cromwell.backend.standard.StandardAsyncExecutionActor$class.tellKvJobId(StandardAsyncExecutionActor.scala:682)
    at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.tellKvJobId(ConfigAsyncJobExecutionActor.scala:121)
    at cromwell.backend.standard.StandardAsyncExecutionActor$class.cromwell$backend$standard$StandardAsyncExecutionActor$$executeOrRecoverSuccess(StandardAsyncExecutionActor.scala:532)
    at cromwell.backend.standard.StandardAsyncExecutionActor$$anonfun$executeOrRecover$2.apply(StandardAsyncExecutionActor.scala:521)
    at cromwell.backend.standard.StandardAsyncExecutionActor$$anonfun$executeOrRecover$2.apply(StandardAsyncExecutionActor.scala:521)
    at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
    at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Answers

  • jgentry Member, Broadie, Dev

    Hi @mmah - we've definitely seen that error before. Interestingly, it happens due to a situation that the creators of the library we're using there say is impossible, but clearly it is not :)

    I don't remember whether this was ever tracked down and resolved; I'll investigate. If not, I'll open an issue or attach this to an existing one as appropriate.
