cnv_somatic_panel_workflow.wdl: Status change from - to WaitingForReturnCodeFile is hanging

I am trying to run the cnv_somatic_panel_workflow.wdl script that was downloaded with GATK 4.0.6.0, used as is, locally on OSX 10.13.6 to process 11 gene panel BAM files.
However, after a few seconds the workflow hangs because Cromwell 33.1 is waiting for a ReturnCodeFile:

return exit code

exit $rc
[2018-07-16 06:30:17,05] [info] BackgroundConfigAsyncJobExecutionActor [cc5ccad2CNVSomaticPanelWorkflow.PreprocessIntervals:NA:1]: job id: 8741
[2018-07-16 06:30:17,06] [info] BackgroundConfigAsyncJobExecutionActor [cc5ccad2CNVSomaticPanelWorkflow.PreprocessIntervals:NA:1]: Status change from - to WaitingForReturnCodeFile

What could be the reason for this?

Answers

  • Ruchi Member, Broadie, Moderator, Dev admin

    Hey @fholst, a few questions:

    • Is your Cromwell running inside a docker?
    • Are you running a server?
    • Is there any metadata collected for this workflow?
    • Would you mind running an ls on the task directory for call-PreprocessIntervals?

    Thanks!

  • fholst Member
    • Cromwell is running locally
    • In myWorkflow_inputs I define the docker image for GATK: broadinstitute/gatk:4.0.6.0 including a mounted volume with all data files
    • There is a cromwell-workflow-log
    • ls for call-PreprocessIntervals: execution inputs

    Thanks,

    Fred

  • fholst Member

    ls on the execution folder:

    docker_cid script.kill stderr.kill
    rc script.submit stdout.background
    script stderr stdout.kill
    script.background stderr.background

  • fholst Member

    Please let me know if you need any further information to solve this problem.
    Thanks,
    Fred

  • ChrisL (Cambridge, MA) Member, Broadie, Moderator, Dev

    The Status change from - to WaitingForReturnCodeFile is a pretty normal thing to happen - it means that Cromwell has started running the task and is waiting for the rc file which is produced when the task completes.

    Are you seeing anything to suggest that the task has already completed and that Cromwell is not noticing it?
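    As a rough illustration of the mechanism described above (not Cromwell's actual implementation), the wait can be pictured as polling the task's execution directory until a file named rc appears and then reading the exit code from it. The function name wait_for_rc, the polling interval, and the timeout are hypothetical and chosen only for this sketch.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_rc(execution_dir: str,
                timeout_s: float = 10.0,
                poll_s: float = 0.5) -> Optional[int]:
    """Poll execution_dir for an 'rc' file.

    Returns the exit code stored in the file, or None if the file
    has not appeared (or is still empty) when the timeout expires.
    """
    rc_path = Path(execution_dir) / "rc"
    deadline = time.monotonic() + timeout_s
    while True:
        if rc_path.is_file():
            text = rc_path.read_text().strip()
            if text:  # the file may exist briefly before it is written
                return int(text)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

    Pointing wait_for_rc at the call-PreprocessIntervals execution folder is one way to check by hand whether the task has finished: a None result means the rc file has not been written yet, so the task is presumably still running or has stalled.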

  • fholst Member

    What I see is that the workflow stays at Status change from - to WaitingForReturnCodeFile for half a day or longer unless I terminate it.
    The exception was the first time I ran the workflow, when I got this message about one hour after the Status change from - to WaitingForReturnCodeFile:

    [2018-07-15 13:15:43,83] [info] Message [cromwell.docker.DockerHashActor$DockerHashSuccessResponse] from Actor[akka://cromwell-system/user/HealthMonitorDockerHashActor#1523056329] to Actor[akka://cromwell-system/deadLetters] was not delivered. [1] dead letters encountered, no more dead letters will be logged. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

  • fholst Member

    I don’t know if it just needs "a lot of time" to process 11 gene panel seq bam files that are in a mounted local volume.

  • fholst Member

    Now again:

    exit $rc
    [2018-07-17 20:19:37,84] [info] BackgroundConfigAsyncJobExecutionActor [0b1c68d1CNVSomaticPanelWorkflow.PreprocessIntervals:NA:1]: job id: 24360
    [2018-07-17 20:19:37,85] [info] BackgroundConfigAsyncJobExecutionActor [0b1c68d1CNVSomaticPanelWorkflow.PreprocessIntervals:NA:1]: Status change from - to WaitingForReturnCodeFile
    [2018-07-17 21:09:50,71] [info] Message [cromwell.docker.DockerHashActor$DockerHashSuccessResponse] from Actor[akka://cromwell-system/user/HealthMonitorDockerHashActor#1826475188] to Actor[akka://cromwell-system/deadLetters] was not delivered. [1] dead letters encountered, no more dead letters will be logged. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

  • Ruchi Member, Broadie, Moderator, Dev admin

    Hey @fholst ,

    When you see that it's "WaitingForReturnCodeFile" for too long, can you confirm that your expected outputs are created in the workflow directory? Are you saying that the expected outputs and return code file are generated long before polling completes?

    You may want to disable call caching because it's possible that hashing the outputs is causing a delay. Just to test this theory, you may want to run this workflow with caching disabled to see if it completes faster.
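    One way to try the suggestion above for a single run is a workflow options file. The sketch below writes such a file; it assumes that write_to_cache and read_from_cache are the per-workflow call-caching options understood by the Cromwell version in use, and the file name no_cache_options.json and the helper write_options are hypothetical.

```python
import json
from pathlib import Path

# Assumption: these keys are Cromwell's per-workflow call-caching options.
NO_CACHE_OPTIONS = {
    "write_to_cache": False,   # do not store results of this run in the cache
    "read_from_cache": False,  # do not reuse cached results for this run
}


def write_options(path: str) -> str:
    """Write the options as JSON and return the path that was written."""
    Path(path).write_text(json.dumps(NO_CACHE_OPTIONS, indent=2) + "\n")
    return path


if __name__ == "__main__":
    print(write_options("no_cache_options.json"))
```

    The resulting file would then be passed to the run with the options flag (-o) next to the inputs file. If call caching was never enabled in the Cromwell configuration, these options should make no difference.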

  • fholst Member

    Hi Ruchi,

    The only outputs I can see being created are the files in the "execution" subfolder of the local cromwell-executions folder. These files are:

    • docker_cid
    • script
    • script.background
    • script.submit
    • stderr.background
    • stdout.background

    As far as I know, call caching is disabled by default in Cromwell.

    Thanks,

    Fred
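    Since the listing above contains no rc file but does contain stderr.background, one possible next step is to look at what the background script has written so far. The following is a minimal sketch under that assumption; summarize_execution_dir and its tail_lines parameter are hypothetical names, and the path to the execution folder has to be filled in by hand.

```python
from pathlib import Path
from typing import Dict, List, Union

Report = Dict[str, Union[List[str], bool, Dict[str, List[str]]]]


def summarize_execution_dir(execution_dir: str, tail_lines: int = 20) -> Report:
    """List the files in an execution folder, note whether 'rc' exists,
    and collect the last lines of any stderr files that are present."""
    d = Path(execution_dir)
    present = sorted(p.name for p in d.iterdir() if p.is_file())
    tails: Dict[str, List[str]] = {}
    for name in ("stderr", "stderr.background"):
        path = d / name
        if path.is_file():
            lines = path.read_text(errors="replace").splitlines()
            tails[name] = lines[-tail_lines:]
    return {"files": present, "has_rc": "rc" in present, "stderr_tail": tails}
```

    A report with has_rc set to False and an empty stderr tail would be consistent with a task that is still running, or with a container that never started doing work; it does not by itself say which.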
