Long wait time between workflow finishing and getting final workflow status

indraniel (USA, Member)

I've been running the gatk4-germline-snps-indels workflow on Google Cloud with Cromwell. For sizeable sample input sets, I've noticed that once a workflow finishes (whether it succeeds or fails), its status stays in the "Running" state for a few hours before it finally updates to a succeeded or failed status. I get the overall workflow status with the following curl command:

ID="ddbbade2-580b-4a6d-95fb-aedaf2780d35"
curl --silent -X GET --header "Accept: application/json" "http://localhost:8000/api/workflows/v1/${ID}/status" | python -m json.tool

{
    "id": "ddbbade2-580b-4a6d-95fb-aedaf2780d35",
    "status": "Running"
}

However when I inspect the relevant metadata for the workflow:

curl -X GET "http://localhost:8000/api/workflows/v2/${ID}/metadata?expandSubWorkflows=false" -H "accept: application/json" | python -m json.tool

I see that all the relevant meta json attributes for the workflow are filled up.

Why does it take so long until the final workflow status is updated?

In failure cases, can I resubmit a new workflow before the workflow status finally updates?
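For reference, the status check above can be wrapped in a small polling loop. This is only a minimal sketch: it assumes the same local server and status endpoint as the curl command, and it assumes that "Succeeded", "Failed" and "Aborted" are the only terminal states. The function names and parameters (`fetch_status`, `wait_for_terminal_status`, `interval_s`, `max_polls`) are made up for illustration and are not part of Cromwell.

```python
import json
import time
import urllib.request

# Assumption: these are the terminal workflow states; anything else
# ("Submitted", "Running", "Aborting", ...) is treated as still in progress.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Aborted"}


def fetch_status(workflow_id, base_url="http://localhost:8000"):
    """Return the "status" field from the same endpoint the curl command hits."""
    url = f"{base_url}/api/workflows/v1/{workflow_id}/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["status"]


def wait_for_terminal_status(workflow_id, fetch=fetch_status,
                             interval_s=60, max_polls=None, sleep=time.sleep):
    """Poll until a terminal status is seen, or until max_polls is reached.

    Returns the last status observed, which may still be non-terminal
    if max_polls ran out first.
    """
    polls = 0
    while True:
        status = fetch(workflow_id)
        polls += 1
        if status in TERMINAL_STATUSES:
            return status
        if max_polls is not None and polls >= max_polls:
            return status
        sleep(interval_s)
```

Calling `wait_for_terminal_status(ID)` with the workflow ID from above would block until the server reports a terminal state. The `fetch` and `sleep` arguments are there only so the loop can be exercised without a running server.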

Answers

  • indraniel (USA, Member)

    Could these long wait times be related to the Cromwell configuration variables?

    I'm looking at the following configuration variables in particular:

    At the moment I'm using the default values from the example Cromwell configuration file.

  • ChrisL (Cambridge, MA; Member, Broadie, Moderator, Dev admin)

    Hi @indraniel, what do you mean by "the relevant meta json attributes for the workflow are filled up"? Cromwell fills in metadata as it goes, so the fact that some fields are filled in doesn't imply that the workflow is complete. Is it possible that Cromwell has completed some of the tasks but others are still running? That would be my first suspicion, since this starts to happen as the input sizes increase.
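    One rough way to check this from the metadata itself is to tally the per-call execution status. The sketch below assumes the metadata JSON has a top-level "calls" object mapping each call name to a list of attempts/shards, each carrying an "executionStatus" field. The helper names and the sample call names in the usage note are hypothetical.

```python
from collections import Counter


def summarize_call_statuses(metadata):
    """Count how many call attempts/shards are in each executionStatus."""
    counts = Counter()
    for attempts in metadata.get("calls", {}).values():
        for attempt in attempts:
            counts[attempt.get("executionStatus", "Unknown")] += 1
    return counts


def calls_with_status(metadata, status="Running"):
    """Return (call_name, shard_index) pairs whose executionStatus matches."""
    matches = []
    for name, attempts in metadata.get("calls", {}).items():
        for attempt in attempts:
            if attempt.get("executionStatus") == status:
                matches.append((name, attempt.get("shardIndex")))
    return matches
```

    For example, after saving the metadata response to a file and loading it with `json.load`, `summarize_call_statuses(metadata)` gives a quick count per status, and `calls_with_status(metadata, "Running")` lists any calls (hypothetically, something like `("wf.HaplotypeCaller", 1)`) that Cromwell still considers in progress. With `expandSubWorkflows=false`, a subworkflow shows up as a single call, so its inner tasks are not broken out.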

    Since you're in server mode, you can check the current state of the workflow using the timing diagrams (http://localhost:8000/api/workflows/v1/${ID}/timing).

    If that's not the case, could you post the full configuration file (after removing any sensitive info!!) in case we can spot something obvious in there?
