Call caching not working reliably.

I ran a multi-task workflow on a sample pair yesterday. The workflow failed due to the Cromwell 34 upgrade; several of its tasks, however, had succeeded before the upgrade-induced failure. Today I relaunched the workflow, expecting call caching hits for ALL of the tasks that had completed successfully. That has not been the case: one task got a call caching hit, but most have not (the workflow is still running, but all but one of the completed tasks are reporting cache misses). An important use case of call caching is being able to re-run a failed workflow and have FireCloud effectively pick up from the point of failure. That use case appears to be broken. Does call caching not carry over across Cromwell upgrades?
Best Answers
-
thibault Broad Institute admin
We have found the problem. Cromwell 34 will not recognize call caches from previous Cromwells, due to a misconfiguration on our end. We hope to release a fix this afternoon.
Thank you for your patience.
-
Tiffany_at_Broad Cambridge, MA admin
@birger @lelagina @ruslanafrazer - we released a hotfix for this last night. Please let us know if you are seeing any more issues.
-
lelagina ✭
Hello Tiffany,
I apologize but I think these last two issues are related to our docker image being updated.
Sorry about that.
Answers
Call caching should carry over across Cromwell upgrades, though there are certain situations where a task may get a cache miss. It can be as simple as needing to wait for the status to update (sometimes the UI displays "miss" until a hit is found) or something much more complex.
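In the meantime, if you want to see what the caching layer actually recorded for each call, the workflow metadata includes a callCaching section per call. Below is a minimal sketch of reading it, assuming direct access to a Cromwell metadata endpoint; the host and workflow ID are placeholders, and in FireCloud the same metadata is surfaced through the workflow details view/API rather than by talking to Cromwell directly.

import requests

# Placeholder Cromwell host and workflow ID; in FireCloud the equivalent
# metadata is available from the workflow details view/API instead.
CROMWELL = "https://cromwell.example.org"
WORKFLOW_ID = "00000000-0000-0000-0000-000000000000"

# Request only the keys we care about to keep the response small.
resp = requests.get(
    f"{CROMWELL}/api/workflows/v1/{WORKFLOW_ID}/metadata",
    params={"includeKey": ["callCaching", "executionStatus"]},
)
resp.raise_for_status()
metadata = resp.json()

# Each call (task attempt) reports whether it reused a cached result.
for call_name, attempts in metadata.get("calls", {}).items():
    for attempt in attempts:
        caching = attempt.get("callCaching", {})
        print(call_name,
              attempt.get("executionStatus"),
              caching.get("hit"),     # True when the call was a cache hit
              caching.get("result"))  # e.g. "Cache Hit: <source call>"

The result field also names the previous call whose outputs were reused, which makes it easier to tell a genuine miss from a status that simply hasn't updated yet.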
Could you share your workspace with [email protected] and give any submission and workflow IDs needed to take a look at these cache hits/misses?
I've been seeing the same issue with my own workflows.
I started running the exact same workflow that I ran 3 days ago and expected all tasks to get cache hits; however, only one did. Re-running the whole workflow is a very long and quite expensive process.
I shared the workspace cloud-resource-miscellaneous/CBB_20180720_MOAP_TCGA_LUAD_ControlledAccess_V1-0_DATA with [email protected]. Note that this workspace is in the TCGA-dbGaP-authorized authorization domain.
Hello Firecloud Team,
I also tried to re-run a workflow that was interrupted last night on the same sample set, and out of the 30 samples that completed successfully before the interruption, only 2 got call caching hits.
Thank you.
I've looped in some folks to explore this more. Unfortunately we do not have access to your workspace @birger, due to the authorization domain.
@lelagina and @ruslanafrazer if either of you would be able to share your workspaces, that could help us diagnose why most results seem to not be call-caching.
We have found the problem. Cromwell 34 will not recognize call caches from previous Cromwells, due to a misconfiguration on our end. We hope to release a fix this afternoon.
Thank you for your patience.
thanks @thibault!
Thank you!
@birger @lelagina @ruslanafrazer - we released a hotfix for this last night. Please let us know if you are seeing any more issues.
I am seeing new issues, but it is unclear whether they are related to the hotfix. I ran the same workflow on 7 sample pairs this morning, and they all failed due to problems pulling the docker image from Google Container Registry (GCR). I'll start a new discussion thread for that issue, though.
Call caching is still not working reliably. I ran a workflow first thing this morning, and a particular task (CallSomaticMutations_Prepare_Task) completed successfully with a cache hit. I then reran the same workflow on the same entity later in the day, and the task reported a cache miss. I'm seeing similar behavior for several of the tasks in this workflow.
@birger I've notified the team and will get back to you with more information.
Hello Tiffany,
I apologize but I think these last two issues are related to our docker image being updated.
Sorry about that.
Please ignore the above...the docker image was updated without a tag change.
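For anyone hitting this later: my understanding is that this behavior is expected, since the docker image is part of the call cache hash, so re-pushing an image under the same tag invalidates the cache for every task that uses it, even if nothing else changed. One way to avoid that is to reference images by digest rather than by tag in the WDL runtime block. A rough sketch of resolving a GCR tag to its digest, assuming gcloud is installed and authenticated (the image name is illustrative):

import subprocess

# Illustrative image reference; substitute your own GCR repository and tag.
IMAGE = "gcr.io/my-project/my-tool:latest"

# Resolve the tag to its current content digest (sha256:...).
digest = subprocess.run(
    ["gcloud", "container", "images", "describe", IMAGE,
     "--format=value(image_summary.digest)"],
    check=True, capture_output=True, text=True,
).stdout.strip()

# Pin the image by digest in the WDL runtime block, e.g.
#   runtime { docker: "gcr.io/my-project/my-tool@sha256:..." }
# so a re-pushed tag can neither change what a task runs nor
# invalidate call caching for tasks whose inputs did not change.
pinned = IMAGE.rsplit(":", 1)[0] + "@" + digest
print(pinned)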