
Consistent 503 error for engine functions

I am using Cromwell 28.2 in production. When I submit a large number of jobs, roughly a quarter to a third of them fail with a 503 Service Unavailable error when calling the size engine function. The jobs run on Google with the JES backend. I have set the number of retries on API timeout to 5, but I am not observing any retries for the size function. Instead, the entire workflow fails immediately.

From what I can tell, this version of Cromwell does not retry when an engine function receives a timeout or an error from the Google API. Is this fixed in later versions of Cromwell? Is there a config option that can fix it?
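
To make the question concrete, here is a rough sketch of the retry behaviour I expected around the size call. It is plain Python, not Cromwell's actual code: `ServiceUnavailable`, `with_retries`, and the backoff values are hypothetical names and numbers I made up for illustration, and only the limit of 5 retries comes from my setup.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for an HTTP 503 returned by the storage API."""


def with_retries(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on a 503, retry up to max_retries times with exponential backoff.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    Once the retries are exhausted, the last error is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ServiceUnavailable:
            if attempt == max_retries:
                raise  # out of retries: let the failure propagate
            sleep(base_delay * 2 ** attempt)
```

With something like this in place, a transient 503 on a single size lookup would cost a few seconds of waiting rather than failing the whole workflow, which is what I am seeing now.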

