Scatter parallelism on GCP

Hello, newbie question here: how does Cromwell achieve parallelism across multiple nodes on the Google Cloud Platform? Our workflow includes a group of tasks that we know would benefit from being parallelized. We'd like to divide this work among multiple nodes and were wondering how scatter works behind the scenes.

How do you specify the CPU and/or GPU for these nodes?
How does Cromwell figure out how many nodes should be started, and how does it know when to shut them down?
Can Cromwell be tied in with Kubernetes?
Does the gather step retrieve all the output files from the scattered nodes and copy them to the node running the Cromwell server?
What exactly does Cromwell use to determine that a scatter node has finished its processing?

We're just getting started with Cromwell, so apologies for the beginner questions and thanks in advance!

  • Ed
