Update: July 26, 2019
This section of the forum is now closed; we are working on a new support model for WDL that we will share here shortly. For Cromwell-specific issues, see the Cromwell docs and post questions on Github.

How to use and control scatter-gather with Cromwell on a single VM

Dear GATK team and other users,

I am trying to optimise my parameters on a single VM (24 cores) for the data pre-processing workflow. I am using the WDL script provided on the GATK GitHub page. My input consists of multiple uBAM files, and the WDL script uses scatter-gather to process them in parallel.

I am confused. A paper I read said that scatter-gather implements cluster-level parallel computing, which would mean I need multiple nodes, each independently processing its share of the work at the same time.

I only have one VM with 24 cores. Does that mean I can only do multi-threaded parallel computing, or can I keep the scatter-gather in my script and still benefit from it? If so, how?


  • Additional info about my VM's CPU: the model is Intel(R) Xeon(R) CPU E5-2680 v3, with hyper-threading.
    Does that mean I have 2 physical processors with 12 cores each? Could I benefit from scatter-gather with this setup?
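
To make the question concrete, here is a minimal Python sketch of what I understand scatter-gather to mean on a single machine: each input is handled by an independent task, only a bounded number of tasks run at once, and the results are merged at the end. This is not Cromwell code. The names `process_shard`, `gather`, `scatter_gather`, and `cores_per_shard` are hypothetical, the per-shard work is a placeholder, and threads stand in for the separate processes a real workflow engine would launch.

```python
from concurrent.futures import ThreadPoolExecutor


def process_shard(ubam_path):
    """Hypothetical stand-in for one scattered task (processing one uBAM)."""
    # A real task would run a tool here; this only derives an output name.
    return ubam_path.replace(".unmapped.bam", ".processed.bam")


def gather(outputs):
    """Hypothetical gather step: combine per-shard outputs into one result."""
    return ",".join(sorted(outputs))


def scatter_gather(ubam_paths, total_cores=24, cores_per_shard=4):
    """Run one task per input, limiting how many run at the same time.

    Assumption: each shard needs `cores_per_shard` cores, so at most
    total_cores // cores_per_shard shards should run concurrently.
    """
    max_concurrent = max(1, total_cores // cores_per_shard)
    # Scatter: one independent task per input, bounded by the pool size.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        outputs = list(pool.map(process_shard, ubam_paths))
    # Gather: merge the shard outputs once all of them have finished.
    return max_concurrent, gather(outputs)


if __name__ == "__main__":
    inputs = ["sampleA.unmapped.bam", "sampleB.unmapped.bam"]
    print(scatter_gather(inputs))
```

If this picture is right, the pattern itself does not require multiple nodes; what matters on one VM is how many shards run concurrently and how many cores each shard is given, which is the part I would like to control.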
