
How to use and control scatter-gather with Cromwell on a single VM

Dear GATK team and other users,

I am trying to tune the parameters of the data pre-processing workflow on a single VM (24 cores). I am using the WDL script provided on the GATK GitHub page. My input consists of multiple uBAM files, and the WDL script uses scatter-gather to process them in parallel.

I am confused. A paper I read said that scatter-gather implements cluster-level parallel computing, which would mean I need multiple nodes, with each node independently processing its share of the work at the same time.

I only have one VM with 24 cores. Does that mean I am limited to multi-threaded parallelism, or can I keep scatter-gather in my script and still benefit from it? If so, how?
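
To make the question concrete, this is roughly how I picture scatter-gather on a single machine. It is a Python sketch of the idea only, not WDL and not a description of what Cromwell actually does. The names `process_shard`, `scatter_gather`, `jobs_that_fit` and `max_concurrent_jobs` are hypothetical, the thread pool only stands in for separately launched jobs, and 4 threads per job is an assumed figure.

```python
from concurrent.futures import ThreadPoolExecutor


def jobs_that_fit(total_cores, threads_per_job):
    """How many jobs can run side by side without oversubscribing the cores."""
    return max(1, total_cores // threads_per_job)


def process_shard(ubam_name):
    """Stand-in for one scattered task (e.g. aligning a single uBAM file)."""
    return ubam_name.replace(".unmapped.bam", ".aligned.bam")


def scatter_gather(ubam_names, max_concurrent_jobs):
    """Scatter: one job per input file, at most max_concurrent_jobs at a time.
    Gather: collect the results in the same order as the inputs."""
    with ThreadPoolExecutor(max_workers=max_concurrent_jobs) as pool:
        return list(pool.map(process_shard, ubam_names))


# Assumed figures: 24 cores on the VM, 4 threads requested by each job.
max_concurrent_jobs = jobs_that_fit(total_cores=24, threads_per_job=4)

ubams = ["rg1.unmapped.bam", "rg2.unmapped.bam", "rg3.unmapped.bam"]
aligned = scatter_gather(ubams, max_concurrent_jobs)
```

In this picture the scattered jobs run as concurrent processes on the same machine instead of on separate nodes, and the number running at once is capped by the available cores divided by the threads each job uses.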


  • Additional info about my VM's CPU: the model is Intel(R) Xeon(R) CPU E5-2680 v3, with hyper-threading.
    Does that mean I have 2 physical processors with 12 cores each? Could I benefit from scatter-gather with this setup?
