Scatter Gather and Spark together

dbecker, Munich, Member ✭✭✭


I can't find any recommendations on how to use scatter-gather and Spark together.
We do panel diagnostics and whole exomes, so a run may contain up to 60 samples. My first idea was to use scatter-gather to analyse a few samples at the same time. However, our server crashed with a concurrent-job-limit > 3: we ran out of RAM and all cores were at 100%. Since I plan to use Spark in the future, I would like to know

  • if it is a good idea to use scatter-gather and Spark together,
  • how they work with each other,
  • and if there are recommendations on how I can calculate the number of Cromwell jobs and Spark workers from my hardware.

Please point me in the right direction if there is already a tutorial that answers my question.

Thanks and best regards,

Best Answer


  • dbecker, Munich, Member ✭✭✭

    Is there anybody who thought about that already?
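
For the sizing question, here is a rough back-of-the-envelope sketch, not an official recommendation. It assumes each concurrent Cromwell job runs one Spark tool in local mode with a fixed number of threads and a fixed memory footprint, so the number of concurrent jobs times the per-job cores and RAM should stay within what the server has. The function name, the reserve values, and the example hardware numbers are all hypothetical; measure the real per-job usage on your own data before relying on the result.

```python
def max_concurrent_jobs(total_cores, total_ram_gb, cores_per_job, ram_per_job_gb,
                        reserved_cores=1, reserved_ram_gb=4):
    """Estimate how many jobs fit on one server at the same time.

    Assumption (not from any Cromwell or Spark documentation): every job uses
    exactly `cores_per_job` threads and `ram_per_job_gb` GB of RAM, and a small
    reserve is kept free for the OS and for Cromwell itself.
    """
    if cores_per_job <= 0 or ram_per_job_gb <= 0:
        raise ValueError("per-job cores and RAM must be positive")

    usable_cores = max(0, total_cores - reserved_cores)
    usable_ram_gb = max(0, total_ram_gb - reserved_ram_gb)

    jobs_by_cpu = usable_cores // cores_per_job
    jobs_by_ram = int(usable_ram_gb // ram_per_job_gb)

    # The scarcer resource decides; 0 means not even one job fits.
    return min(jobs_by_cpu, jobs_by_ram)


# Hypothetical server: 32 cores, 128 GB RAM.
# Hypothetical job: Spark in local mode with 8 threads and 24 GB of RAM.
limit = max_concurrent_jobs(total_cores=32, total_ram_gb=128,
                            cores_per_job=8, ram_per_job_gb=24)
print(limit)  # 3 (CPU allows 3 jobs, RAM allows 5, so CPU is the limit)
```

The resulting number would be a candidate value for concurrent-job-limit. The trade-off is between more concurrent jobs with fewer Spark threads each and fewer jobs with more threads each; the sketch only checks that the product does not oversubscribe the machine.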
