Update: July 26, 2019
This section of the forum is now closed; we are working on a new support model for WDL that we will share here shortly. For Cromwell-specific issues, see the Cromwell docs and post questions on Github.

Scatter Gather and Spark together

dbecker, Munich, Member ✭✭✭


I can't find any recommendations on how to use scatter-gather and Spark together.
We do panel diagnostics and whole exomes, so a run may contain up to 60 samples. My first idea was to use scatter-gather to analyse a few samples at the same time. With a concurrent-job-limit greater than 3, our server crashed because we ran out of RAM and all cores were at 100%. Since I plan to use Spark in the future, I wanted to know

  • if it is a good idea to use scatter-gather and Spark together,
  • how they work with each other,
  • and if there are recommendations on how I can calculate the number of Cromwell jobs and Spark workers from my hardware.
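
To make the sizing question in the last bullet concrete, here is a minimal sketch of the kind of calculation it asks about. It is not an official GATK or Cromwell recommendation. It assumes every scattered job needs a fixed number of cores (for example, the N in a Spark tool's `--spark-master local[N]`) and a fixed amount of RAM, and that some resources are held back for the OS and Cromwell itself. The function name, the reserve values and the example numbers are all hypothetical.

```python
def max_concurrent_jobs(total_cores, total_ram_gb,
                        cores_per_job, ram_gb_per_job,
                        reserved_cores=1, reserved_ram_gb=4):
    """Estimate how many jobs fit on one server at the same time.

    Assumption: each job uses exactly cores_per_job cores and
    ram_gb_per_job GB of RAM; real usage should be measured first.
    """
    if cores_per_job <= 0 or ram_gb_per_job <= 0:
        raise ValueError("per-job resources must be positive")

    # Hold back a reserve for the OS and the Cromwell process (hypothetical values).
    usable_cores = total_cores - reserved_cores
    usable_ram_gb = total_ram_gb - reserved_ram_gb

    # The tighter of the two budgets (cores or RAM) is the limit.
    by_cores = usable_cores // cores_per_job
    by_ram = int(usable_ram_gb // ram_gb_per_job)

    # Always allow at least one job so the workflow can make progress.
    return max(1, min(by_cores, by_ram))


if __name__ == "__main__":
    # Hypothetical server: 16 cores, 64 GB RAM; each job uses 4 cores and 24 GB.
    print(max_concurrent_jobs(16, 64, cores_per_job=4, ram_gb_per_job=24))
```

Under these assumptions the result would be a candidate value for concurrent-job-limit. In the example RAM is the bottleneck, so adding Spark cores per job would not allow more jobs to run side by side.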

If there is already a tutorial that answers my question, please just point me to it.

Thanks and best regards,

Best Answer


  • dbecker, Munich, Member ✭✭✭

    Has anybody already thought about this?
