
Scatter Gather and Spark together

dbecker, Munich, Member
edited July 2017 in Ask the WDL team


I can't find any recommendations on how to use scatter-gather and Spark together.
We do panel diagnostics and whole exomes, so a run may contain up to 60 samples. My first idea was to use scatter-gather to analyse a few samples at the same time. With a concurrent-job-limit > 3, our server crashed because we ran out of RAM and all cores were at 100%. Since I plan to use Spark in the future, I wanted to know

  • if it is a good idea to use scatter-gather and Spark together,
  • how they work with each other,
  • and if there are recommendations on how to calculate the number of Cromwell jobs and Spark workers from my hardware.

Please just point me to it if there is already a tutorial that answers my question.

Thanks and best regards,

Best Answer


  • dbecker, Munich, Member

    Has anybody thought about this already?
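
For the sizing question in the original post, here is a rough back-of-the-envelope sketch in Python. It is not an official recommendation. It assumes each concurrent Cromwell job needs a fixed number of cores (for a Spark tool in local mode, the number of local Spark threads) and a fixed amount of RAM, and it takes the smaller of the two limits as a candidate concurrent-job-limit. The function name, the reserved headroom, and the example hardware figures are all hypothetical; per-job memory would have to be measured on real runs.

```python
def max_concurrent_jobs(total_cores, total_ram_gb,
                        cores_per_job, ram_per_job_gb,
                        reserved_cores=1, reserved_ram_gb=4):
    """Estimate how many jobs fit on one machine at the same time.

    All inputs are assumptions supplied by the user; nothing here is
    read from Cromwell or Spark.
    """
    if cores_per_job <= 0 or ram_per_job_gb <= 0:
        raise ValueError("per-job requirements must be positive")

    # Leave some headroom for the OS and the Cromwell server itself.
    usable_cores = max(total_cores - reserved_cores, 0)
    usable_ram_gb = max(total_ram_gb - reserved_ram_gb, 0)

    # Each resource gives its own upper bound; the tighter one wins.
    by_cores = int(usable_cores // cores_per_job)
    by_ram = int(usable_ram_gb // ram_per_job_gb)
    return min(by_cores, by_ram)


# Hypothetical server: 16 cores, 64 GB RAM; each job uses 4 threads and 16 GB.
limit = max_concurrent_jobs(16, 64, cores_per_job=4, ram_per_job_gb=16)
print(limit)
```

Under these assumptions, the result is the kind of value one might try for concurrent-job-limit. If the scattered tasks themselves run Spark with several local threads, cores_per_job should reflect that, since scatter parallelism and Spark threads multiply.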
