Scatter - Gather

The scatter block is meant to parallelize a series of identical tasks but give them slightly different inputs. The simplest example is:
    task inc {
      Int i
      command <<<
        python -c "print(${i} + 1)"
      >>>
      output {
        Int incremented = read_int(stdout())
      }
    }

    workflow wf {
      Array[Int] integers = [1,2,3,4,5]
      scatter(i in integers) {
        call inc {input: i=i}
      }
    }
Running this workflow (which needs no inputs) would yield a value of [2,3,4,5,6] for wf.inc.incremented. While task inc itself returns an Int, when it is called inside a scatter block and referenced from outside that block, the type becomes an Array[Int].
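The type change can be modeled outside of WDL. The Python sketch below is only an analogy and not how Cromwell executes a scatter: `inc` and `scatter` are hypothetical helper names standing in for the task and the scatter block, and a thread pool stands in for the parallel shards. It assumes the gathered results keep the order of the input array.

```python
from concurrent.futures import ThreadPoolExecutor


def inc(i: int) -> int:
    """Stand-in for task inc: a single call returns a single Int."""
    return i + 1


def scatter(task, items):
    """Run task once per item, possibly in parallel, and gather
    the results into a list that follows the input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(task, items))


integers = [1, 2, 3, 4, 5]
incremented = scatter(inc, integers)
print(incremented)  # [2, 3, 4, 5, 6]
```

One call to `inc` yields an `int`, while the scattered call yields a `list` of them, which mirrors Int becoming Array[Int].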
Any task that's downstream from the call to inc and outside the scatter block must accept an Array[Int]:
    task inc {
      Int i
      command <<<
        python -c "print(${i} + 1)"
      >>>
      output {
        Int incremented = read_int(stdout())
      }
    }

    task sum {
      Array[Int] ints
      command <<<
        python -c "print(${sep="+" ints})"
      >>>
      output {
        Int sum = read_int(stdout())
      }
    }

    workflow wf {
      Array[Int] integers = [1,2,3,4,5]
      scatter (i in integers) {
        call inc {input: i=i}
      }
      call sum {input: ints = inc.incremented}
    }
This workflow will output a value of 20 for wf.sum.sum. This works because call inc is in the scatter block, so outside that block inc.incremented is an Array[Int].
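To see where 20 comes from, it helps to look at what the sum task's command becomes once `${sep="+" ints}` is interpolated. The Python sketch below imitates that substitution; `render_sum_command` is a hypothetical helper written for illustration, not part of WDL or Cromwell.

```python
def render_sum_command(ints):
    """Imitate ${sep="+" ints}: join the array elements with "+"
    and place the result inside the task's command template."""
    expression = "+".join(str(n) for n in ints)
    return 'python -c "print({})"'.format(expression)


# Gathered output of the scattered inc calls for integers = [1,2,3,4,5].
gathered = [2, 3, 4, 5, 6]

command = render_sum_command(gathered)
total = sum(gathered)

print(command)  # python -c "print(2+3+4+5+6)"
print(total)    # 20
```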
However, from inside the scope of the scatter block, the output of call inc is still an Int. So the following is valid:
    workflow wf {
      Array[Int] integers = [1,2,3,4,5]
      scatter(i in integers) {
        call inc {input: i=i}
        call inc as inc2 {input: i=inc.incremented}
      }
      call sum {input: ints = inc2.incremented}
    }
In this example, inc and inc2 are called in series within each scatter iteration, where the output of inc is fed to inc2. Outside the scatter block, inc2.incremented would be the array [3,4,5,6,7].
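The difference between the inside and outside views of a call's output can be sketched in Python. This is an analogy under stated assumptions rather than Cromwell's execution model: the loop body plays the role of one scatter shard, and the names `inc_out` and `inc2_out` are hypothetical stand-ins for the gathered outputs of inc and inc2.

```python
def inc(i: int) -> int:
    """Stand-in for task inc."""
    return i + 1


integers = [1, 2, 3, 4, 5]
inc_out, inc2_out = [], []

for i in integers:        # one iteration per scatter shard
    first = inc(i)        # inside the shard, inc's output is a single Int
    second = inc(first)   # so inc2 can consume it directly
    inc_out.append(first)
    inc2_out.append(second)

# Outside the loop (the scatter block), each output is seen as an array.
print(inc_out)        # [2, 3, 4, 5, 6]
print(inc2_out)       # [3, 4, 5, 6, 7]
print(sum(inc2_out))  # 25
```

Under this model, the sum task in the workflow above would receive [3,4,5,6,7] and return 25.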