What does 'scatter (sample_index in range(length(CollectCounts.entity_id)))' do?

shlee (Cambridge) Member, Broadie, Moderator admin
edited November 2018 in Ask the Cromwell + WDL Team


I need to know what scatter (sample_index in range(length(CollectCounts.entity_id))) does. This is in the context of the gCNV workflow and this line comes from https://github.com/broadinstitute/gatk/blob/

I assumed it was looking for the length of the array based on the WDL specification for length:



  • ChrisL (Cambridge, MA) Member, Broadie, Moderator, Dev admin
    edited November 2018

    You're correct, it does the following:

    • CollectCounts.entity_id is an array
    • length(CollectCounts.entity_id) calculates the length of that array
    • range(length(CollectCounts.entity_id)) produces a new array of integers from 0 to one less than the length of the original array (e.g. [0, 1, 2, ..., 204] for an array of 205 entries)

    So scattering over that expression gives each iteration of the scatter one integer, sample_index, which is an index into that array. That's how it's used in the task: to look up an entry in the original array and to tell the task the index of that file within the original array.
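
    The pattern can be sketched in a small standalone workflow. This is only an illustration, not the gCNV workflow itself: the names ScatterByIndex, DescribeSample and entity_ids, the three sample values, and the use of WDL version 1.0 syntax are all assumptions made for the example.

    ```wdl
    version 1.0

    # Hypothetical task: receives the whole array plus one index into it.
    task DescribeSample {
      input {
        Array[String] entity_ids
        Int sample_index
      }
      command <<<
        echo "index ~{sample_index} is ~{entity_ids[sample_index]}"
      >>>
      output {
        String description = read_string(stdout())
      }
    }

    workflow ScatterByIndex {
      input {
        # Made-up values standing in for CollectCounts.entity_id.
        Array[String] entity_ids = ["sample_a", "sample_b", "sample_c"]
      }

      # length(entity_ids) is 3, so range(...) is [0, 1, 2];
      # the scatter body runs once per index.
      scatter (sample_index in range(length(entity_ids))) {
        call DescribeSample {
          input:
            entity_ids = entity_ids,
            sample_index = sample_index
        }
      }

      output {
        # Outputs of a scattered call are gathered into an array.
        Array[String] descriptions = DescribeSample.description
      }
    }
    ```

    Scattering over the index rather than over the array elements themselves is useful when the task needs both the element and its position, or when the same index has to pick matching entries out of several parallel arrays.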
