
What directory structure should subworkflow zip have?

gauthier Member, Broadie, Dev ✭✭✭

I'm trying to run the workflow at https://github.com/gatk-workflows/five-dollar-genome-analysis-pipeline/blob/master/germline_single_sample_workflow.wdl on a v29 Cromwell server, but the input processing fails. I get the error:

Failed to import workflow ./unmapped_bam_to_aligned_bam.wdl.:
File not found /tmp/4518464444991084952.zip3520711126153810536/tasks/unmapped_bam_to_aligned_bam.wdl
File not found /unmapped_bam_to_aligned_bam.wdl
unmapped_bam_to_aligned_bam.wdl: Name or service not known

The docs claim that the directory structure inside the zip should be the same as in the import statement: http://cromwell.readthedocs.io/en/develop/Imports/

My zip looks like this:

wm963-eb4:~/workspaces/five-dollar-genome-analysis-pipeline $ unzip -l tasks.zip
Archive:  tasks.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  02-16-2018 15:59   tasks/
     4793  02-15-2018 10:36   tasks/alignment.wdl
    10530  02-15-2018 10:36   tasks/bam_processing.wdl
     4399  02-15-2018 10:36   tasks/germline_variant_discovery.wdl
    15156  02-15-2018 10:36   tasks/qc.wdl
     6603  02-15-2018 10:36   tasks/utilities.wdl
    19972  02-15-2018 10:36   unmapped_bam_to_aligned_bam.wdl
---------                     -------
    61453                     7 files

My submit command is

curl -s -F workflowSource=@germline_single_sample_workflow.wdl -F workflowInputs=@germline_single_sample_workflow.hg38.inputs.json -F workflowOptions=@options.json -F workflowDependencies=@tasks.zip https://cromwell-v29.dsde-methods.broadinstitute.org/api/workflows/v1

where options.json is empty and tasks.zip is as above.

If I move the unmapped_bam_to_aligned_bam.wdl into the tasks folder inside the zip, the error I get is for the next import:

Failed to import workflow tasks/germline_variant_discovery.wdl.:
File not found /tmp/3009484190083449654.zip7335927595109280368/tasks/tasks/germline_variant_discovery.wdl
File not found /tasks/germline_variant_discovery.wdl
tasks%2Fgermline_variant_discovery.wdl: Name or service not known

The /tasks/tasks/ in the error above makes me think that Cromwell is creating a directory for the output of the zip, which is not what I was led to believe by the docs. What should the zip file structure look like for the import statements in this WDL?
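
One way to sanity-check a zip against the rule stated in the docs is to compare each import string in the WDL with the entry names in the archive. The following is a minimal Python sketch, not part of Cromwell: the helper names imported_paths and missing_from_zip are made up for illustration, and the check only covers the documented rule (entry names match import strings). It says nothing about how a particular Cromwell version actually resolves paths, which is the open question here.

```python
import re
import zipfile

# Matches WDL import statements such as: import "tasks/alignment.wdl" as Alignment
IMPORT_RE = re.compile(r'''^\s*import\s+["']([^"']+)["']''', re.MULTILINE)


def imported_paths(wdl_text):
    """Return the local import paths of a WDL document.

    A leading './' is dropped so the path can be compared with zip entry
    names; http(s) imports are skipped because they do not come from the zip.
    """
    paths = []
    for raw in IMPORT_RE.findall(wdl_text):
        if raw.startswith(("http://", "https://")):
            continue
        paths.append(raw[2:] if raw.startswith("./") else raw)
    return paths


def missing_from_zip(wdl_text, zip_path):
    """Return the import paths that have no entry of the same name in the zip."""
    with zipfile.ZipFile(zip_path) as archive:
        entries = set(archive.namelist())
    return [path for path in imported_paths(wdl_text) if path not in entries]
```

An empty list from missing_from_zip means every local import has a same-named entry in the zip. It is worth running on each imported subworkflow as well as on the top-level WDL, since a subworkflow's own imports have to be resolvable too.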

Best Answer


  • Thib Cambridge Member, Broadie, Dev ✭✭

    That looks to me like it should work, let me take a look at it.

  • Thib Cambridge Member, Broadie, Dev ✭✭

    Actually, once you add split_large_readgroup.wdl, then it becomes the first entry in the zip, so you don't need to rename unmapped_bam_to_aligned_bam.
    So just adding split_large_readgroup.wdl to your tasks.zip should get you going.

  • ChrisL Cambridge, MA Member, Broadie, Dev admin

    @Thib Is this another case of "expecting relative import paths" vs "imports are currently relative to the top-level workflow's directory"?

  • gauthier Member, Broadie, Dev ✭✭✭

    @Thib After adding split_large_readgroup.wdl to the zip it's become apparent that the list of files in the zip is sorted by date/time added. Otherwise, the workaround of reordering the files does work in version 30. Thanks!

  • I am also struggling with this. It seems that imports are never found when a zip file is passed through the imports parameter. I am working with Cromwell v31.

    I am unsure whether I am doing something wrong (the zip file structure) or whether it is a bug. I opened an issue in the Cromwell GitHub repo here. I have yet to figure this out. Thanks in advance for any help or advice on the topic!
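
For anyone trying the reordering workaround mentioned above, the entry order of a zip can be set explicitly instead of relying on the order in which a zip tool happened to add files. This is a minimal Python sketch under that assumption: write_zip_in_order is a hypothetical helper, and whether entry order matters for a given Cromwell version is only known from the reports in this thread (v29 and v30).

```python
import os
import zipfile


def write_zip_in_order(zip_path, root_dir, relative_paths):
    """Create a zip whose entries appear in exactly the order given.

    Each file is stored under its path relative to root_dir, so the layout
    inside the archive mirrors the directory layout on disk.
    Returns the entry names as read back from the finished archive.
    """
    with zipfile.ZipFile(zip_path, "w") as archive:
        for rel in relative_paths:
            archive.write(os.path.join(root_dir, rel), arcname=rel)
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()
```

For example, passing ["split_large_readgroup.wdl", "unmapped_bam_to_aligned_bam.wdl", "tasks/alignment.wdl"] puts a top-level WDL file first and the tasks/ entries after it, which matches the situation Thib describes.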
