Update: July 26, 2019
This section of the forum is no longer actively monitored. We are working on a support migration plan that we will share here shortly. Apologies for this inconvenience.

What directory structure should subworkflow zip have?

gauthier Member, Broadie, Moderator, Dev admin

I'm trying to run the workflow at https://github.com/gatk-workflows/five-dollar-genome-analysis-pipeline/blob/master/germline_single_sample_workflow.wdl on a v29 Cromwell server, but the input processing fails. I get the error:

Failed to import workflow ./unmapped_bam_to_aligned_bam.wdl.:
File not found /tmp/4518464444991084952.zip3520711126153810536/tasks/unmapped_bam_to_aligned_bam.wdl
File not found /unmapped_bam_to_aligned_bam.wdl
unmapped_bam_to_aligned_bam.wdl: Name or service not known

The docs claim that the directory structure inside the zip should be the same as in the import statement: http://cromwell.readthedocs.io/en/develop/Imports/

My zip looks like this:

wm963-eb4:~/workspaces/five-dollar-genome-analysis-pipeline $ unzip -l tasks.zip
Archive:  tasks.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  02-16-2018 15:59   tasks/
     4793  02-15-2018 10:36   tasks/alignment.wdl
    10530  02-15-2018 10:36   tasks/bam_processing.wdl
     4399  02-15-2018 10:36   tasks/germline_variant_discovery.wdl
    15156  02-15-2018 10:36   tasks/qc.wdl
     6603  02-15-2018 10:36   tasks/utilities.wdl
    19972  02-15-2018 10:36   unmapped_bam_to_aligned_bam.wdl
---------                     -------
    61453                     7 files

My submit command is

curl -s -F workflowSource=@germline_single_sample_workflow.wdl -F workflowInputs=@germline_single_sample_workflow.hg38.inputs.json -F workflowOptions=@options.json -F workflowDependencies=@tasks.zip https://cromwell-v29.dsde-methods.broadinstitute.org/api/workflows/v1

where options.json is empty and tasks.zip is as above.

If I move the unmapped_bam_to_aligned_bam.wdl into the tasks folder inside the zip, the error I get is for the next import:

Failed to import workflow tasks/germline_variant_discovery.wdl.:
File not found /tmp/3009484190083449654.zip7335927595109280368/tasks/tasks/germline_variant_discovery.wdl
File not found /tasks/germline_variant_discovery.wdl
tasks%2Fgermline_variant_discovery.wdl: Name or service not known

The /tasks/tasks/ in the error above makes me think that Cromwell is creating a directory for the output of the zip, which is not what I was led to believe by the docs. What should the zip file structure look like for the import statements in this WDL?
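
For comparison, here is a minimal Python sketch that checks whether each import path has a matching entry in the archive. It assumes imports resolve relative to the zip root, which is the layout the docs describe; it does not reproduce Cromwell's actual lookup, which the errors above suggest behaves differently. The function name `unresolved_imports` is hypothetical, and the import paths are the two that appear in the error messages.

```python
import posixpath
import zipfile


def unresolved_imports(zip_path, import_paths):
    """Return the import paths that have no matching entry in the zip.

    Assumption: imports are resolved relative to the zip root, as the
    docs describe. Cromwell's real resolution may differ.
    """
    with zipfile.ZipFile(zip_path) as zf:
        entries = set(zf.namelist())

    missing = []
    for path in import_paths:
        # "./foo.wdl" and "foo.wdl" should refer to the same entry.
        normalized = posixpath.normpath(path)
        if normalized not in entries:
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Import paths taken from the error messages in this thread.
    imports = [
        "./unmapped_bam_to_aligned_bam.wdl",
        "tasks/germline_variant_discovery.wdl",
    ]
    print(unresolved_imports("tasks.zip", imports))
```

An empty list means every import path has a matching entry under the docs' stated layout, so a failure at submit time would point at the server-side resolution and not the archive contents.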

Answers

  • Thib Cambridge Member, Broadie, Dev ✭✭

    That looks to me like it should work, let me take a look at it.

  • Thib Cambridge Member, Broadie, Dev ✭✭

    Actually, once you add split_large_readgroup.wdl, then it becomes the first entry in the zip, so you don't need to rename unmapped_bam_to_aligned_bam.
    So just adding split_large_readgroup.wdl to your tasks.zip should get you going.

  • ChrisL Cambridge, MA Member, Broadie, Moderator, Dev admin

    @Thib Is this another case of "expecting relative import paths" vs "imports are currently relative to the top-level workflow's directory"?

  • gauthier Member, Broadie, Moderator, Dev admin

    @Thib After adding split_large_readgroup.wdl to the zip it's become apparent that the list of files in the zip is sorted by date/time added. Otherwise, the workaround of reordering the files does work in version 30. Thanks!

  • I am also struggling with this. It seems that imports are never found when I pass a zip file through the imports parameter. I am working with Cromwell v31.

    I am unsure whether I am doing something wrong (zip file structure) or whether it is a bug. I filed an issue in the Cromwell GitHub repo here. I have yet to figure this out. Thanks in advance for any help or advice on the topic!
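
The reordering workaround discussed above by Thib and gauthier depends on controlling which entry comes first in the archive. A minimal Python sketch of writing a zip with an explicit entry order follows. The helper names `write_zip_in_order` and `entry_order` are hypothetical, and the thread does not settle which entry Cromwell needs first, so the order is left to the caller.

```python
import zipfile


def write_zip_in_order(zip_path, members):
    """Create a zip whose entries appear in exactly the given order.

    `members` is a list of (archive_name, source_path) pairs. Python's
    zipfile writes entries in the order they are added.
    """
    with zipfile.ZipFile(zip_path, "w") as zf:
        for archive_name, source_path in members:
            zf.write(source_path, arcname=archive_name)


def entry_order(zip_path):
    """Return the archive's entry names in stored order."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()
```

Calling `entry_order` on an existing tasks.zip gives the same ordering that `unzip -l` prints, which makes it easy to confirm the first entry before submitting.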
