
BUG: Array[File] populated by read_lines() does not get delocalized paths

I have a script that writes the relative paths (e.g. ./path/to/file) of its varying outputs to a file.

In the output block of my WDL, I populate an Array[File] with these paths via read_lines():

output {
    Array[File] file_paths = read_lines("file_of_file_paths")
}

Running Cromwell locally, this works just fine, and file_paths is populated with the delocalized paths of these files. I confirmed this by running a multi-task workflow to completion in which this is an intermediate task and another task takes file_paths as an input. In FireCloud, however, file_paths is simply a list of the lines of file_of_file_paths rather than a list of delocalized paths.
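
For anyone trying to reproduce this, a minimal task along the following lines should show the behavior described. This is only a sketch: the task name, the two output files, and the docker image are placeholders assumed for illustration, not taken from the actual workflow.

```wdl
# Hypothetical minimal reproduction; names and docker image are placeholders.
task list_outputs {
    command {
        # Stand-ins for the script's real outputs
        echo "first" > a.txt
        echo "second" > b.txt

        # Record the relative path of each output, one per line
        echo "./a.txt" > file_of_file_paths
        echo "./b.txt" >> file_of_file_paths
    }

    output {
        # Each line of the file is coerced to a File
        Array[File] file_paths = read_lines("file_of_file_paths")
    }

    runtime {
        docker: "ubuntu:16.04"
    }
}
```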


Answers

  • KateN, Cambridge, MA (Member, Broadie, Moderator)

    Has this worked before for you on FireCloud? I don't think FireCloud currently supports relative paths the way the local environment does. If that's the case, I can put in a feature request for it.

  • dheiman (Member, Broadie)
    edited December 2017

    Hi @KateN, FireCloud must support paths relative to the working directory; otherwise the following wouldn't work:

    command {
        echo "hello world" > output.txt
    }
    
    output {
        File my_output = "output.txt"
    }
    

    Also, in my case the files are at the top level of the working directory anyway, so even if FireCloud doesn't recurse into relative paths that include directories, that wouldn't apply here.

  • KateN, Cambridge, MA (Member, Broadie, Moderator)

    Apologies for the delay in responding to you; I've been trying to confirm what the expected behavior is here as well as the possible source of the bug. I will let you know as soon as I have a more concrete answer.

  • grushton, Broad Institute (Member, Broadie, Dev)

    Hi David,

    In order for Cromwell to prepare a job request to send to JES, it has to know about all of the files that will need to be delocalized after the WDL runs. In this case, Cromwell doesn't have the final list of file names to delocalize prior to running, so it won't be able to delocalize the necessary files from JES.

  • dheiman (Member, Broadie)
    edited December 2017

    @grushton, I'm confused. By that explanation, globbing shouldn't work, but it clearly does (which is how I've gotten around this).

  • grushton, Broad Institute (Member, Broadie, Dev)

    Hi @dheiman,
    As it has been explained to me, yes, there is support for globbing, but not for your original approach. I will have to defer to a Cromwell expert as to why one is supported and the other is not, but you are correct: the globbing approach should work.

  • dheiman (Member, Broadie)

    @grushton, as globbing has been explained to me (https://gatkforums.broadinstitute.org/wdl/discussion/comment/38627/#Comment_38627), it doesn't occur until after the job request has been made, since it uses whatever bash instance is available in the task's docker image. That's why I found your explanation confusing.

    It's also very disorienting that functionality varies depending on the backend. Unless it's been clearly documented, I still consider this inconsistency a bug; the same WDL should run the same way no matter what executes it.

    I updated my affected task to use globs after I originally made this report, but it is a far less elegant solution.
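
    A minimal sketch of what a glob-based version of such a task could look like is below. The task name, the outputs/ directory, the file names, and the docker image are all assumptions made for illustration; the actual task is not shown in this thread. Only glob() itself is standard WDL.

```wdl
# Hypothetical glob-based workaround; names and docker image are placeholders.
task write_outputs {
    command {
        # Collect the outputs in one directory so a single pattern matches them
        mkdir outputs
        echo "first" > outputs/a.txt
        echo "second" > outputs/b.txt
    }

    output {
        # The pattern, not a list of names, is what the output declares
        Array[File] file_paths = glob("outputs/*.txt")
    }

    runtime {
        docker: "ubuntu:16.04"
    }
}
```

    The trade-off is that the outputs have to be describable by a pattern, for example by sharing a directory or an extension, instead of being listed explicitly by the script.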

  • grushton, Broad Institute (Member, Broadie, Dev)

    @KateN - Please see David's comment above - I think he's correct that this could use some additional documentation.
