String substitution over-quotes string in output section when default value is used

Hi folks,

I'm not sure whether I'm misunderstanding WDL, or whether Cromwell is interpreting it differently than I am.

When I set an input String to a value, it behaves the way I would expect in terms of how it gets substituted within the command section and within the output section of a task. But when I give an input String a default value and don't bother setting it upon execution, it behaves the way I would expect in the command section, but in the output section, it gets extra double quotes thrown around it, which causes errors for delocalization.

I'm pasting in a minimal working example WDL (the real one I'm working on is more complicated, but this reproduces the behavior more simply). The one input file is publicly available and small, so you should be able to verify it easily.

task test_extract_2 {
  File tarball
  String out_filename_1
  String? out_filename_2="SampleSheet.csv"

  command {
    set -ex -o pipefail
    tar -xzf ${tarball}
    ls -alF ${out_filename_1} # this works and comes out in the stderr log as: ls -alF RunInfo.xml
    ls -alF ${out_filename_2} # this works and comes out in the stderr log as: ls -alF SampleSheet.csv
  }

  output {
    File out_file_1 = "${out_filename_1}" # this works and delocalizes properly
    File out_file_2 = "${out_filename_2}" # this errors, as gsutil looks for a file called \"SampleSheet.csv\" which does not exist
  }

  runtime {
    docker: "phusion/baseimage:0.9.22"
    memory: "2GB"
    cpu: 1
  }
}

workflow test_strings {
  call test_extract_2 {
    input:
      tarball="gs://sabeti-public/dpark-test/AJH8U.tar.gz",
      out_filename_1="RunInfo.xml"
  }
}

Excerpt from the actual output log when running Cromwell v29 (from my Mac, via Homebrew) on the Google JES backend:

2017/11/14 14:31:54 I: Docker file /cromwell_root/RunInfo.xml maps to host location /mnt/local-disk/RunInfo.xml.
2017/11/14 14:31:54 I: Running command: sudo gsutil -q -m cp -L /var/log/google-genomics/out.log /mnt/local-disk/RunInfo.xml gs://sabeti-temp-30d/dpark/cromwell-test/test_strings/e1d7c90c-05b3-4c12-8edf-2290133f4e92/call-test_extract_2/RunInfo.xml
2017/11/14 14:31:56 I: Deleting log file
2017/11/14 14:31:56 I: Running command: sudo rm -f /var/log/google-genomics/out.log
2017/11/14 14:31:56 I: Switching to status: copied 1 file(s) to "gs://sabeti-temp-30d/dpark/cromwell-test/test_strings/e1d7c90c-05b3-4c12-8edf-2290133f4e92/call-test_extract_2/RunInfo.xml"
2017/11/14 14:31:56 I: Calling SetOperationStatus(copied 1 file(s) to "gs://sabeti-temp-30d/dpark/cromwell-test/test_strings/e1d7c90c-05b3-4c12-8edf-2290133f4e92/call-test_extract_2/RunInfo.xml")
2017/11/14 14:31:56 I: SetOperationStatus(copied 1 file(s) to "gs://sabeti-temp-30d/dpark/cromwell-test/test_strings/e1d7c90c-05b3-4c12-8edf-2290133f4e92/call-test_extract_2/RunInfo.xml") succeeded
2017/11/14 14:31:56 I: Docker file /cromwell_root/"SampleSheet.csv" maps to host location /mnt/local-disk/"SampleSheet.csv".
2017/11/14 14:31:56 I: Running command: sudo gsutil -q -m cp -L /var/log/google-genomics/out.log /mnt/local-disk/"SampleSheet.csv" gs://sabeti-temp-30d/dpark/cromwell-test/test_strings/e1d7c90c-05b3-4c12-8edf-2290133f4e92/call-test_extract_2/"SampleSheet.csv"
2017/11/14 14:31:56 E: command failed: CommandException: No URLs matched: /mnt/local-disk/"SampleSheet.csv"
CommandException: 1 file/object could not be transferred.
 (exit status 1)

Is this a Cromwell bug, or am I doing it wrong?

BTW, I know in this MWE, I could just remove the double quotes from the output section, but in the more complicated version, I'm adding some prefixes and suffixes to the string to get the full filename. I also know that there are older ways of specifying default values in WDL, but I'm using this variable in multiple places in the command and output sections, and I don't want to have to specify the default value multiple times.
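
For reference, the older style I'm trying to avoid would look roughly like the sketch below. I have not run this exact version. It assumes the `${default="..." var}` placeholder option works in the command section as described in the draft-2 WDL spec and that `select_first` is available in the Cromwell version in use. The task name `test_extract_2_old_style` is made up for illustration. Note that the default value has to be written once in the command section and again in the output section, which is exactly the duplication I'd like to avoid.

```wdl
# Hypothetical variant of the task above using the older default style.
# Untested sketch: the default "SampleSheet.csv" is repeated in each
# section instead of being declared once on the input.
task test_extract_2_old_style {
  File tarball
  String? out_filename_2   # optional, no default on the declaration

  command {
    set -ex -o pipefail
    tar -xzf ${tarball}
    # default supplied at the point of use in the command
    ls -alF ${default="SampleSheet.csv" out_filename_2}
  }

  output {
    # default supplied again here; select_first picks the first defined value
    File out_file_2 = select_first([out_filename_2, "SampleSheet.csv"])
  }

  runtime {
    docker: "phusion/baseimage:0.9.22"
    memory: "2GB"
    cpu: 1
  }
}
```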