Update: July 26, 2019
This section of the forum is now closed; we are working on a new support model for WDL that we will share here shortly. For Cromwell-specific issues, see the Cromwell docs and post questions on Github.

Standard Library

Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

File stdout()

Returns a File reference to the stdout that this task generated.

File stderr()

Returns a File reference to the stderr that this task generated.

Array[String] read_lines(String|File)

Given a file-like object (String, File) as a parameter, this will read each line as a string and return an Array[String] representation of the lines in the file.

The order of the lines in the returned Array[String] must be the order in which the lines appear in the file-like object.

This task would grep through a file and return all strings that matched the pattern:

task do_stuff {
  String pattern
  File file
  command {
    grep '${pattern}' ${file}
  }
  output {
    Array[String] matches = read_lines(stdout())
  }
}

Array[Array[String]] read_tsv(String|File)

The read_tsv() function takes one parameter, a file-like object (String, File), and returns an Array[Array[String]] representing the table in the TSV file.

If the parameter is a String, this is assumed to be a local file path relative to the current working directory of the task.

For example, if I write a task that outputs a file to ./results/file_list.tsv, and my task is defined as:

task do_stuff {
  File file
  command {
    python do_stuff.py ${file}
  }
  output {
    Array[Array[String]] output_table = read_tsv("./results/file_list.tsv")
  }
}

Then when the task finishes, to fulfill the output_table variable, ./results/file_list.tsv must be a valid TSV file or an error will be reported.
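As a rough illustration of the shape read_tsv() returns, here is a minimal Python sketch. It is not the engine's implementation; the helper name read_tsv_sketch is hypothetical, and it takes the file contents as a string rather than a path.

```python
def read_tsv_sketch(text):
    """Split TSV text into rows of string fields (hypothetical helper)."""
    # One row per line, one field per tab-separated value.
    # Fields stay strings, matching Array[Array[String]].
    return [line.split("\t") for line in text.splitlines()]


table = read_tsv_sketch("a\tb\tc\n1\t2\t3\n")
print(table)  # [['a', 'b', 'c'], ['1', '2', '3']]
```

Note that every cell is a string, so numeric columns would still need converting by whatever consumes the table.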

Map[String, String] read_map(String|File)

Given a file-like object (String, File) as a parameter, this will read each line from a file and expect the line to have the format col1\tcol2. In other words, the file-like object must be a two-column TSV file.

The following task would write a two-column TSV to standard out, which would be interpreted as a Map[String, String]:

task do_stuff {
  String flags
  File file
  command {
    ./script --flags=${flags} ${file}
  }
  output {
    Map[String, String] mapping = read_map(stdout())
  }
}
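The two-column requirement can be sketched in Python as follows. This only emulates the behaviour described for read_map(); the helper name read_map_sketch and the choice of ValueError for malformed lines are assumptions.

```python
def read_map_sketch(text):
    """Parse two-column TSV text into a dict (hypothetical helper)."""
    mapping = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        # Every line must have exactly the form col1\tcol2.
        if len(fields) != 2:
            raise ValueError(
                "line %d has %d columns, expected 2" % (line_number, len(fields))
            )
        key, value = fields
        mapping[key] = value
    return mapping


print(read_map_sketch("key1\tvalue1\nkey2\tvalue2\n"))
# {'key1': 'value1', 'key2': 'value2'}
```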

Object read_object(String|File)

Given a file-like object that contains a 2-row and n-column TSV file, this function will turn that into an Object.

task test {
  command <<<
    python <<CODE
    print('\t'.join(["key_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    CODE
  >>>
  output {
    Object my_obj = read_object(stdout())
  }
}

The command will output to stdout the following:

key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3

Which would be turned into an Object in WDL that would look like this:

Attribute Value
key_1 "value_1"
key_2 "value_2"
key_3 "value_3"
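The pairing of the header row with the value row can be sketched in Python like this. It is an illustration only: read_object_sketch is a hypothetical name, a dict stands in for a WDL Object, and the error handling is an assumption.

```python
def read_object_sketch(text):
    """Turn a 2-row, n-column TSV into a dict of attribute -> value."""
    rows = [line.split("\t") for line in text.splitlines()]
    # Exactly two rows are expected: attribute names, then values.
    if len(rows) != 2 or len(rows[0]) != len(rows[1]):
        raise ValueError("expected 2 rows with the same number of columns")
    names, values = rows
    return dict(zip(names, values))


obj = read_object_sketch("key_1\tkey_2\tkey_3\nvalue_1\tvalue_2\tvalue_3\n")
print(obj)
# {'key_1': 'value_1', 'key_2': 'value_2', 'key_3': 'value_3'}
```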

Array[Object] read_objects(String|File)

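Given a file-like object that contains a TSV file with a header row followed by one or more rows of values (2 or more rows, n columns), this function will turn that into an Array[Object], with one Object per row of values.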

task test {
  command <<<
    python <<CODE
    print('\t'.join(["key_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    CODE
  >>>
  output {
    Array[Object] my_obj = read_objects(stdout())
  }
}

The command will output to stdout the following:

key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3
value_1\tvalue_2\tvalue_3
value_1\tvalue_2\tvalue_3

Which would be turned into an Array[Object] in WDL that would look like this:

Index  Attribute  Value
0      key_1      "value_1"
       key_2      "value_2"
       key_3      "value_3"
1      key_1      "value_1"
       key_2      "value_2"
       key_3      "value_3"
2      key_1      "value_1"
       key_2      "value_2"
       key_3      "value_3"
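For the multi-row case, a Python sketch of the same idea pairs the header with each value row in turn. The name read_objects_sketch is hypothetical, a list of dicts stands in for Array[Object], and the validation rules are assumptions.

```python
def read_objects_sketch(text):
    """Turn a header row plus value rows into a list of dicts."""
    rows = [line.split("\t") for line in text.splitlines()]
    if len(rows) < 2:
        raise ValueError("expected a header row and at least one value row")
    header, value_rows = rows[0], rows[1:]
    for row in value_rows:
        # Each value row must line up with the header.
        if len(row) != len(header):
            raise ValueError("row width does not match the header")
    return [dict(zip(header, row)) for row in value_rows]


tsv = "key_1\tkey_2\tkey_3\n" + "value_1\tvalue_2\tvalue_3\n" * 3
objects = read_objects_sketch(tsv)
print(len(objects))         # 3
print(objects[0]["key_2"])  # value_2
```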

mixed read_json(String|File)

The read_json() function takes one parameter, a file-like object (String, File), and returns a data type that matches the data structure in the JSON file. The mapping of JSON type to WDL type is:

JSON Type WDL Type
object Map[String, ?]
array Array[?]
number Int or Float
string String
boolean Boolean
null ???

If the parameter is a String, this is assumed to be a local file path relative to the current working directory of the task.

For example, if I write a task that outputs a file to ./results/file_list.json, and my task is defined as:

task do_stuff {
  File file
  command {
    python do_stuff.py ${file}
  }
  output {
    Map[String, String] output_table = read_json("./results/file_list.json")
  }
}

Then when the task finishes, to fulfill the output_table variable, ./results/file_list.json must be a valid JSON file or an error will be reported.
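To make the JSON-to-WDL type table concrete, the following Python sketch parses a small document with the standard json module and labels each value with the WDL type from the table. The function wdl_type_name is hypothetical, and null is deliberately left unhandled because the table does not define a mapping for it.

```python
import json


def wdl_type_name(value):
    """Return the WDL type the table lists for a parsed JSON value."""
    # bool must be tested before int: in Python, True is also an int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Int"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        return "Array[?]"
    if isinstance(value, dict):
        return "Map[String, ?]"
    raise TypeError("no WDL type listed for %r" % (value,))


doc = json.loads('{"name": "s1", "count": 3, "ratio": 0.5, "ok": true, "tags": ["a"]}')
print(wdl_type_name(doc))  # Map[String, ?]
for key, value in doc.items():
    print(key, wdl_type_name(value))
```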

Int read_int(String|File)

The read_int() function takes a file path which is expected to contain 1 line with 1 integer on it. This function returns that integer.

String read_string(String|File)

The read_string() function takes a file path which is expected to contain 1 line with 1 string on it. This function returns that string.

Trailing newline characters are not included in the returned string.

Float read_float(String|File)

The read_float() function takes a file path which is expected to contain 1 line with 1 floating point number on it. This function returns that float.

Boolean read_boolean(String|File)

The read_boolean() function takes a file path which is expected to contain 1 line with 1 Boolean value (either "true" or "false") on it. This function returns that Boolean value.
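The four single-value readers share one pattern: check that there is exactly one line, then convert it. A Python sketch of that pattern follows. All function names are hypothetical, the functions take file contents rather than a path, and raising ValueError on malformed input is an assumption.

```python
def single_line(text):
    """Return the only line of text, without its trailing newline."""
    lines = text.splitlines()
    if len(lines) != 1:
        raise ValueError("expected exactly 1 line, found %d" % len(lines))
    return lines[0]


def read_int_sketch(text):
    return int(single_line(text))


def read_float_sketch(text):
    return float(single_line(text))


def read_string_sketch(text):
    return single_line(text)


def read_boolean_sketch(text):
    value = single_line(text)
    # Only the two literals named in the description are accepted here.
    if value not in ("true", "false"):
        raise ValueError("expected 'true' or 'false', found %r" % value)
    return value == "true"


print(read_int_sketch("42\n"))          # 42
print(read_string_sketch("hello\n"))    # hello
print(read_boolean_sketch("false\n"))   # False
```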

File write_lines(Array[String])

Given something that's compatible with Array[String], this writes each element to its own line in a file, with newline (\n) characters as line separators.

task example {
  Array[String] array = ["first", "second", "third"]
  command {
    ./script --file-list=${write_lines(array)}
  }
}

If this task were run, the command might look like:

./script --file-list=/local/fs/tmp/array.txt

And /local/fs/tmp/array.txt would contain:

first
second
third
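The write-then-substitute behaviour can be imitated in Python: write the elements to a temporary file and put the resulting path into the command string. This is a sketch only. The name write_lines_sketch is hypothetical, the real temporary path is chosen by the execution engine, and omitting the newline after the last element is an assumption based on \n being described as a separator.

```python
import os
import tempfile


def write_lines_sketch(items):
    """Write one element per line to a new temp file and return its path."""
    handle, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(handle, "w", newline="") as out:
        # "\n" separates elements, so none follows the last one (assumption).
        out.write("\n".join(items))
    return path


path = write_lines_sketch(["first", "second", "third"])
print("./script --file-list=" + path)
```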

File write_tsv(Array[Array[String]])

Given something that's compatible with Array[Array[String]], this writes a TSV file of the data structure.

task example {
  Array[Array[String]] array = [["one", "two", "three"], ["un", "deux", "trois"]]
  command {
    ./script --tsv=${write_tsv(array)}
  }
}

If this task were run, the command might look like:

./script --tsv=/local/fs/tmp/array.tsv

And /local/fs/tmp/array.tsv would contain:

one\ttwo\tthree
un\tdeux\ttrois

File write_map(Map[String, String])

Given something that's compatible with Map[String, String], this writes a TSV file of the data structure.

task example {
  Map[String, String] map = {"key1": "value1", "key2": "value2"}
  command {
    ./script --map=${write_map(map)}
  }
}

If this task were run, the command might look like:

./script --map=/local/fs/tmp/map.tsv

And /local/fs/tmp/map.tsv would contain:

key1\tvalue1
key2\tvalue2

File write_object(Object)

Given any Object, this will write out a 2-row, n-column TSV file with the object's attributes and values.

task test {
  Object obj
  command <<<
    /bin/do_work --obj=${write_object(obj)}
  >>>
  output {
    File results = stdout()
  }
}

If obj were to have the value:

Attribute Value
key_1 "value_1"
key_2 "value_2"
key_3 "value_3"

The command would instantiate to:

/bin/do_work --obj=/path/to/input.tsv

Where /path/to/input.tsv would contain:

key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3

File write_objects(Array[Object])

Given any Array[Object], this will write out a 2+ row, n-column TSV file with each object's attributes and values.

task test {
  Array[Object] objs
  command <<<
    /bin/do_work --obj=${write_objects(objs)}
  >>>
  output {
    File results = stdout()
  }
}

If objs were to have the value:

Index  Attribute  Value
0      key_1      "value_1"
       key_2      "value_2"
       key_3      "value_3"
1      key_1      "value_4"
       key_2      "value_5"
       key_3      "value_6"
2      key_1      "value_7"
       key_2      "value_8"
       key_3      "value_9"

The command would instantiate to:

/bin/do_work --obj=/path/to/input.tsv

Where /path/to/input.tsv would contain:

key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3
value_4\tvalue_5\tvalue_6
value_7\tvalue_8\tvalue_9
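The layout of that file (one header row, then one row per object) can be sketched in Python. The function objects_to_tsv_sketch is hypothetical, dicts stand in for WDL Objects, and the sketch assumes every object has the same attributes as the first one.

```python
def objects_to_tsv_sketch(objects):
    """Render a list of dicts as TSV text: header row, then one row each."""
    # The attribute names of the first object define the column order.
    header = list(objects[0].keys())
    lines = ["\t".join(header)]
    for obj in objects:
        lines.append("\t".join(obj[name] for name in header))
    return "\n".join(lines) + "\n"


objects = [
    {"key_1": "value_1", "key_2": "value_2", "key_3": "value_3"},
    {"key_1": "value_4", "key_2": "value_5", "key_3": "value_6"},
    {"key_1": "value_7", "key_2": "value_8", "key_3": "value_9"},
]
print(objects_to_tsv_sketch(objects), end="")
```

Passing a single-element list gives the 2-row layout described for write_object().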

File write_json(mixed)

Given a value of any type, this writes the JSON equivalent to a file. See the table in the definition of read_json() for the mapping between JSON types and WDL types.

task example {
  Map[String, String] map = {"key1": "value1", "key2": "value2"}
  command {
    ./script --map=${write_json(map)}
  }
}

If this task were run, the command might look like:

./script --map=/local/fs/tmp/map.json

And /local/fs/tmp/map.json would contain:

{
  "key1": "value1",
  "key2": "value2"
}

Float size(File, [String])

Given a File and a String (optional), returns the size of the file in Bytes or in the unit specified by the second argument.

task example {
  File input_file

  command {
    echo "this file is 22 bytes" > created_file
  }

  output {
    Float input_file_size = size(input_file)
    Float created_file_size = size("created_file") # 22.0
    Float created_file_size_in_KB = size("created_file", "K") # 0.022
  }
}

Supported units are KiloByte ("K", "KB"), MegaByte ("M", "MB"), GigaByte ("G", "GB"), TeraByte ("T", "TB") as well as their binary versions "Ki" ("KiB"), "Mi" ("MiB"), "Gi" ("GiB"), "Ti" ("TiB").
Default unit is Bytes ("B").
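The unit conversion can be sketched in Python as a lookup table of divisors. This is an illustration, not the engine's code: size_in_unit and UNIT_FACTORS are hypothetical names, the decimal factor of 1000 is inferred from the 22-byte example above (22 bytes is 0.022 K), and the factor of 1024 for the binary units is an assumption based on their usual meaning.

```python
# Bytes per unit. Decimal units use powers of 1000, binary units powers of 1024.
UNIT_FACTORS = {"B": 1}
for power, prefix in enumerate(["K", "M", "G", "T"], start=1):
    UNIT_FACTORS[prefix] = UNIT_FACTORS[prefix + "B"] = 1000 ** power
    UNIT_FACTORS[prefix + "i"] = UNIT_FACTORS[prefix + "iB"] = 1024 ** power


def size_in_unit(size_in_bytes, unit="B"):
    """Convert a byte count to the requested unit, always as a float."""
    if unit not in UNIT_FACTORS:
        raise ValueError("unsupported unit: %r" % unit)
    return size_in_bytes / UNIT_FACTORS[unit]


print(size_in_unit(22))          # 22.0
print(size_in_unit(22, "K"))     # 0.022
print(size_in_unit(2048, "Ki"))  # 2.0
```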
