What do I need to set up to write and execute WDL workflows?

KateN (Cambridge, MA; Member, Broadie, Moderator)
edited September 2017 in Frequently Asked Questions

Below is a list of what you need in order to write and run workflows in WDL using the Cromwell execution engine (the engine we use), with installation instructions where necessary. Because most of the tutorials and example WDL scripts on this website use GATK, we also link to the GATK installation instructions; you can skip those if you don’t plan to run the GATK WDLs.

There are additional resources that may help you work with WDL; see the Toolkit page for a full list.


WDL

WDL, pronounced “widdle”, stands for Workflow Description Language.

WDLTool

WDLTool is a utility package that provides accessory functionality for writing and running WDL scripts, including syntax validation and input template generation. You can download the latest release of the pre-compiled executable here.
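
As a rough illustration of the two functions mentioned above, the sketch below builds the command lines for WDLTool's `validate` and `inputs` actions. The helper name `wdltool_command` and the file names `wdltool.jar` and `myWorkflow.wdl` are placeholders chosen for this example, not names used by the original post.

```python
def wdltool_command(jar, action, wdl_file):
    """Build the argument list for one WDLTool invocation."""
    # Only the two actions discussed here are accepted by this sketch.
    if action not in ("validate", "inputs"):
        raise ValueError(f"unsupported WDLTool action: {action}")
    return ["java", "-jar", jar, action, wdl_file]


# Hypothetical file names, for illustration only.
validate_cmd = wdltool_command("wdltool.jar", "validate", "myWorkflow.wdl")
inputs_cmd = wdltool_command("wdltool.jar", "inputs", "myWorkflow.wdl")

print(" ".join(validate_cmd))
print(" ".join(inputs_cmd))
```

Either list can be passed to `subprocess.run`. The `inputs` action prints a JSON template to standard output, so you would typically capture that output and save it to a file before filling in real values.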

Text editor

You will need a text editor of some sort to write your WDL scripts. It is important to note that there is a difference between a word processor (like Microsoft Word) and a text editor (like Notepad); please use the latter option. If you have no preferred text editor, we would recommend installing SublimeText, as we find that it displays code visually better than other text editors we've tried. As an added convenience when developing WDL scripted workflows, syntax highlighting has been developed for SublimeText, TextMate, vim, and IntelliJ. You can follow the links for installation instructions for your editor of choice.


Cromwell

Cromwell is an execution engine capable of running scripts written in WDL, describing data processing and analysis workflows involving command line tools (such as pipelines implementing the GATK Best Practices for Variant Discovery). If you are familiar with GATK, you may have heard of or even used an execution engine called Queue that was designed to run GATK workflows written as Qscripts. Together, Cromwell and WDL constitute a user-friendly alternative to Queue and Qscripts.

The installation of Cromwell itself is quite simple. The latest release can be downloaded here in the form of a pre-compiled jar. For ease of use, you can also add an environment variable to your terminal profile pointing at the Cromwell jar file.
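
To show how such an environment variable might be used, here is a minimal sketch that reads the jar location from the environment and assembles a Cromwell `run` command. The variable name `CROMWELL_JAR`, the helper `cromwell_run_command`, and the file names are assumptions made for this example. The `--inputs` flag is how recent Cromwell releases accept the inputs file; older releases took it as a positional argument, so check the usage message of the version you downloaded.

```python
import os


def cromwell_run_command(wdl_file, inputs_json=None, env=None):
    """Build a 'cromwell run' command from a jar path stored in the environment."""
    env = os.environ if env is None else env
    # CROMWELL_JAR is a hypothetical variable name; use whatever you set in your profile.
    jar = env.get("CROMWELL_JAR")
    if not jar:
        raise RuntimeError("CROMWELL_JAR is not set")
    cmd = ["java", "-jar", jar, "run", wdl_file]
    if inputs_json:
        cmd += ["--inputs", inputs_json]
    return cmd


# A fake environment is passed in so the example runs without any setup.
example = cromwell_run_command(
    "hello.wdl",
    "hello_inputs.json",
    env={"CROMWELL_JAR": "/path/to/cromwell.jar"},
)
print(" ".join(example))
```

Keeping the jar path in one variable means that upgrading Cromwell only requires changing your profile, not every script that launches a workflow.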

Java 8

Cromwell requires Java version 8, which you can find here.
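
If you are unsure which Java version is installed, one option is to inspect the output of `java -version`. The sketch below parses that text and reports the major version; the helper name `java_major_version` and the sample strings are illustrative assumptions, and the parsing only covers the common `version "..."` banner format.

```python
import re


def java_major_version(version_output):
    """Return the major Java version found in 'java -version' output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_144).
    if first == 1 and second is not None:
        return int(second)
    return first


# Sample banner text, for illustration only.
sample = 'java version "1.8.0_144"'
print(java_major_version(sample))
```

Note that `java -version` writes its banner to standard error, so when calling it through `subprocess.run` you would read the `stderr` field rather than `stdout`.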

Docker (optional)

Cromwell is capable of utilizing Docker images to assist in specifying environments when running workflows. If you’ve never worked with Docker before, this page may answer many of your questions. Docker is optional if you are simply working on your local machine (i.e. your computer rather than a remote server). If you are using a remote server, more often than not Docker is required. In our tutorials, we always tell you which optional installations will be required.

To use Docker, please install it according to your operating system, following the instructions given on the installation page.
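
After installing Docker, a quick way to confirm that the `docker` command is reachable from your terminal session is to look it up on the PATH. This is only a sketch: the helper name `docker_on_path` is made up for this example, and finding the executable does not prove that the Docker service itself is running.

```python
import shutil


def docker_on_path():
    """Report whether a 'docker' executable can be found on the PATH."""
    return shutil.which("docker") is not None


if docker_on_path():
    print("docker executable found")
else:
    print("docker executable not found; workflows that need Docker will not run here")
```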


Programs to be pipelined

Our tutorials feature tools from the GATK (Genome Analysis Toolkit) and Picard to demonstrate how to write WDL scripts that perform real data processing and analysis tasks; in order to follow them you’ll need to install GATK, Picard, and their dependencies. To that end, you can find a complete walkthrough for installing these on the GATK website. The linked document provides instructions for installing several additional software packages that are useful for GATK-specific tutorials, but the only one that you really need to install for running WDL tutorials, besides GATK and Picard, is Java 1.7*. Installing the R library gsalib (available on CRAN) is optional but highly recommended. Each tutorial on this website tells you which optional installations it requires. Note that GATK and Cromwell currently require different versions of Java, so see this article for help dealing with that temporary problem.

*Note: As of version 3.6, GATK runs with Java version 1.8. You will not need Java 1.7 if you use GATK 3.6.


Comments

  • EADG (Kiel; Member)

    Heho, the link for "add an environment variable to your terminal profile" is broken.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    Thanks for reporting, I replaced the link.
  • About the editor recommendation: these are only first impressions, since I installed WDL today. SublimeText seems better suited to HTML. Because WDL has a nested syntax based on curly braces, it might work well with the right extension, but I have not found one. IntelliJ IDEA, however, has given me a much better experience: as soon as I opened a sample .wdl file containing the text of an example, it offered to install a plugin for .wdl highlighting and correct indentation. Promising.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi @egarmo, we mainly recommend SublimeText because it is very approachable for people who are relatively new to this sort of thing, whereas IntelliJ can be a bit overwhelming - though the auto-recognition of the syntax is very convenient (I think one of our devs registered that plugin a while back). Note that there are also syntax highlighters available for several editors including SublimeText, which you can find linked on the Toolkit page.
