Blue skies ahead for Cromwell on Azure

Ruchi (Member, Broadie, Moderator, Dev admin)

Until now, you could run workflows with Cromwell natively on Google Cloud Platform (GCP), Amazon Web Services (AWS) and Alibaba Cloud, in addition to running jobs locally and on most HPC clusters, of course. We can now add the big name previously missing from that list: Microsoft Azure. I’m excited to report that we recently received a contribution from a Microsoft group that supports running WDL workflows natively on Azure, as announced on the Microsoft team's blog. This means Cromwell now has at least beta-level support for running workflows natively on all three major cloud providers in North America and the largest in Asia.

This development is a big step forward for the portability and interoperability of workflow execution across compute environments. For example, this is great news for researchers in the GATK community, who will now be able to run the Best Practices workflows more easily regardless of which cloud platform(s) they are operating in. More generally, this means you will be able to take advantage of Cromwell's capabilities out of the box on any data stored in Azure.

What's especially cool about this particular Cromwell integration is that it's based on the GA4GH Task Execution Service (TES) API, which aims to standardize how task execution is defined across platforms. This matters because a shared standard helps ensure that research tools are portable and can run on different platforms (such as the public clouds mentioned above and various HPC clusters).
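
To give a rough feel for what "defining task execution" in a standard way looks like, here is a minimal Python sketch of a TES-style task message. The field names (`name`, `executors`, `image`, `command`, `resources`, `cpu_cores`, `ram_gb`) follow my reading of the TES v1 task schema and should be checked against the specification. The helper `build_tes_task`, the task name and the container image are hypothetical and chosen only for illustration. Nothing here is taken from the Cromwell on Azure code, and the sketch does not contact any server.

```python
import json


def build_tes_task(name, image, command, cpu_cores=1, ram_gb=2.0):
    """Build a TES-style task description as a plain dict.

    Field names are assumed from the TES v1 schema; this helper is
    hypothetical and not part of Cromwell or any TES client library.
    """
    return {
        "name": name,
        # Each executor is one container invocation: an image plus a command.
        "executors": [
            {"image": image, "command": command},
        ],
        # Resource requests that the backend maps onto its own machine types.
        "resources": {"cpu_cores": cpu_cores, "ram_gb": ram_gb},
    }


# Illustrative values only: any container image and command would do.
task = build_tes_task("hello-tes", "ubuntu:20.04", ["echo", "hello from TES"])

# A TES client would send this JSON document to the server's task endpoint.
payload = json.dumps(task, indent=2)
print(payload)
```

Because the task is just a container image, a command and a resource request, the same description can in principle be submitted to any platform that implements the API, which is where the portability comes from.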

To take it for a spin, check out the Cromwell on Azure GitHub repository for details and instructions on how to get started. Please keep in mind that this is new functionality; it is still in a beta phase, and testing has focused mainly on running an optimized version of the GATK Best Practices against public data on Azure. Next, the project will seek to add support for popular features such as call caching and preemptible VMs (low-cost machines), and to add more scientific pipelines optimized for Azure.