TESting, TESting, 1 2 3...
TL;DR: In Cromwell 25, we added a backend named TES to Cromwell’s portfolio, promoting the GA4GH vision of interoperability between genomic analysis tools. We exercise the TES backend using Funnel, a neat piece of software coming out of Kyle Ellrott’s group at OHSU that allows us to dispatch jobs to a variety of platforms using the same API.
The Global Alliance For Genomics & Health (aka GA4GH) is an international coalition formed to enable the sharing of genomic and clinical data in order to help unlock potential advancements in medicine and science. For the most part, the GA4GH provides APIs that are implemented by frameworks and tools throughout our field. With these standardized APIs, analysts and software developers can take advantage of a much broader and richer ecosystem of tools than they could before. But more on this later. The take-home message is that the scientific community as a whole can now spend more time working towards bettering humanity instead of just gluing tools together.
Within the GA4GH is the Containers & Workflows (CWF) working group, which focuses on providing APIs that define generic schemes for identifying tools, submitting workflows, and submitting jobs to compute platforms (disclosure: I happen to co-chair this group). In my day job as one of the developers of Cromwell, I’m interested in how we can implement these APIs to promote interoperability of different computing platforms in the bioinformatics space.
One problem the CWF group is tackling is how to streamline transferring a job or workflow from one platform to another. Currently there are lots of ways to run jobs, ranging from classic High Performance Computing (HPC) clusters such as Sun Grid Engine (SGE), Platform Load Sharing Facility (LSF) or Torque Portable Batch System (PBS) to newer cloud-based platforms such as Google’s Genomics Pipelines API (aka JES) and Amazon’s Batch API. But what if I were able to submit a job to Google’s Pipelines API and then again to Amazon’s Batch using the same interface? For us workflow engine developers this is fantastic, as it reduces the code we have to write. For our users it is also great, as it means workflows and jobs are much more portable across platforms -- and it's a lot less likely that users will make mistakes at the job submission level since they'll always be doing it the same way!
Enter TES, for Task Execution Schema. TES is the CWF group’s single API for the sorts of compute platforms I mentioned earlier: SGE, Pipelines API, Batch, etc. If these services were to implement the TES API, it would mean that a workflow engine like Cromwell could talk to everything using a single code path. As I mentioned above, this would be a huge win for developers, since instead of developing code to talk to all of these platforms, we could do it once and spend more time on user-facing functionality! That being said, it’ll take some time for that interoperability dream to become a reality... But in the meantime there’s a cool piece of software from Kyle Ellrott’s group at Oregon Health & Science University (OHSU) called Funnel that helps bridge this gap. Funnel functions as a TES translator; it receives work requests via the TES API, then dispatches these requests to a variety of execution platforms.
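To make the "single code path" idea a bit more concrete, here is a rough Python sketch of what describing and submitting one job through a TES-style API could look like. Treat it as an illustration under assumptions rather than a reference: the field names (`executors`, `image`, `command`, `resources`, `cpu_cores`, `ram_gb`) and the `/v1/tasks` path follow my reading of the published TES v1 schema and may differ from the version a given server speaks, the helper names `build_tes_task` and `submit_task` are made up for this example, and the endpoint URL in the final comment is hypothetical.

```python
import json
import urllib.request


def build_tes_task(name, image, command, cpu_cores=1, ram_gb=1.0):
    """Build a TES-style task description as a plain dict.

    Field names are assumed from the TES v1 schema; check them against
    the server you are actually talking to.
    """
    return {
        "name": name,
        # Each executor is a container image plus the command to run in it.
        "executors": [{"image": image, "command": list(command)}],
        "resources": {"cpu_cores": cpu_cores, "ram_gb": ram_gb},
    }


def submit_task(base_url, task):
    """POST the task to <base_url>/v1/tasks and return the parsed JSON reply.

    The path is an assumption; the code is the same whichever platform
    sits behind the endpoint.
    """
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/tasks",
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Describe the job once...
task = build_tes_task("hello-tes", "ubuntu:16.04", ["echo", "hello"])
print(json.dumps(task, indent=2))

# ...then send it to whichever TES endpoint is available (hypothetical URL):
# reply = submit_task("http://localhost:8000", task)
```

The point of the sketch is that nothing in the task description mentions SGE, Pipelines API, or Batch; only the URL passed to `submit_task` would change from one platform to the next.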
Adam Struck and Alex Buchanan from Kyle’s group at OHSU graciously took the time to develop a Cromwell backend for TES. As of Cromwell 25 we now provide this backend for our users. In fact we're also using Funnel in our continuous integration suite to test the new backend. This proved very easy to implement and really demonstrates how smooth using TES and Funnel can be. Even though test suites are not typically what moves crowds, I think it's an exciting development because it brings us closer to our interoperability vision -- we can take our favorite WDL, submit it to Cromwell, and then seamlessly run the jobs on any platform that implements the TES API.
Want to know more or chat about TES and Funnel? Feel free to comment in this post's discussion thread, or start your own discussion on the WDL forum. You're also welcome to join in the CWF group discussions by sending an email to firstname.lastname@example.org.