GATK on FireCloud

Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)
edited January 9 in Pipelining Options

FireCloud is an open platform for secure and scalable analysis on the cloud.

More concretely, it's a web-based portal that is provided as a freely accessible service by the Broad Institute's Data Sciences Platform, where GATK itself is also developed. FireCloud provides both GUI (point-and-click) and API access to a persistent Cromwell execution server that manages submissions to the Google Pipelines API. In addition to the core pipeline execution service, the FireCloud platform also includes functionality for data management, a data library of published datasets (including TCGA data) and a method repository for managing and sharing workflows.

The platform as a whole is designed to empower analysts, tool developers and production managers to perform large-scale analysis, engage in data curation, and store or publish results without having to worry about the underlying computational infrastructure.


All the Best Practices workflows, ready to run

As part of our effort to make it easier for everyone to run GATK regardless of their personal level of (dis)comfort with the intricacies of computational infrastructure, we make all of our Best Practices workflows (plus various additional utilities) available in FireCloud. This takes the form of workspaces where the workflows are preconfigured for common use cases, along with example data that is suitable for testing and benchmarking, both at small scale and at full scale. So it should just be a matter of a few clicks to run any pipeline you like on the preloaded example datasets -- or, with a few more (simple) steps, to run them on your own data. All this without ever touching a command line. If you're the CLI-over-GUI type, you're welcome to do all of this programmatically instead, using the FireCloud APIs via Swagger or the FISS Python bindings.
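For the CLI-over-GUI type, here is a rough idea of what launching a preconfigured workflow through the REST API could look like. This is a minimal sketch, not an official recipe: the base URL, the endpoint path and the JSON field names are assumptions based on the Swagger documentation and should be checked there before use, and the workspace, configuration, sample and token values are hypothetical placeholders. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: base URL of the FireCloud REST API; confirm it on the Swagger page.
API_ROOT = "https://api.firecloud.org/api"


def build_submission_request(namespace, workspace, config_namespace,
                             config_name, entity_type, entity_name, token):
    """Build (but do not send) a POST request that would launch a workflow.

    The endpoint path and payload field names are assumptions; verify them
    against the Swagger documentation before relying on them.
    """
    url = f"{API_ROOT}/workspaces/{namespace}/{workspace}/submissions"
    payload = {
        "methodConfigurationNamespace": config_namespace,
        "methodConfigurationName": config_name,
        "entityType": entity_type,   # e.g. a sample or a sample set
        "entityName": entity_name,
        "useCallCache": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All of the values below are hypothetical placeholders.
    request = build_submission_request(
        namespace="my-billing-project",
        workspace="my-gatk-workspace",
        config_namespace="my-config-namespace",
        config_name="my-workflow-config",
        entity_type="sample",
        entity_name="example_sample",
        token="ACCESS_TOKEN_PLACEHOLDER",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Actually sending the request would mean passing it to urllib.request.urlopen with a valid Google access token. In practice the FISS Python bindings wrap this kind of call, so they are probably the more convenient route for day-to-day use.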

We hope this will enable researchers to spend less time figuring out how to run GATK Best Practices and more time doing interesting science with the results. We also believe this will boost portability and reproducibility in genomic analysis.


Free Credits Program

We understand that moving your analysis to the cloud is a big cultural and logistical shift, and there is a clear need to make it possible to try out such a new option without having to commit financially. To address that need, we've teamed up with Google Cloud to give away free credits for running GATK4 pipelines on FireCloud, our cloud-based analysis portal. Learn more about this free credits program in the FireCloud Free Credits documentation.
