Guidelines for working with Docker Images and Dockerfiles (Broadies only)
This article is in the process of being deprecated and replaced by a more cohesive guide for working with Docker images in FireCloud.
Please note that this post is intended for internal Broad users only.
Creating custom analyses to run within FireCloud requires both Workflow Description Language (WDL) and Docker images. FireCloud uses WDL to describe one or more tasks, including commands. WDL can refer to Docker images that package the applications needed to run your pipeline and methods into a discrete environment.
Rather than create a Docker image in a one-off manner and push it to Docker Hub, you should always create a Dockerfile. A Dockerfile describes the software environment of a Docker image and the commands that are run to build it. Dockerfiles also let you track changes under version control, see exactly what is in the image, and recreate the image as needed.
Below are some recommendations to help your group get started. If you respond to this topic, we can provide suggestions specific to your project.
1. Request a Broad GitHub repo through [email protected] that will contain your group's WDLs and Dockerfiles. If you have both public and private WDLs/Dockerfiles, you may want to create two repos: one for public pipelines and another for private pipelines. If your goal is to publish a public pipeline, consider the implications of starting with a private repo (e.g., tools you want to make public may end up embedded with other private tools).
2. In general, we encourage you to create as few Docker images as possible and to reuse images that serve most of your purposes. However, consider the trade-offs between building one monolithic Docker image and building several smaller ones. For example, monolithic Docker images are typically easier to manage, but they tend to have longer download times and may require artificial coupling between applications and software versions (e.g., upgrading one application forces upgrades in an unrelated application).
3. For each Docker image, create a Dockerfile and place it within a subdirectory in your GitHub repo. You may choose to organize WDLs within these directories, or in other directories, but each Dockerfile must be within its own directory.
4. Request a Broad Docker Hub repo for each Dockerfile through [email protected]. Choose a name that is easily identifiable, because all of these repos will appear under the broadinstitute Docker Hub page. For instance, use the form teamname_dockername (e.g., cga_mutsig).
This repo should be set up to automatically build the Docker image using a given Dockerfile in one of the aforementioned subdirectories.
Any changes pushed to the GitHub repo will cause an automatic build of the image. You can have different GitHub branches of your repo correspond to different tags. For instance, the master branch on GitHub can correspond to the "latest" tag in Docker Hub and the development branch can correspond to an "experimental" tag in Docker Hub.
This ensures that any Docker image you refer to in Docker Hub is in sync with your version-controlled Dockerfiles and WDLs.
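The conventions in the steps above can be sketched as a small Python helper. This is only an illustration, not an official Broad or FireCloud tool. The function names, the branch name "develop", and the exact pattern used to check a teamname_dockername name are assumptions; only the broadinstitute organization, the cga_mutsig example, and the "latest"/"experimental" tags come from the text above.

```python
import re
from pathlib import Path
from typing import List

# Docker Hub organization named in the guidelines.
DOCKER_HUB_ORG = "broadinstitute"

# Branch-to-tag mapping from the example in the text.
# "develop" is an assumed name for the development branch.
BRANCH_TO_TAG = {"master": "latest", "develop": "experimental"}

# Assumed reading of the teamname_dockername convention:
# lowercase letters and digits, joined by a single underscore.
NAME_PATTERN = re.compile(r"[a-z0-9]+_[a-z0-9]+")


def image_reference(repo_name: str, branch: str) -> str:
    """Return the Docker Hub reference that corresponds to a GitHub branch."""
    if not NAME_PATTERN.fullmatch(repo_name):
        raise ValueError(f"{repo_name!r} does not look like teamname_dockername")
    if branch not in BRANCH_TO_TAG:
        raise ValueError(f"no Docker Hub tag is mapped to branch {branch!r}")
    return f"{DOCKER_HUB_ORG}/{repo_name}:{BRANCH_TO_TAG[branch]}"


def dockerfile_dirs(repo_root) -> List[str]:
    """List the subdirectories of a repo checkout that contain a Dockerfile.

    Raises ValueError if a Dockerfile sits at the repo root, since each
    Dockerfile is expected to live in its own subdirectory.
    """
    root = Path(repo_root)
    found = []
    for dockerfile in sorted(root.rglob("Dockerfile")):
        if not dockerfile.is_file():
            continue  # skip a directory that happens to be named "Dockerfile"
        rel_dir = dockerfile.parent.relative_to(root)
        if rel_dir == Path("."):
            raise ValueError("Dockerfile at the repo root; move it into a subdirectory")
        found.append(rel_dir.as_posix())
    return found


if __name__ == "__main__":
    # Prints: broadinstitute/cga_mutsig:latest
    print(image_reference("cga_mutsig", "master"))
```

A string such as the one printed here is what a WDL would name as its Docker image. Running dockerfile_dirs on a local checkout of your repo gives a quick check that every Dockerfile sits in its own subdirectory before you request the matching Docker Hub repos.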