Docker - container - image - registry
A container is something quite similar to a virtual machine, which can be used to contain and execute all the software required to run a particular program or set of programs. The container includes an operating system (typically some flavor of Linux) as base, plus any software installed on top of the OS that might be needed. This container can therefore be run as a self-contained virtual environment, which makes it a lot easier to reproduce the same analysis on any infrastructure that supports running the container, from your laptop to a cloud platform, without having to go through the pain of identifying and installing all the software dependencies involved. You can even have multiple containers running on the same machine, so you can easily switch between different environments if you need to run programs that have incompatible system requirements.
Docker is one of several brands of container systems, produced by the Docker company. There are other brands such as Singularity, but Docker is the most popular and widely used. Sometimes we say "a docker" instead of "a container"; it's like when xerox became a common noun for copy machines due to the dominance of the Xerox company. However, docker with a lowercase "d" is also the command-line program that you install on your machine to run Docker containers. We'll get back to that in a little while.
A container can be distributed through one or more registries such as Docker Hub (where Broad teams publish most of their Docker containers). There are others, like Dockstore, which is specifically geared toward bioinformatics, and GCR, which is Google's general-purpose container registry for use on the Google Cloud Platform.
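Since several registries exist, the name you give the docker program can also say which registry to look in. The sketch below is a simplified illustration, not Docker's own parser: it splits a name into registry, repository and tag, using the usual defaults (Docker Hub, written docker.io, when no registry host is given, and the tag "latest" when no tag is given). The function name split_image_reference and the example names are hypothetical, and digests (name@sha256:...) are deliberately ignored.

```python
def split_image_reference(reference):
    """Split an image reference into (registry, repository, tag).

    Simplified sketch: digests are not handled, and Docker Hub's
    implicit "library/" prefix for official images is not added.
    """
    name, tag = reference, "latest"
    # A colon that comes after the last slash separates the tag from the name.
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > last_slash:
        name, tag = name[:colon], name[colon + 1:]
    first, _, rest = name.partition("/")
    # The first component counts as a registry host only if it looks like one.
    if rest and ("." in first or ":" in first or first == "localhost"):
        return first, rest, tag
    # Otherwise the image is looked up on Docker Hub.
    return "docker.io", name, tag


if __name__ == "__main__":
    for ref in ["broadinstitute/gatk", "gcr.io/my-project/my-image:1.0"]:
        print(ref, "->", split_image_reference(ref))
```

In other words, a name with no host part is fetched from Docker Hub, while a name that starts with a host such as gcr.io is fetched from that registry instead.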
In the registry, the container is packaged as an image. Note that this has nothing to do with pictures; here the word "image" is used in its software-specific sense, where it refers to a special type of file. You know how sometimes when you need to install new software on your computer, the download file is called a "disk image"? That's because the file you download is in a format that your operating system is going to treat as if it were a physical disk on your machine. This is basically the same thing.
So one way to use this, let's say on your laptop, goes like this: you tell the docker program to download a container image (that is, a file) from a registry, e.g. Docker Hub, then you tell it to initialize the container, which is conceptually equivalent to booting up a virtual machine. Once the container is running, you can run any software inside it that is installed on its system. For a concrete example, see this Tutorial.
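The two steps described above (download the image, then start a container and run a program in it) correspond to the docker pull and docker run subcommands. The sketch below only builds and prints those command lines; it calls docker only if you pass execute=True and the program is installed. The image name broadinstitute/gatk:latest and the gatk --help command run inside the container are assumptions made for illustration, so substitute whatever image and program you actually need.

```python
import shutil
import subprocess

# Assumed image name, for illustration only.
IMAGE = "broadinstitute/gatk:latest"


def pull_command(image):
    """Command that downloads an image (a file) from a registry."""
    return ["docker", "pull", image]


def run_command(image, program):
    """Command that starts a container from the image and runs a program in it.

    --rm removes the container again once the program exits.
    """
    return ["docker", "run", "--rm", image] + list(program)


def main(execute=False):
    commands = [
        pull_command(IMAGE),
        # Assumed example program; any software installed in the image would do.
        run_command(IMAGE, ["gatk", "--help"]),
    ]
    for cmd in commands:
        print(" ".join(cmd))
        # Only call docker when asked to and when it is actually installed.
        if execute and shutil.which("docker"):
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    main(execute=False)
```

Run as is, this prints the two command lines without touching docker, which makes it easy to check them before running anything for real.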
The other way you might use a container is if you're doing work on a cloud-based platform, where almost everything is run through containers. But that's a story for another time.