(howto) Test your GATK installation
Test that the GATK is correctly installed, and that the supporting tools like Java are in your path.
- Basic familiarity with the command-line environment
- Understand what a PATH variable is
- GATK downloaded and placed on path
- Invoke the GATK usage/help message
1. Invoke the GATK usage/help message
The command we're going to run is a very simple command that asks the GATK to print out a list of available command-line arguments and options. It is so simple that it will ALWAYS work if your GATK package is installed correctly.
Note that this command is also helpful when you're trying to remember something like the right spelling or short name for an argument and for whatever reason you don't have access to the web-based documentation.
Type the following command:
java -jar <path to GenomeAnalysisTK.jar> --help
Replace the <path to GenomeAnalysisTK.jar> bit with the path you have set up in your command-line environment.
You should see usage output similar to the following:
usage: java -jar GenomeAnalysisTK.jar -T <analysis_type> [-I <input_file>] [-L <intervals>]
       [-R <reference_sequence>] [-B <rodBind>] [-D <DBSNP>] [-H <hapmap>] [-hc <hapmap_chip>]
       [-o <out>] [-e <err>] [-oe <outerr>] [-A] [-M <maximum_reads>] [-sort <sort_on_the_fly>]
       [-compress <bam_compression>] [-fmq0] [-dfrac <downsample_to_fraction>]
       [-dcov <downsample_to_coverage>] [-S <validation_strictness>] [-U] [-P] [-dt] [-tblw]
       [-nt <numthreads>] [-l <logging_level>] [-log <log_to_file>] [-quiet] [-debug] [-h]

 -T,--analysis_type <analysis_type>            Type of analysis to run
 -I,--input_file <input_file>                  SAM or BAM file(s)
 -L,--intervals <intervals>                    A list of genomic intervals over which to operate. Can be explicitly specified on the command line or in a file.
 -R,--reference_sequence <reference_sequence>  Reference sequence file
 -B,--rodBind <rodBind>                        Bindings for reference-ordered data, in the form <name>,<type>,<file>
 -D,--DBSNP <DBSNP>                            DBSNP file
 -H,--hapmap <hapmap>                          Hapmap file
 -hc,--hapmap_chip <hapmap_chip>               Hapmap chip file
 -o,--out <out>                                An output file presented to the walker. Will overwrite contents if file exists.
 -e,--err <err>                                An error output file presented to the walker. Will overwrite contents if file exists.
 -oe,--outerr <outerr>                         A joint file for 'normal' and error output presented to the walker. Will overwrite contents if file exists.
 ...
If you see this message, your GATK installation is ok. You're good to go! If you don't see this message, and instead get an error message, proceed to the next section on troubleshooting.
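The test in this step can also be wrapped in a small script that checks for the obvious problems before running the command. The following is only a sketch under stated assumptions: the `GATK_JAR` variable and the `gatk_preflight` helper are names made up for this example and are not part of the GATK. Set `GATK_JAR` to wherever you saved the jar.

```shell
#!/bin/sh
# GATK_JAR is a placeholder variable for this sketch; set it to the
# location of your own copy of GenomeAnalysisTK.jar.
GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"

# Hypothetical helper: report whether the installation test can be attempted.
# Prints a short status word and returns non-zero if something is missing.
gatk_preflight() {
    jar=$1
    if [ ! -f "$jar" ]; then
        echo "missing-jar"
        return 1
    fi
    if ! command -v java >/dev/null 2>&1; then
        echo "missing-java"
        return 1
    fi
    echo "ready"
}

if status=$(gatk_preflight "$GATK_JAR"); then
    # Same command as in the step above; it should print the usage message.
    java -jar "$GATK_JAR" --help
else
    echo "Cannot run the test yet: $status ($GATK_JAR)"
fi
```

A status of `missing-jar` means the path is wrong, and `missing-java` means that Java cannot be found on your PATH. Both cases are covered by the troubleshooting advice in this document.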
Let's try to figure out what's not working.
First, make sure that your Java version is at least 1.7 by typing the following command:
java -version
You should see something similar to the following text:
java version "1.7.0_12"
Java(TM) SE Runtime Environment (build 1.7.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)
If the version is less than 1.7, install the newest version of Java onto the system. If you instead see something like:
java: Command not found
make sure that java is installed on your machine, and that your PATH variable contains the path to the java executables.
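The Java checks described in this section can be scripted as well. The sketch below is an illustration, not an official GATK tool: the helper names `parse_java_version` and `java_version_ok` are invented for this example. It uses `command -v` to test whether java is on the PATH, then reads the version string printed by `java -version`. It assumes that a version string either starts with `1.` (as in `1.7.0_12`) or with a plain major number (as in `11.0.2`).

```shell
#!/bin/sh
# Hypothetical helper: pull the quoted version string out of the output of
# `java -version`, e.g. 'java version "1.7.0_12"' gives 1.7.0_12.
parse_java_version() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if the version is at least 1.7.
# Handles the old "1.x" numbering and the newer "9", "11.0.2" numbering.
java_version_ok() {
    major=${1%%.*}
    if [ "$major" = "1" ]; then
        rest=${1#1.}
        minor=${rest%%[!0-9]*}      # keep only the leading digits
        [ "${minor:-0}" -ge 7 ]
    else
        major=${major%%[!0-9]*}     # keep only the leading digits
        [ "${major:-0}" -ge 7 ]
    fi
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it into the capture
    version=$(parse_java_version "$(java -version 2>&1)")
    if [ -z "$version" ]; then
        echo "java found, but its version could not be parsed"
    elif java_version_ok "$version"; then
        echo "Java $version found: OK"
    else
        echo "Java $version found: older than 1.7, please upgrade"
    fi
else
    echo "java not found: check that Java is installed and on your PATH"
fi
```

If the script reports that java was not found, print your PATH with `echo "$PATH"` and check that it includes the directory containing the java executable.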