(howto) Install and run Oncotator for the first time

Geraldine_VdAuwera, Administrator, GATK Developer
edited November 3 in Oncotator Documentation

1. Download the Oncotator package and the default datasources package from the Downloads page

Please note: Broadies who wish to run the installed Oncotator on the Broad cluster should follow the Broad-specific instructions instead of this page.

Oncotator Download

Default Datasource Corpus Download (Sept 17, 2014), 12GB

Both packages are simple tar files that can be expanded using the following commands:

$ tar zxvf oncotator-
$ tar zxvf oncotator_v1_ds_June112014.tar.gz

This will produce two directories called oncotator- and oncotator_v1_ds_June112014, respectively. Move to the oncotator- directory by doing:

$ cd oncotator-
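Because the datasource archive is large (12GB), an interrupted download is a plausible failure mode. The sketch below is one way to guard against it: it lists the archive's contents first and extracts only if the listing succeeds. The function name extract_checked is a hypothetical helper introduced here for illustration; it is not part of the Oncotator distribution.

```shell
# Hypothetical helper (not part of Oncotator): extract a .tar.gz archive
# only if it can be read from start to end.
extract_checked() {
    archive="$1"
    # 'tar tzf' lists the contents without writing any files; a truncated
    # or corrupt download makes it exit with a non-zero status.
    if tar tzf "$archive" > /dev/null 2>&1; then
        tar zxf "$archive"
    else
        echo "Archive looks incomplete or corrupt: $archive" >&2
        return 1
    fi
}

# Example use with the datasource archive named above:
#   extract_checked oncotator_v1_ds_June112014.tar.gz
```

Listing first costs an extra pass over the archive, but it avoids leaving a half-extracted datasource directory behind.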

2. Set up your Python environment and install dependencies

See the article on platform requirements for a full list of dependencies. One tutorial shows you how to use the virtual environment script we provide to set everything up automagically, and a separate tutorial shows you how to install the dependencies manually if needed (or preferred).
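Whichever route you take, it can help to confirm which Python environment is active before installing, since the install step uses whatever python is first on your PATH. The following is a minimal sketch, assuming a virtualenv-style environment whose activate script exports the VIRTUAL_ENV variable; describe_python_env is a hypothetical name, not something shipped with Oncotator.

```shell
# Hypothetical helper (not part of Oncotator): report whether a virtualenv
# is active and which 'python' would be used by the install step.
describe_python_env() {
    # virtualenv's 'activate' script exports VIRTUAL_ENV.
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "virtualenv active: $VIRTUAL_ENV"
    else
        echo "no virtualenv active"
    fi
    # Show the interpreter that is first on PATH, if there is one.
    command -v python || echo "python not found on PATH"
}

# Example use:
#   describe_python_env
```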

3. Install Oncotator

Once you have installed all the necessary dependencies, run the standard Python install script that is included with the Oncotator distribution:

$ python setup.py install

Two binaries (executable program files) named oncotator and initializeDatasource respectively will be installed into your Python's bin/ directory. You can test that they were installed by running e.g.:

$ oncotator -h 

to invoke the help / usage instructions. You can also do a test run of Oncotator on the Patient0.snp.maf.txt file provided with the Oncotator distribution (in the test/testdata/maflite/ directory) with the following command:

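$ oncotator -v --db-dir=$HOME/sandbox/oncotator/oncotator_v1_ds_June112014 test/testdata/maflite/Patient0.snp.maf.txt exampleOutput.tsv hg19

where you provide the location of the datasources using the --db-dir argument. The path shown is only an example; point it at wherever you expanded the datasources package. The home directory is written as $HOME rather than ~ because some shells, including bash, do not expand ~ when it follows the = in an option like this. You may need to adapt the file path for the Patient0.snp.maf.txt file depending on where you run this command from.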

This will produce a new file named exampleOutput.tsv with the appropriate annotations, built against the hg19 reference.
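If the test run fails, the cause is often a missing executable or a mistyped path. The sketch below shows one way to check the prerequisites before the run and to take a quick look at the result afterwards. Both function names (preflight and summarize_tsv) are hypothetical helpers introduced here. The handling of leading # comment lines is an assumption about the output layout, not something this page specifies.

```shell
# Hypothetical helpers (not part of Oncotator).

# Check that the installed executables, the datasource directory and the
# input file are all in place before starting a run.
preflight() {
    db_dir="$1"   # expanded datasource directory
    input="$2"    # input file to annotate
    rc=0
    # Both executables should be on PATH after 'python setup.py install'.
    for tool in oncotator initializeDatasource; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "missing executable: $tool" >&2
            rc=1
        fi
    done
    [ -d "$db_dir" ] || { echo "datasource directory not found: $db_dir" >&2; rc=1; }
    [ -f "$input" ] || { echo "input file not found: $input" >&2; rc=1; }
    return "$rc"
}

# Print the total line count and the number of tab-separated fields on the
# first line that does not start with '#' (assumed to be the header row).
summarize_tsv() {
    awk -F '\t' '
        !/^#/ && !seen { fields = NF; seen = 1 }
        END { printf "lines=%d header_fields=%d\n", NR, fields }
    ' "$1"
}

# Example use around the test run:
#   preflight "$HOME/sandbox/oncotator/oncotator_v1_ds_June112014" \
#       test/testdata/maflite/Patient0.snp.maf.txt
#   summarize_tsv exampleOutput.tsv
```

preflight reports every problem it finds rather than stopping at the first one, so a single pass shows everything that needs fixing.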


Geraldine Van der Auwera, PhD

