Oncotator is not annotating any SNP

Haiying7, Heidelberg, Germany (Member)

I am trying to annotate a VCF file produced by MuTect. The MuTect output VCF was filtered to keep only the calls tagged "PASS".
I used the following command line to run Oncotator:
oncotator -v --input_format=VCF --output_format=TCGAMAF $MuTect_dir$sample.vcf $Oncotator_dir$sample.annotated.maf hg19

The log file is:
Verbose mode on
Path:
['/nfs/home/m/kong/.local/bin', '/home/kong/.local/lib/python2.7/site-packages/distribute-0.6.15-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/Oncotator-v1.8.0.0-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/enum34-1.0.4-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/more_itertools-2.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/natsort-4.0.4-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/python_memcached-1.57-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/SQLAlchemy-1.0.9-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/shove-0.6.6-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/Cython-0.23.4-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/biopython-1.66-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/pandas-0.17.0-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/pysam-0.7.5-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/PyVCF-0.6.7-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/bcbio_gff-0.6.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/six-1.10.0-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/stuf-0.9.16-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/futures-3.0.3-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/pytz-2015.7-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/python_dateutil-2.4.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/parse-1.6.6-py2.7.egg', '/home/kong/Haiying/lib/Python/lib/python27.zip', '/home/kong/Haiying/lib/Python/lib/python2.7', '/home/kong/Haiying/lib/Python/lib/python2.7/plat-linux2', '/home/kong/Haiying/lib/Python/lib/python2.7/lib-tk', '/home/kong/Haiying/lib/Python/lib/python2.7/lib-old', '/home/kong/Haiying/lib/Python/lib/python2.7/lib-dynload', 
'/home/kong/.local/lib/python2.7/site-packages', '/home/kong/Haiying/lib/Python/lib/python2.7/site-packages']

2015-11-23 17:15:41,725 INFO [oncotator.Oncotator:239] Oncotator v1.8.0.0
2015-11-23 17:15:41,725 INFO [oncotator.Oncotator:240] Args: Namespace(allow_overwriting=False, cache_url=None, canonical_tx_file=None, collapse_filter_cols=False, collapse_number_annotations=False, dbDir='/xchip/cga/reference/annotation/db/oncotator_v1_ds_gencode_current/', default_cli=[], default_config=None, genome_build='hg19', infer_genotypes='false', infer_onps=False, input_file='/home/kong/Haiying/Projects/Melanoma/Primary/Lock/MuTect/BQSR_Trimmed/2556_ATCACG_L007_85111_TTAGGC_L001.vcf', input_format='VCF', log_name='oncotator.log', noMulticore=False, output_file='/home/kong/Haiying/Projects/Melanoma/Primary/Lock/Oncotator/BQSR_Trimmed/2556_ATCACG_L007_85111_TTAGGC_L001.annotated.maf', output_format='TCGAMAF', override_cli=[], override_config=None, prepend=False, read_only_cache=False, reannotate_tcga_maf_cols=False, skip_no_alt=False, tx_mode='CANONICAL', verbose=6)
2015-11-23 17:15:41,726 INFO [oncotator.Oncotator:241] Log file: /net/dkfzfsg/gpfs/m/daten/C050-500kdata/Rajiv_group/Haiying/Projects/Melanoma/Primary/Lock/Oncotator/BQSR_Trimmed/oncotator.log
2015-11-23 17:15:41,726 WARNING [oncotator.Oncotator:247] ngslib module not installed. Will be unable to annotate with BigWig datasources.
2015-11-23 17:15:41,727 WARNING [oncotator.DatasourceFactory:260] %s does not exist, so there will be no datasources.
2015-11-23 17:15:41,727 INFO [oncotator.DatasourceFactory:325] No datasources to initialize
2015-11-23 17:15:41,752 INFO [oncotator.output.TcgaMafOutputRenderer:93] Building alternative keys dictionary...
2015-11-23 17:15:41,755 INFO [oncotator.cache.DummyCache:57] No cache specified. All cache attempts will be listed as cache misses.
2015-11-23 17:15:41,755 INFO [oncotator.Annotator:426] Annotating with 0 datasources: Oncotator v1.8.0.0 |
2015-11-23 17:15:41,767 INFO [oncotator.output.TcgaMafOutputRenderer:256] TCGA MAF output file: /home/kong/Haiying/Projects/Melanoma/Primary/Lock/Oncotator/BQSR_Trimmed/2556_ATCACG_L007_85111_TTAGGC_L001.annotated.maf
2015-11-23 17:15:41,767 INFO [oncotator.output.TcgaMafOutputRenderer:257] Render starting...
2015-11-23 17:15:41,768 WARNING [oncotator.Annotator:500] THERE ARE NO DATASOURCES REGISTERED
2015-11-23 17:15:42,300 INFO [oncotator.output.TcgaMafOutputRenderer:342] Rendered all 548 mutations.

I am trying to install ngslib to see whether that helps.
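
For reference, the PASS filtering mentioned at the top of this post can be done with standard tools. The following is a minimal sketch, not the poster's actual method: it assumes a tab-delimited VCF in which FILTER is the seventh column, as the VCF format defines. The function name `filter_pass` and the four-line sample VCF are made up for illustration.

```shell
#!/bin/sh
# filter_pass: read a VCF on stdin and write the header lines plus
# every record whose FILTER column (column 7) is exactly "PASS".
filter_pass() {
    awk -F '\t' '/^#/ || $7 == "PASS"'
}

# Made-up VCF: two header lines, one PASS record, one REJECT record.
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tG\t.\tPASS\t.\n1\t200\t.\tC\tT\t.\tREJECT\t.\n' | filter_pass
```

On a real file this would be used as `filter_pass < sample.vcf > sample.pass.vcf`.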

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Hi there, it looks like you're not specifying a datasource, so the program has nothing to annotate from. Have a look at the documentation here: http://gatkforums.broadinstitute.org/categories/oncotator

  • Haiying7, Heidelberg, Germany (Member)

    Dear Geraldine,

    I did specify the input file. $MuTect_dir$sample.vcf is the input file name.

  • Haiying7, Heidelberg, Germany (Member)

    Is it possible that this happened because ngslib is not installed?

    Thank you so much.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    The datasources are not the same thing as the input file. The input file is the file containing the variants you want to annotate. The datasources are database files that contain the information from which to annotate your variants. The program itself does not contain that information; you have to provide it separately. Please read the documentation I indicated and look for "datasources".
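
To make the distinction concrete: the Args line of the first log shows `dbDir='/xchip/cga/reference/annotation/db/oncotator_v1_ds_gencode_current/'`, which is the default used when no `--db-dir` is given, and the next warning reports that the directory does not exist. A minimal sketch of a pre-flight check follows. The function name `check_db_dir`, the environment variable `ONCOTATOR_DB_DIR`, and the fallback path `./oncotator_db` are hypothetical; only the `oncotator` flags are taken from this thread.

```shell
#!/bin/sh
# check_db_dir DIR: succeed only if DIR exists and is a directory.
check_db_dir() {
    if [ -d "$1" ]; then
        echo "datasource directory found: $1"
        return 0
    fi
    echo "datasource directory missing: $1" >&2
    return 1
}

# Default dbDir, copied from the Args line of the first log.
default_db_dir="/xchip/cga/reference/annotation/db/oncotator_v1_ds_gencode_current/"

# Hypothetical location of a datasource directory you set up yourself.
my_db_dir="${ONCOTATOR_DB_DIR:-./oncotator_db}"

if check_db_dir "$default_db_dir" 2>/dev/null; then
    echo "the default datasource directory is usable"
elif check_db_dir "$my_db_dir"; then
    # Print the command rather than run it, so the sketch works without Oncotator.
    echo "run: oncotator -v --db-dir $my_db_dir --input_format=VCF --output_format=TCGAMAF in.vcf out.maf hg19"
else
    echo "no datasource directory available; Oncotator would annotate with 0 datasources"
fi
```

If neither directory exists, the run would look like the log above: 0 datasources, and every mutation rendered without annotation.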

  • Haiying7, Heidelberg, Germany (Member)

    Dear Geraldine,

    I ran the command initializeDatasource on a COSMIC database, and reran oncotator with the command line:
    oncotator -v --input_format=VCF --output_format=TCGAMAF $MuTect_dir$sample.vcf $sample.annotated.maf hg19 --db-dir /home/kong/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19
    Verbose mode on
    Path:
    ['/nfs/home/m/kong/.local/bin', '/home/kong/.local/lib/python2.7/site-packages/distribute-0.6.15-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/Oncotator-v1.8.0.0-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/enum34-1.0.4-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/more_itertools-2.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/natsort-4.0.4-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/python_memcached-1.57-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/SQLAlchemy-1.0.9-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/shove-0.6.6-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/Cython-0.23.4-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/biopython-1.66-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/pandas-0.17.0-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/PyVCF-0.6.7-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/bcbio_gff-0.6.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/six-1.10.0-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/stuf-0.9.16-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/futures-3.0.3-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/pytz-2015.7-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/python_dateutil-2.4.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/parse-1.6.6-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/pip-7.1.2-py2.7.egg', '/home/kong/.local/lib/python2.7/site-packages/ngslib-1.1.18-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/gevent-1.1rc1-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/greenlet-0.4.9-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages/argparse-1.4.0-py2.7.egg', 
'/home/kong/Haiying/lib/Python/lib/python2.7', '/net/dkfzfsg/gpfs/m/daten/C050-500kdata/Rajiv_group/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19', '/home/kong/Haiying/lib/Python/lib/python27.zip', '/home/kong/Haiying/lib/Python/lib/python2.7/plat-linux2', '/home/kong/Haiying/lib/Python/lib/python2.7/lib-tk', '/home/kong/Haiying/lib/Python/lib/python2.7/lib-old', '/home/kong/Haiying/lib/Python/lib/python2.7/lib-dynload', '/nfs/home/m/kong/.local/lib/python2.7/site-packages/pysam-0.7.5-py2.7-linux-x86_64.egg', '/home/kong/.local/lib/python2.7/site-packages', '/home/kong/Haiying/lib/Python/lib/python2.7/site-packages']

    2015-12-07 18:44:58,887 INFO [oncotator.Oncotator:239] Oncotator v1.8.0.0
    2015-12-07 18:44:58,888 INFO [oncotator.Oncotator:240] Args: Namespace(allow_overwriting=False, cache_url=None, canonical_tx_file=None, collapse_filter_cols=False, collapse_number_annotations=False, dbDir='/home/kong/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19', default_cli=[], default_config=None, genome_build='hg19', infer_genotypes='false', infer_onps=False, input_file='/home/kong/Haiying/Projects/Melanoma/Primary/Lock/MuTect/BQSR_Trimmed/2501_AAACAT_L006_85111_TTAGGC_L001.vcf', input_format='VCF', log_name='oncotator.log', noMulticore=False, output_file='2501_AAACAT_L006_85111_TTAGGC_L001.annotated.maf', output_format='TCGAMAF', override_cli=[], override_config=None, prepend=False, read_only_cache=False, reannotate_tcga_maf_cols=False, skip_no_alt=False, tx_mode='CANONICAL', verbose=6)
    2015-12-07 18:44:58,888 INFO [oncotator.Oncotator:241] Log file: /net/dkfzfsg/gpfs/m/daten/C050-500kdata/Rajiv_group/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19/oncotator.log
    2015-12-07 18:44:58,889 WARNING [oncotator.DatasourceFactory:266] Potential datasource directory is missing a genome build subdirectory and will be ignored: /home/kong/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19/oreganno.config
    2015-12-07 18:44:58,889 WARNING [oncotator.DatasourceFactory:266] Potential datasource directory is missing a genome build subdirectory and will be ignored: /home/kong/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19/CosmicCompleteExport.tsv
    2015-12-07 18:44:58,890 WARNING [oncotator.DatasourceFactory:266] Potential datasource directory is missing a genome build subdirectory and will be ignored: /home/kong/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19/oncotator.log
    2015-12-07 18:44:58,890 WARNING [oncotator.DatasourceFactory:266] Potential datasource directory is missing a genome build subdirectory and will be ignored: /home/kong/Haiying/Projects/Melanoma/Primary/temp/oncotest/oreganno/hg19/2501_AAACAT_L006_85111_TTAGGC_L001.annotated.maf
    2015-12-07 18:44:58,890 INFO [oncotator.DatasourceFactory:325] No datasources to initialize
    2015-12-07 18:44:58,905 INFO [oncotator.output.TcgaMafOutputRenderer:93] Building alternative keys dictionary...
    2015-12-07 18:44:58,906 INFO [oncotator.cache.DummyCache:57] No cache specified. All cache attempts will be listed as cache misses.
    2015-12-07 18:44:58,907 INFO [oncotator.Annotator:426] Annotating with 0 datasources: Oncotator v1.8.0.0 |
    2015-12-07 18:44:58,915 INFO [oncotator.output.TcgaMafOutputRenderer:256] TCGA MAF output file: 2501_AAACAT_L006_85111_TTAGGC_L001.annotated.maf
    2015-12-07 18:44:58,916 INFO [oncotator.output.TcgaMafOutputRenderer:257] Render starting...
    2015-12-07 18:44:58,916 WARNING [oncotator.Annotator:500] THERE ARE NO DATASOURCES REGISTERED
    2015-12-07 18:44:59,455 INFO [oncotator.output.TcgaMafOutputRenderer:342] Rendered all 532 mutations.

    But none of the SNPs are annotated. Everything is annotated as UNKNOWN.

    The database I downloaded is CosmicCompleteExport.tsv. The command I ran to create the datasource is:
    initializeDatasource --ds_type gp_tsv --ds_file /home/kong/Haiying/Reference/CancerAnnotations/COSMIC/CosmicCompleteExport.tsv --name ORegAnno --version "UCSC Track" --dbDir oncotest --ds_foldername oreganno --genome_build hg19 --index_columns hg19.oreganno.chrom,hg19.oreganno.chromStart,hg19.oreganno.chromEnd

    Actually, I am not sure which database I should download.

    I ran MuTect and got the somatic mutation VCF file, and I need to annotate these results.

    Thank you so much for your help.

  • Haiying7, Heidelberg, Germany (Member)

    Dear Geraldine,

    I still cannot get any annotation. I did read the documentation, but it is very confusing. What is hg19 in the example command line? Also, the documentation says --db-dir is optional and has a default value. Where should I put the datasources? I need to get my annotations done as soon as possible, and I have been struggling for several days. I also ran it with the test data, and still nothing is annotated.

    Could you please help me?

    Thank you so much.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    It looks like the command you ran to initialize the datasources was not successful. See the warning messages that say:

    Potential datasource directory is missing a genome build subdirectory and will be ignored

    What was the output of the initializeDatasource command?
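
The warnings suggest that Oncotator treats every entry directly under the `--db-dir` directory as a candidate datasource folder and expects a genome build subfolder (for example `hg19`) inside each one. In the command above, `--db-dir` pointed at `.../oncotest/oreganno/hg19`, so the entries found there were plain files (`oreganno.config`, `CosmicCompleteExport.tsv`, and so on). Under that reading, which is an inference from the log and not confirmed in this thread, the directory to pass would be the one given to `initializeDatasource --dbDir`, that is, `oncotest`. A minimal sketch of a check for this layout; the function name `list_datasources` is hypothetical:

```shell
#!/bin/sh
# list_datasources DB_DIR GENOME_BUILD
# Print each entry under DB_DIR that has a GENOME_BUILD subdirectory.
# Entries without one are reported on stderr, mirroring the log warning.
list_datasources() {
    db_dir=$1
    build=$2
    for entry in "$db_dir"/*; do
        # Skip the unexpanded glob when DB_DIR is empty or missing.
        [ -e "$entry" ] || continue
        if [ -d "$entry/$build" ]; then
            echo "$entry"
        else
            echo "ignored (no $build subdirectory): $entry" >&2
        fi
    done
}

# Throwaway directory that imitates the layout described in this thread.
demo=$(mktemp -d)
mkdir -p "$demo/oncotest/oreganno/hg19"
touch "$demo/oncotest/oreganno/hg19/oreganno.config"

# Pointing at the genome build folder itself: only "ignored" messages appear.
list_datasources "$demo/oncotest/oreganno/hg19" hg19

# Pointing two levels up: the oreganno folder is listed as a datasource.
list_datasources "$demo/oncotest" hg19

rm -rf "$demo"
```

Running the function on the intended `--db-dir` before starting Oncotator shows quickly whether any folder would qualify as a datasource.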

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    By the way, we do provide a bunch of pre-prepared datasources for download. See this page.
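
Once a datasource directory is available, the command from earlier in the thread only needs `--db-dir` to point at it. Judging by the Args lines in the logs (`input_file`, `output_file`, `genome_build='hg19'`), the three positional arguments are the input file, the output file, and the genome build, which would answer what `hg19` stands for. The sketch below only prints a command line; the function name `build_oncotator_cmd` and the paths are hypothetical, and the flags are the ones already used in this thread.

```shell
#!/bin/sh
# build_oncotator_cmd DB_DIR INPUT_VCF OUTPUT_MAF GENOME_BUILD
# Print (not run) an oncotator command line using the flags seen in this thread.
build_oncotator_cmd() {
    printf 'oncotator -v --db-dir %s --input_format=VCF --output_format=TCGAMAF %s %s %s\n' \
        "$1" "$2" "$3" "$4"
}

# Hypothetical paths: replace with the unpacked datasource directory and your files.
build_oncotator_cmd ./oncotator_db sample.vcf sample.annotated.maf hg19
```

A successful run should then report a non-zero number in the "Annotating with N datasources" log line.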
