What does an error message mean?

Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
edited July 2014 in Oncotator documentation


This document lists error messages that you may encounter when using Oncotator. For each, you will find an explanation of what the error message means, and how to solve the problem (if possible).

Entrez Gene ID was zero, but Hugo Symbol was not Unknown

Problem: The current GENCODE datasource does not have a complete list of the Entrez Gene IDs.

Solution: There is no solution yet. This is a known issue and will be fixed in a future release of the default datasource corpus.

oncotator.DuplicateAnnotationException.DuplicateAnnotationException: 'Attempting to create an annotation multiple times

Problem: This occurs when Oncotator is directed to create the same annotation on a mutation more than once, each time with a different value. Typically, this is caused by a maflite/tsv input file with redundant columns, for example "Start_Position" and "start". Oncotator uses aliases, so columns with different header text can be regarded as the same annotation.

Solution: Remove or rename the redundant columns in your input file so that each annotation is defined only once.
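
One way to spot such redundant columns before running Oncotator is to group the header fields by the annotation they resolve to. The sketch below is only an illustration, not Oncotator's own logic. The `ALIASES` table and the `find_alias_conflicts` helper are hypothetical, and only the "Start_Position"/"start" pair comes from the example above. The real alias list is defined by Oncotator's configuration and should be checked for your version.

```python
from collections import defaultdict

# Hypothetical alias table for illustration only. It maps a column header
# to the annotation name it is assumed to resolve to. Only this pair is
# taken from the example in the text; extend it from your own configuration.
ALIASES = {
    "Start_Position": "start",
    "start": "start",
}


def find_alias_conflicts(header_line, aliases=ALIASES):
    """Return {annotation: [columns]} for annotations defined by 2+ columns."""
    groups = defaultdict(list)
    for column in header_line.rstrip("\n").split("\t"):
        # Columns without a known alias are treated as their own annotation.
        canonical = aliases.get(column, column)
        groups[canonical].append(column)
    return {name: cols for name, cols in groups.items() if len(cols) > 1}


header = "Chromosome\tStart_Position\tstart\tEnd_Position\n"
conflicts = find_alias_conflicts(header)
for annotation, columns in conflicts.items():
    print("Annotation %r is defined by columns: %s" % (annotation, ", ".join(columns)))
```

Any annotation reported here is a candidate for the cleanup described in the solution: keep one of the listed columns and drop or rename the others.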



  • HyosilKim (Seoul, Korea; Member)
    edited March 2015

    What does this error mean?
    Please help me.

    Verbose mode on
    ['/usr/bin', '/usr/lib/python2.7/site-packages/distribute-0.6.15-py2.7.egg', '/usr/lib/python2.7/site-packages/Oncotator-v1.5.1.0-py2.7.egg', '/usr/lib/python2.7/site-packages/enum34-1.0.4-py2.7.egg', '/usr/lib/python2.7/site-packages/more_itertools-2.2-py2.7.egg', '/usr/lib/python2.7/site-packages/natsort-3.5.2-py2.7.egg', '/usr/lib/python2.7/site-packages/python_memcached-1.53-py2.7.egg', '/usr/lib/python2.7/site-packages/nose-1.3.4-py2.7.egg', '/usr/lib/python2.7/site-packages/SQLAlchemy-0.9.8-py2.7-linux-x86_64.egg', '/usr/lib/python2.7/site-packages/shove-0.5.6-py2.7.egg', '/usr/lib/python2.7/site-packages/Cython-0.22-py2.7-linux-x86_64.egg', '/usr/lib/python2.7/site-packages/biopython-1.65-py2.7-linux-x86_64.egg', '/usr/lib/python2.7/site-packages/pandas-0.15.2-py2.7-linux-x86_64.egg', '/usr/lib/python2.7/site-packages/pysam-0.7.5-py2.7-linux-x86_64.egg', '/usr/lib/python2.7/site-packages/bcbio_gff-0.6.2-py2.7.egg', '/usr/lib/python2.7/site-packages/stuf-0.9.4-py2.7.egg', '/usr/lib/python2.7/site-packages/futures-2.2.0-py2.7.egg', '/usr/lib/python2.7/site-packages/python_dateutil-2.4.0-py2.7.egg', '/usr/lib/python2.7/site-packages/parse-1.4.1-py2.7.egg', '/usr/lib64/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7/site-packages', '/usr/lib64/python2.7/site-packages/gtk-2.0', '/usr/lib/python2.7/site-packages']

    2015-03-04 17:13:09,050 INFO [oncotator.Oncotator:235] Oncotator v1.5.1.0
    2015-03-04 17:13:09,051 INFO [oncotator.Oncotator:236] WARNING [oncotator.DatasourceFactory:260] %s does not exist, so there will be no datasources.
    2015-03-04 17:13:09,056 WARNING [oncotator.Annotator:466] THERE ARE NO DATASOURCES REGISTERED
    2015-03-04 17:13:09,454 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 1000 mutations.
    2015-03-04 17:13:09,849 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 2000 mutations.
    2015-03-04 17:13:10,218 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 3000 mutations.
    2015-03-04 17:13:10,556 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 4000 mutations.
    2015-03-04 17:13:10,896 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 5000 mutations.
    2015-03-04 17:13:11,242 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 6000 mutations.
    2015-03-04 17:13:11,580 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 7000 mutations.
    2015-03-04 17:13:11,926 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 8000 mutations.
    2015-03-04 17:13:12,266 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 9000 mutations.
    2015-03-04 17:13:12,611 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 10000 mutations.
    2015-03-04 17:13:12,950 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 11000 mutations.
    2015-03-04 17:13:13,293 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 12000 mutations.
    2015-03-04 17:13:13,633 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 13000 mutations.
    2015-03-04 17:13:13,970 INFO [oncotator.output.TcgaMafOutputRenderer:276] Rendered 14000 mutations.

    2015-03-04 17:13:14,109 INFO [oncotator.output.TcgaMafOutputRenderer:288] Rendered all 14402 mutations.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    It seems you did not provide a datasource to the program. What was your command line?

  • egeulgen (US; Member)

    @Geraldine_VdAuwera Is there any workaround for the "Entrez Gene ID was zero, but Hugo Symbol was not Unknown" issue?

  • Bayazit (Estonian Biocentre; Member)

    Hi everyone!

    I am trying to annotate my VCF file (prepared using the GATK Best Practices workflow) with Oncotator.
    Here is my command:

    oncotator -v --log_name=oncotator.SNPs.log \
    --input_format=VCF --output_format=VCF \
    --db-dir=/seqdata/Oncotator_Datasource/oncotator_v1_ds_April052016 \
    /nextera_rapid_capture_exome/varcall_filtered/SNPs.vcf \
    /nextera_rapid_capture_exome/varcall_filtered/SNPs.oncotator.vcf hg19

    I am getting the following error message:
    2017-01-12 18:22:52,284 WARNING [oncotator.output.OutputDataManager:521] Annotation or config file specifying is not split for field type (INFO) with Number=A name: ExAC_Hom_SAS A datasource or VCF file may be misconfigured.
    2017-01-12 18:22:52,284 WARNING [oncotator.output.OutputDataManager:521] Annotation or config file specifying is not split for field type (INFO) with Number=A name: ExAC_AC_Hemi A datasource or VCF file may be misconfigured.
    2017-01-12 18:22:52,284 WARNING [oncotator.output.OutputDataManager:521] Annotation or config file specifying is not split for field type (INFO) with Number=A name: ExAC_clinvar_pathogenic A datasource or VCF file may be misconfigured.
    2017-01-12 18:22:52,285 INFO [oncotator.utils.ConfigUtils:197] Could not find config file (vcf.out.config). Trying configs/ prepend.
    2017-01-12 18:22:52,285 INFO [oncotator.utils.ConfigUtils:199] Found config file (vcf.out.config) using configs/ prepend.
    2017-01-12 18:22:52,287 INFO [oncotator.utils.SampleNameSelector:90] Sample name is in the sample_name column.
    2017-01-12 18:23:02,769 INFO [oncotator.output.OutputDataManager:194] Wrote 1000 mutations to tsv.
    2017-01-12 18:23:13,408 INFO [oncotator.output.OutputDataManager:194] Wrote 2000 mutations to tsv.
    2017-01-12 18:23:25,661 INFO [oncotator.output.OutputDataManager:194] Wrote 3000 mutations to tsv.
    2017-01-12 18:23:30,025 ERROR [oncotator.input.VcfInputMutationCreator:328] SPANNING DELETIONS ARE NOT SUPPORTED.
    Traceback (most recent call last):
    File "/storage/software/python-2.7.11/bin/oncotator", line 11, in <module>
    load_entry_point('Oncotator==', 'console_scripts', 'oncotator')()
    File "build/bdist.linux-x86_64/egg/oncotator/Oncotator.py", line 309, in main
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 437, in annotate
    File "build/bdist.linux-x86_64/egg/oncotator/output/VcfOutputRenderer.py", line 119, in renderMutations
    File "build/bdist.linux-x86_64/egg/oncotator/output/OutputDataManager.py", line 92, in __init__
    File "build/bdist.linux-x86_64/egg/oncotator/output/OutputDataManager.py", line 160, in _writeMuts2Tsv
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 448, in _applyManualAnnotations
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 456, in _applyDefaultAnnotations
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 504, in _annotate_mutations_using_datasources
    File "build/bdist.linux-x86_64/egg/oncotator/input/VcfInputMutationCreator.py", line 329, in createMutations
    oncotator.utils.OncotatorException.OncotatorException: Spanning deletions are not supported at this time.

    Could you please help me with this?

  • Bayazit (Estonian Biocentre; Member)
    edited January 2017

    Hi everyone!
    I have finally found the solution to the error I posted above. Here it is:

    I followed the traceback line 'File "build/bdist.linux-x86_64/egg/oncotator/input/VcfInputMutationCreator.py", line 329, in createMutations'
    and checked the Python code at https://github.com/broadinstitute/oncotator/blob/develop/oncotator/input/VcfInputMutationCreator.py
    and found that this error is caused by the presence of the '*' symbol in the ALT field, i.e. by spanning deletions
    in my input VCF file.

    Basically, this is exactly what the error message says.

    Note that using SelectVariants with --selectTypeToExclude INDEL --selectTypeToExclude MIXED --selectTypeToExclude SYMBOLIC
    will not remove the '*' (spanning deletion) alleles.

    I removed the variants with spanning deletions (there were only a few) using grep.

    First, check manually what is going to be removed:

    zcat myfilename.vcf.gz | grep -E ',\*|\*,' | awk '{print $1,$2,$3,$4,$5,$6}' | more

    Then remove the records containing '*':

    zcat myfilename.vcf.gz | grep -E --invert-match ',\*|\*,' > no_spanning_dels.vcf
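
The grep filter above matches its pattern anywhere on the line, and it only looks for a '*' next to a comma. As a hedged alternative, the sketch below inspects only the ALT column (the fifth tab-separated field of a VCF data line) and drops any record that lists '*' among its ALT alleles, including records where '*' is the only ALT allele. The helper names `has_spanning_deletion` and `drop_spanning_deletions` are made up for this illustration and are not part of Oncotator or GATK. Like the grep approach, it removes the whole record and not just the '*' allele.

```python
def has_spanning_deletion(vcf_line):
    """Return True if a VCF data line lists '*' among its ALT alleles."""
    if vcf_line.startswith("#"):
        # Header and meta-information lines are never filtered out.
        return False
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 5:
        # Not a well-formed data line; leave it for downstream tools to judge.
        return False
    # ALT is the fifth column; multiple alleles are comma-separated.
    return "*" in fields[4].split(",")


def drop_spanning_deletions(lines):
    """Yield only the lines that do not carry a '*' ALT allele."""
    for line in lines:
        if not has_spanning_deletion(line):
            yield line


example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\n",
    "1\t100\t.\tAT\tA,*\t50\n",   # multi-allelic with a spanning deletion
    "1\t200\t.\tG\tC\t50\n",      # ordinary SNP, kept
    "1\t300\t.\tG\t*\t50\n",      # '*' as the only ALT allele
]
kept = list(drop_spanning_deletions(example))
print("".join(kept), end="")
```

To apply it to a gzipped file, the lines could be read with `gzip.open(path, "rt")` and the kept lines written to a new VCF file.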
