Oncotator error---IndexError: list index out of range

I used Oncotator (1.9.3.0) to annotate a VCF file.
If I run this on the command line:
oncotator -v --input_format VCF --output_format TCGAMAF --db-dir /share/apps/oncotator_v1_ds_April052016/ -d . test_oncotator_ffpe.vcf oncotator.maf hg19

I get this error:

2017-10-17 11:43:39,572 ERROR [oncotator.output.TcgaMafOutputRenderer:333] Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/oncotator/output/TcgaMafOutputRenderer.py", line 317, in renderMutations
    self._add_output_annotations(m)
  File "build/bdist.linux-x86_64/egg/oncotator/output/TcgaMafOutputRenderer.py", line 241, in _add_output_annotations
    alt_count = vals[1]
IndexError: list index out of range

2017-10-17 11:43:39,572 ERROR [oncotator.output.TcgaMafOutputRenderer:334] Error at mutation 0 ['1', '11166639', '11166639', 'T', 'A']:
2017-10-17 11:43:39,572 ERROR [oncotator.output.TcgaMafOutputRenderer:335] Incomplete: rendered 0 mutations.
Traceback (most recent call last):
  File "/share/apps/oncotator/bin/oncotator", line 11, in <module>
    load_entry_point('Oncotator==1.9.3.0', 'console_scripts', 'oncotator')()
  File "build/bdist.linux-x86_64/egg/oncotator/Oncotator.py", line 309, in main
  File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 437, in annotate
  File "build/bdist.linux-x86_64/egg/oncotator/output/TcgaMafOutputRenderer.py", line 337, in renderMutations
IndexError: list index out of range

If I run this on the command line instead (VCF output rather than TCGAMAF):
oncotator -v --input_format VCF --output_format VCF --db-dir /share/apps/oncotator_v1_ds_April052016/ -d . test_oncotator_ffpe.vcf oncotator.maf hg19

There is no error. This VCF was produced by VarScan 2.3.4 (fileformat=VCFv4.1) and normalized with bcftools.

And if I run this on the command line:
oncotator -v --input_format VCF --output_format TCGAMAF --db-dir /share/apps/oncotator_v1_ds_April052016/ -d . test_oncotator_wes.vcf oncotator.maf hg19

There is no error either. This VCF was produced by GATK 3.8 (fileformat=VCFv4.2) and normalized with bcftools.
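
The traceback stops at `alt_count = vals[1]`, which suggests the TCGAMAF renderer expects at least two comma-separated values in the allelic-depth field. Below is a minimal sketch of that failure mode. It assumes, without confirmation from the thread, that `vals` comes from splitting the sample's AD value on commas. The helper name `alt_count_from_ad` and the depth values are hypothetical.

```python
def alt_count_from_ad(ad_value):
    """Return the alt count, assuming AD is written as 'ref,alt'."""
    vals = ad_value.split(",")
    return vals[1]  # raises IndexError when AD holds a single value


# An AD value that lists ref and alt depths together parses fine.
two_value_result = alt_count_from_ad("10,5")

# A single-value AD (alt depth only) reproduces the IndexError.
try:
    alt_count_from_ad("5")
    single_value_fails = False
except IndexError:
    single_value_fails = True
```

If this reading is right, it would also explain why the VCF-to-VCF run succeeds: only the TCGAMAF renderer appears in the failing traceback.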

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sin
    Hi,
    How did you produce the VCF that errors? Can you run ValidateVariants on it?
    Thanks,
    Sheila

  • sin (Member)

    @Sheila
    I found the reason why it happened. The VCF produced by VarScan2 was not in standard format.
    Before running Oncotator, use this code to reformat the VarScan2 VCF:

    bgzip VARSCAN2.vcf
    tabix -p vcf VARSCAN2.vcf.gz
    bcftools norm -m-both -f reference.fasta VARSCAN2.vcf.gz -o VARSCAN2.step1.vcf
    # Merge RD into AD (as "ref,alt"), drop the RD header line, and declare AD as Number=R.
    # The output name VARSCAN2.step2.vcf is an assumption; the post as extracted listed step1 twice.
    awk -F '\t' '
    $0 !~ /^##FORMAT=<ID=RD/ {
        OFS = "\t"
        if ($0 ~ /^##FORMAT=<ID=AD/) { gsub("Number=1", "Number=R", $0); print $0; next }
        if ($0 ~ /^#/) { print $0; next }
        split($9, a, ":"); n = split($10, b, ":")
        for (i in a) { if (a[i] == "RD") { ind = i + 0 } }
        gsub("RD:", "", $9)
        b[ind + 1] = b[ind] "," b[ind + 1]
        c = ""
        for (i = 1; i <= n; i++) { if (i != ind) { c = (c == "" ? b[i] : c ":" b[i]) } }
        $10 = c; print $0
    }' VARSCAN2.step1.vcf > VARSCAN2.step2.vcf
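
    For readers who prefer Python, the per-record step of the awk script above can be sketched as follows. This is an illustration, not the poster's code. It assumes a single-sample VCF whose FORMAT column contains both RD and AD. The helper name `merge_rd_into_ad` and the example genotype values are made up. The header changes (dropping the RD line, setting AD to Number=R) are not covered here.

```python
def merge_rd_into_ad(record_line):
    """Fold RD into AD ("ref,alt") for one single-sample VCF record line.

    Header lines and records without both RD and AD are returned unchanged.
    """
    line = record_line.rstrip("\n")
    fields = line.split("\t")
    if line.startswith("#") or len(fields) < 10:
        return line
    keys = fields[8].split(":")
    values = fields[9].split(":")
    if "RD" not in keys or "AD" not in keys or len(values) != len(keys):
        return line
    sample = dict(zip(keys, values))
    sample["AD"] = sample["RD"] + "," + sample["AD"]  # ref depth first, then alt
    keys.remove("RD")
    fields[8] = ":".join(keys)
    fields[9] = ":".join(sample[k] for k in keys)
    return "\t".join(fields)


# Hypothetical record: position taken from the thread, sample values invented.
example = "1\t11166639\t.\tT\tA\t.\tPASS\tDP=30\tGT:DP:RD:AD:FREQ\t0/1:30:20:10:33.3%"
converted = merge_rd_into_ad(example)
```

    Unlike the awk version, this sketch looks AD up by name, so it does not depend on AD directly following RD in the FORMAT column.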
    

    Thanks for your help

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sin
    Hi,

    Thank you for reporting your solution :smile:

    -Sheila

  • vivekruhela (Member)

    @Sheila :

    Hi,

    I am facing the same issue while using Oncotator to generate MAF files. I see the same error with the Mutect2 output files of many patients. I also ran the GATK ValidateVariants tool, and all the VCF files produced by Mutect2 passed. But I still get the error "list index out of range". Here are the details of the error message:

    File "build/bdist.linux-x86_64/egg/oncotator/output/TcgaMafOutputRenderer.py", line 321, in renderMutations
    for m in mutations:
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 448, in _applyManualAnnotations
    for m in mutations:
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 456, in _applyDefaultAnnotations
    for m in mutations:
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 504, in _annotate_mutations_using_datasources
    for m in mutations:
    File "build/bdist.linux-x86_64/egg/oncotator/input/VcfInputMutationCreator.py", line 282, in createMutations
    for record in self.vcf_reader:
    File "/usr/local/lib/python2.7/dist-packages/vcf/parser.py", line 572, in next
    info = self._parse_info(row[7])
    File "/usr/local/lib/python2.7/dist-packages/vcf/parser.py", line 396, in _parse_info
    vals = entry[1].split(',')
    IndexError: list index out of range

    2018-07-11 22:41:41,575 ERROR [oncotator.output.TcgaMafOutputRenderer:334] Error at mutation 6923 ['1', '69872197', '69872197', 'T', 'C']:
    2018-07-11 22:41:41,575 ERROR [oncotator.output.TcgaMafOutputRenderer:335] Incomplete: rendered 6923 mutations.
    Traceback (most recent call last):
    File "/usr/local/bin/oncotator", line 11, in <module>
    load_entry_point('Oncotator==1.9.9.0', 'console_scripts', 'oncotator')()
    File "build/bdist.linux-x86_64/egg/oncotator/Oncotator.py", line 311, in main
    File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 437, in annotate
    File "build/bdist.linux-x86_64/egg/oncotator/output/TcgaMafOutputRenderer.py", line 337, in renderMutations
    IndexError: list index out of range

    When I checked the record named in the error (Error at mutation 6923 ['1', '69872197', '69872197', 'T', 'C']), I found nothing wrong with it; it is a transition mutation.

    That line is as follows:

    zcat -cd /mnt/storage/MM_Data/SM_21_WES/Variant-Calling/SM_21.mutect2.vcf.gz | grep '69872197'

    chr1 69872197 rs4650000 T C . t_lod_fstar ABHet=0.00;AS_FS=0.000;AS_MQ=60.00;AS_SOR=0.693;DB;DP=2;ECNT=1;FS=0.000;HCNT=1;MAX_ED=.;MIN_ED=.;MQ=60.00;NCC=0;OND=0.00;SOR=2.303;TLOD=5.65 GT:AB:AD:AF:BCS 0/1:0.00:0,2:1.00:0,2,0,0

    Can anyone suggest how to deal with this error? Thanks.
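
    The innermost frames of this traceback are in PyVCF's `_parse_info`, at `vals = entry[1].split(',')`. One way that line could raise an IndexError is an INFO key that appears without `=value` even though the header declares it with a type other than Flag. This is an assumption about the cause, not something the thread confirms. Because the log reports 6923 mutations as already rendered, the offending record may also be the one after the record printed in the error message. A hedged diagnostic sketch (the helper `find_valueless_info_keys` and the example lines are hypothetical) that lists such records:

```python
import re

# Captures the ID and Type of a ##INFO header line.
INFO_HEADER = re.compile(r"##INFO=<ID=([^,>]+),.*?Type=([^,>]+)")


def find_valueless_info_keys(lines):
    """Yield (chrom, pos, key) for INFO keys lacking '=value' that the header declares as non-Flag."""
    declared = {}
    for line in lines:
        line = line.rstrip("\n")
        header = INFO_HEADER.match(line)
        if header:
            declared[header.group(1)] = header.group(2)
            continue
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8 or fields[7] == ".":
            continue
        for entry in fields[7].split(";"):
            # Undeclared keys are skipped; only declared non-Flag keys are reported.
            if entry and "=" not in entry and declared.get(entry, "Flag") != "Flag":
                yield fields[0], fields[1], entry


example_vcf = [
    '##INFO=<ID=DB,Number=0,Type=Flag,Description="example flag">',
    '##INFO=<ID=MAX_ED,Number=1,Type=Integer,Description="example integer">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tT\tC\t.\tPASS\tDB;MAX_ED=.",
    "chr1\t200\t.\tG\tA\t.\tPASS\tDB;MAX_ED",
]
suspects = list(find_valueless_info_keys(example_vcf))
```

    For a compressed file, the same function can be fed from `gzip.open(path, "rt")`.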

  • vivekruhela (Member)

    @Sheila :

    **Update**: I also tried rebuilding the tabix index in case the index file was corrupted; that also failed. The only difference is that with the new index file, Oncotator gets a bit further than before (up to 23000 mutations in the Mutect2 output .vcf file) and then fails. Surprisingly, Oncotator works for just 4 of the 35 patients, although the .bam files of all 35 patients were processed by Mutect2. Thanks.

  • vivekruhela (Member)

    @Sheila :

    :smile:

    Update: My apologies. Oncotator now works, but only after filtering the Mutect2 output. If I keep only the records with FILTER = PASS, the filtered file works with Oncotator, while the original file produced by Mutect2 still fails. I also get many UNKNOWN values in the Hugo_Symbol column of the output .MAF file. I don't know why, but I am now sure that Oncotator works with the filtered file and that the problem is with the Mutect2 output file. So I need your opinion: should I filter the Mutect2 output further (i.e. remove the clustered_event and t_lod_fstar related mutations)? Records with a FILTER value other than PASS cause problems when annotating with Oncotator, although the ANNOVAR annotation package handles them without trouble.
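
    The PASS-only filtering described above can be sketched in a few lines. This is a minimal illustration, not the poster's actual command: the helper name `keep_pass_records` and the example records are hypothetical, and it assumes an uncompressed, tab-separated VCF.

```python
def keep_pass_records(lines):
    """Yield header lines plus records whose FILTER column is exactly PASS."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tT\tC\t.\tPASS\tDP=20\n",
    "chr1\t200\t.\tG\tA\t.\tt_lod_fstar\tDP=2\n",
    "chr1\t300\t.\tA\tG\t.\tclustered_events\tDP=9\n",
]
filtered = list(keep_pass_records(example_vcf))
```

    Dedicated tools offer the same selection (for example, `bcftools view -f PASS`), which is usually preferable for large or compressed files.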

    Issue · GitHub (by Sheila): Issue Number 3136, State: open

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @vivekruhela
    Hi,

    That is odd. Which version of Oncotator are you using?

    Thanks,
    Sheila

  • vivekruhela (Member)
    edited July 2018

    @Sheila

    Hi,
    Currently, I am using Oncotator v1.9.9.0

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @vivekruhela
    Hi,

    I need to ask the developer and get back to you.

    -Sheila

  • LeeTL1220 (Arlington, MA; Member, Broadie, Dev) ✭✭✭
    edited July 2018

    @vivekruhela It may be correct that you are getting a lot of Unknown in the Hugo_Symbol field. Your variants may not overlap genes.

    Also, are you running Oncotator with --prune-filter-cols and --collapse-filter-cols? It sounds like you do not need the variants that are filtered out anyway.
