Where is the dbSNP information in oncotator result maf?

hexyhexy ChinaMember

Hi!
I use Oncotator v1.9 to annotate tumor-normal paired somatic variants in MAFLITE format. According to the documentation, there are 26 datasources:
_Oncotator v1.9.0.0 | Flat File Reference hg19 | GENCODE v19 EFFECT | UniProt_AAxform 2014_12 | ExAC 0.3.1 | dbSNP build 142 | COSMIC v76 | 1000gp3 20130502 | dbNSFP v2.4 | ClinVar 12.03.20 | ORegAnno UCSC Track | CCLE_By_GP 09292010 | UniProt_AA 2014_12 | Ensembl ICGC MUCOPA | COSMIC_FusionGenes v76 | gencode_xref_refseq metadata_v19 | CCLE_By_Gene 09292010 | ACHILLES_Lineage_Results 110303 | CGC full_2012-03-15 | UniProt 2014_12 | HumanDNARepairGenes 20110905 | HGNC Sept172014 | COSMIC_Tissue v76 | Familial_Cancer_Genes 20110905 | TUMORScape 20100104 | TCGAScape 110405 | MutSig Published Results 20110905 _
I got a 348-column MAF for each sample, but the dbSNP fields are empty, for example dbSNP_RS (column 14) and dbSNP_Val_Status (column 15). I also cannot find any dbSNP population information, by which I mean something like "1000gp3_EAS_AF". How should I filter against the dbSNP database?
How do I annotate germline variants? Could I use Oncotator for that?
Thanks in advance to anyone who can help me out~
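
A quick way to confirm which annotation fields are actually blank is to count non-empty values per column. The following is only a minimal sketch, assuming the MAF is tab-delimited with optional leading `#` comment lines; the helper `count_filled` and the sample rows (including the placeholder ID `rs_example`) are hypothetical and not part of Oncotator.

```python
import csv

def count_filled(maf_text, columns):
    """Return (row_count, {column: non_empty_count}) for tab-delimited MAF text."""
    # Drop blank lines and leading comment lines such as "#version 2.4".
    data_lines = [ln for ln in maf_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = {col: 0 for col in columns}
    total = 0
    for row in reader:
        total += 1
        for col in columns:
            # A missing column or missing field is treated the same as an empty one.
            if (row.get(col) or "").strip():
                counts[col] += 1
    return total, counts

# Made-up two-row example; real Oncotator output has many more columns.
SAMPLE = (
    "#version 2.4\n"
    "Hugo_Symbol\tdbSNP_RS\tdbSNP_Val_Status\n"
    "GENE_A\t\t\n"
    "GENE_B\trs_example\tbyCluster\n"
)

total, counts = count_filled(SAMPLE, ["dbSNP_RS", "dbSNP_Val_Status", "1000gp3_EAS_AF"])
print(total, counts)
```

For a real file, the text could be read with `open(path).read()` and passed to the same function; a count of 0 for `dbSNP_RS` across all rows would match the symptom described above.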

Answers

  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi @hexy, let's ask @LeeTL1220 to chime in here.

  • LeeTL1220LeeTL1220 Arlington, MAMember, Broadie, Dev ✭✭✭

    @hexy In MAF, the dbSNP results should always be in the 14th column dbSNP_RS... Are you saying that the column values are empty?

  • LeeTL1220LeeTL1220 Arlington, MAMember, Broadie, Dev ✭✭✭

    @hexy I've replicated a bug for rendering the DBSNP results. A fix should be coming ...

  • hexyhexy ChinaMember

    Thank you, @LeeTL1220, I have solved it with the new version. Also, do you think it would be better to make the strict dependencies more lenient, that is, to change them from "==" to ">=" in the setup.py file:
    install_requires=['bcbio-gff>=0.6.2', 'pyvcf>=0.6.8', 'pysam>=0.9.0', 'pandas>=0.18.0', 'biopython>=1.66', 'numpy>=1.11.0', 'cython>=0.24', 'shove>=0.6.6', 'sqlalchemy>=1.0.12', 'nose>=1.3.7', 'python-memcached>=1.57', 'natsort>=4.0.4', 'more-itertools>=2.2', 'enum34>=1.1.2'],

    I ask because some of the pinned dependency versions are too old to find a matching library, which makes installation difficult~
    Thanks again for helping me!

    GitHub issue 3042 (by Sheila; state: closed; closed by chandrans)
  • LeeTL1220LeeTL1220 Arlington, MAMember, Broadie, Dev ✭✭✭

    @hexy No problem.

    Actually, I recently created the Docker images using the static pins (i.e. ==) in the setup and virtualenv scripts, and I had no trouble. In fact, years ago we used >=, and that caused many issues, since interfaces or behavior would change and introduce bugs.

  • hexyhexy ChinaMember

    @LeeTL1220,
    Hi, thanks for replying. I used ">=" for all dependencies, because otherwise it would not install successfully, and so far no bugs have emerged. Thanks again~
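
The change discussed in this thread, turning exact pins into lower bounds, can be sketched in a few lines. This is only an illustration: `relax_pins` is a hypothetical helper, and the `==` entries below are assumed from the version numbers quoted earlier in the thread rather than copied from Oncotator's actual setup.py.

```python
import re

def relax_pins(requirements):
    """Rewrite exact pins ('pkg==1.2') as lower bounds ('pkg>=1.2').

    Entries that do not contain '==' are returned unchanged.
    """
    return [re.sub(r"\s*==\s*", ">=", req, count=1) for req in requirements]

# Assumed pinned entries, for illustration only.
pinned = ["pysam==0.9.0", "numpy == 1.11.0", "pandas>=0.18.0"]
relaxed = relax_pins(pinned)
print(relaxed)
```

Exact pins keep an environment reproducible, which is the concern raised above about changing interfaces, while lower bounds make installation easier when old releases are hard to obtain. Relaxing the pins inside an isolated virtualenv or Docker image limits the impact if a newer dependency behaves differently.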
