Documentation for Oncotator running in Firecloud

System, Administrator admin
This discussion was created from comments split from: Documentation for Oncotator running in Firecloud.

Comments

  • dannykwells, San Francisco, Member ✭✭
    edited September 2018

    Hi @KateN and @bshifaw, sorry for the delay.

    I am actually talking about the output for vcf->maf, not the CNV method.

    The link I gave above is the only one I can find that describes the Oncotator fields in the MAF. That section lists 262 fields in total.

    Here are the fields that come from the Oncotator command in the Firecloud Mutect2 workflow:

    Hugo_Symbol Entrez_Gene_Id  Center  NCBI_Build  Chromosome  Start_position  End_position    Strand  Variant_Classification  Variant_Type    Reference_Allele    Tumor_Seq_Allele1   Tumor_Seq_Allele2   dbSNP_RS    dbSNP_Val_Status    Tumor_Sample_Barcode    Matched_Norm_Sample_Barcode Match_Norm_Seq_Allele1  Match_Norm_Seq_Allele2  Tumor_Validation_Allele1    Tumor_Validation_Allele2    Match_Norm_Validation_Allele1   Match_Norm_Validation_Allele2   Verification_Status Validation_Status   Mutation_Status Sequencing_Phase    Sequence_Source Validation_Method   Score   BAM_file    Sequencer   Tumor_Sample_UUID   Matched_Norm_Sample_UUID    Genome_Change   Annotation_Transcript   Transcript_Strand   Transcript_Exon Transcript_Position cDNA_Change Codon_Change    Protein_Change  Other_Transcripts   Refseq_mRNA_Id  Refseq_prot_Id  SwissProt_acc_Id    SwissProt_entry_Id  Description UniProt_AApos   UniProt_Region  UniProt_Site    UniProt_Natural_Variations  UniProt_Experimental_Info   GO_Biological_Process   GO_Cellular_Component   GO_Molecular_Function   COSMIC_overlapping_mutations    COSMIC_fusion_genes COSMIC_tissue_types_affected    COSMIC_total_alterations_in_gene    Tumorscape_Amplification_Peaks  Tumorscape_Deletion_Peaks   TCGAscape_Amplification_Peaks   TCGAscape_Deletion_Peaks    DrugBank    ref_context gc_content  CCLE_ONCOMAP_overlapping_mutations  CCLE_ONCOMAP_total_mutations_in_gene    CGC_Mutation_Type   CGC_Translocation_Partner   CGC_Tumor_Types_Somatic CGC_Tumor_Types_Germline    CGC_Other_Diseases  DNARepairGenes_Role FamilialCancerDatabase_Syndromes    MUTSIG_Published_Results    OREGANNO_ID OREGANNO_Values t_alt_count t_ref_count 1000gp3_AA  1000gp3_AC  1000gp3_AF  1000gp3_AFR_AF  1000gp3_AMR_AF  1000gp3_AN  1000gp3_CIEND   1000gp3_CIPOS   1000gp3_CS  1000gp3_DP  1000gp3_EAS_AF  1000gp3_END 1000gp3_EUR_AF  1000gp3_IMPRECISE   1000gp3_MC  1000gp3_MEINFO  1000gp3_MEND    1000gp3_MLEN    1000gp3_MSTART  1000gp3_NS  1000gp3_SAS_AF  1000gp3_SVLEN   
1000gp3_SVTYPE  1000gp3_TSD ACHILLES_Lineage_Results_Top_Genes  CGC_Cancer Germline Mut CGC_Cancer Molecular Genetics   CGC_Cancer Somatic Mut  CGC_Cancer Syndrome CGC_Chr CGC_Chr Band    CGC_GeneID  CGC_Name    CGC_Other Germline Mut  CGC_Tissue Type COSMIC_FusionGenes_fusion_id    COSMIC_n_overlapping_mutations  COSMIC_overlapping_mutation_descriptions    COSMIC_overlapping_primary_sites    ClinVar_ASSEMBLY    ClinVar_HGMD_ID ClinVar_SYM ClinVar_TYPE    ClinVar_rs  ECNT    Ensembl_so_accession    Ensembl_so_term ExAC_AC ExAC_AC_AFR ExAC_AC_AMR ExAC_AC_Adj ExAC_AC_CONSANGUINEOUS  ExAC_AC_EAS ExAC_AC_FEMALE  ExAC_AC_FIN ExAC_AC_Hemi    ExAC_AC_Het ExAC_AC_Hom ExAC_AC_MALE    ExAC_AC_NFE ExAC_AC_OTH ExAC_AC_POPMAX  ExAC_AC_SAS ExAC_AF ExAC_AN ExAC_AN_AFR ExAC_AN_AMR ExAC_AN_Adj ExAC_AN_CONSANGUINEOUS  ExAC_AN_EAS ExAC_AN_FEMALE  ExAC_AN_FIN ExAC_AN_MALE    ExAC_AN_NFE ExAC_AN_OTH ExAC_AN_POPMAX  ExAC_AN_SAS ExAC_BaseQRankSum   ExAC_CCC    ExAC_CSQ    ExAC_ClippingRankSum    ExAC_DB ExAC_DOUBLETON_DIST ExAC_DP ExAC_DP_HIST    ExAC_DS ExAC_END    ExAC_ESP_AC ExAC_ESP_AF_GLOBAL  ExAC_ESP_AF_POPMAX  ExAC_FS ExAC_GQ_HIST    ExAC_GQ_MEAN    ExAC_GQ_STDDEV  ExAC_HWP    ExAC_HaplotypeScore ExAC_Hemi_AFR   ExAC_Hemi_AMR   ExAC_Hemi_EAS   ExAC_Hemi_FIN   ExAC_Hemi_NFE   ExAC_Hemi_OTH   ExAC_Hemi_SAS   ExAC_Het_AFR    ExAC_Het_AMR    ExAC_Het_EAS    ExAC_Het_FIN    ExAC_Het_NFE    ExAC_Het_OTH    ExAC_Het_SAS    ExAC_Hom_AFR    ExAC_Hom_AMR    ExAC_Hom_CONSANGUINEOUS ExAC_Hom_EAS    ExAC_Hom_FIN    ExAC_Hom_NFE    ExAC_Hom_OTH    ExAC_Hom_SAS    ExAC_InbreedingCoeff    ExAC_K1_RUN ExAC_K2_RUN ExAC_K3_RUN ExAC_KG_AC  ExAC_KG_AF_GLOBAL   ExAC_KG_AF_POPMAX   ExAC_MLEAC  ExAC_MLEAF  ExAC_MQ ExAC_MQ0    ExAC_MQRankSum  ExAC_NCC    ExAC_NEGATIVE_TRAIN_SITE    ExAC_POPMAX ExAC_POSITIVE_TRAIN_SITE    ExAC_QD ExAC_ReadPosRankSum ExAC_VQSLOD ExAC_clinvar_conflicted ExAC_clinvar_measureset_id  ExAC_clinvar_mut    ExAC_clinvar_pathogenic ExAC_culprit    F1R2    F2R1    
Familial_Cancer_Genes_Reference Familial_Cancer_Genes_Synonym   HGNC_Accession Numbers  HGNC_CCDS IDs   HGNC_Chromosome HGNC_Date Modified  HGNC_Date Name Changed  HGNC_Date Symbol Changed    HGNC_Ensembl Gene ID    HGNC_Ensembl ID(supplied by Ensembl)    HGNC_Enzyme IDs HGNC_Gene family description    HGNC_HGNC ID    HGNC_Locus Group    HGNC_Locus Type HGNC_Name Synonyms  HGNC_OMIM ID(supplied by NCBI)  HGNC_Previous Names HGNC_Previous Symbols   HGNC_Primary IDs    HGNC_Pubmed IDs HGNC_Record Type    HGNC_RefSeq(supplied by NCBI)   HGNC_Secondary IDs  HGNC_Status HGNC_Synonyms   HGNC_UCSC ID(supplied by UCSC)  HGNC_UniProt ID(supplied by UniProt)    HGNC_VEGA IDs   HGVS_coding_DNA_change  HGVS_genomic_change HGVS_protein_change IN_PON  MBQ MFRL    MMQ MPOS    N_ART_LOD   ORegAnno_bin    POP_AF  P_CONTAM    P_GERMLINE  RU  SA_MAP_AF   SA_POST_PROB    UniProt_alt_uniprot_accessions  allelic_depth   alt_allele_seen annotation_transcript   artifact_in_normal  base_quality    build   ccds_id clustered_events    contamination   dbNSFP_1000Gp1_AC   dbNSFP_1000Gp1_AF   dbNSFP_1000Gp1_AFR_AC   dbNSFP_1000Gp1_AFR_AF   dbNSFP_1000Gp1_AMR_AC   dbNSFP_1000Gp1_AMR_AF   dbNSFP_1000Gp1_ASN_AC   dbNSFP_1000Gp1_ASN_AF   dbNSFP_1000Gp1_EUR_AC   dbNSFP_1000Gp1_EUR_AF   dbNSFP_Ancestral_allele dbNSFP_CADD_phred   dbNSFP_CADD_raw dbNSFP_CADD_raw_rankscore   dbNSFP_ESP6500_AA_AF    dbNSFP_ESP6500_EA_AF    dbNSFP_Ensembl_geneid   dbNSFP_Ensembl_transcriptid dbNSFP_FATHMM_pred  dbNSFP_FATHMM_rankscore dbNSFP_FATHMM_score dbNSFP_GERP++_NR    dbNSFP_GERP++_RS    dbNSFP_GERP++_RS_rankscore  dbNSFP_Interpro_domain  dbNSFP_LRT_Omega    dbNSFP_LRT_converted_rankscore  dbNSFP_LRT_pred dbNSFP_LRT_score    dbNSFP_LR_pred  dbNSFP_LR_rankscore dbNSFP_LR_score dbNSFP_MutationAssessor_pred    dbNSFP_MutationAssessor_rankscore   dbNSFP_MutationAssessor_score   dbNSFP_MutationTaster_converted_rankscore   dbNSFP_MutationTaster_pred  dbNSFP_MutationTaster_score dbNSFP_Polyphen2_HDIV_pred  
dbNSFP_Polyphen2_HDIV_rankscore dbNSFP_Polyphen2_HDIV_score dbNSFP_Polyphen2_HVAR_pred  dbNSFP_Polyphen2_HVAR_rankscore dbNSFP_Polyphen2_HVAR_score dbNSFP_RadialSVM_pred   dbNSFP_RadialSVM_rankscore  dbNSFP_RadialSVM_score  dbNSFP_Reliability_index    dbNSFP_SIFT_converted_rankscore dbNSFP_SIFT_pred    dbNSFP_SIFT_score   dbNSFP_SLR_test_statistic   dbNSFP_SiPhy_29way_logOdds  dbNSFP_SiPhy_29way_logOdds_rankscore    dbNSFP_SiPhy_29way_pi   dbNSFP_UniSNP_ids   dbNSFP_Uniprot_aapos    dbNSFP_Uniprot_acc  dbNSFP_Uniprot_id   dbNSFP_aaalt    dbNSFP_aapos    dbNSFP_aapos_FATHMM dbNSFP_aapos_SIFT   dbNSFP_aaref    dbNSFP_cds_strand   dbNSFP_codonpos dbNSFP_fold-degenerate  dbNSFP_genename dbNSFP_hg18_pos(1-coor) dbNSFP_phastCons100way_vertebrate   dbNSFP_phastCons100way_vertebrate_rankscore dbNSFP_phastCons46way_placental dbNSFP_phastCons46way_placental_rankscore   dbNSFP_phastCons46way_primate   dbNSFP_phastCons46way_primate_rankscore dbNSFP_phyloP100way_vertebrate  dbNSFP_phyloP100way_vertebrate_rankscore    dbNSFP_phyloP46way_placental    dbNSFP_phyloP46way_placental_rankscore  dbNSFP_phyloP46way_primate  dbNSFP_phyloP46way_primate_rankscore    dbNSFP_refcodon depth_across_samples    duplicate_evidence  entrez_gene_id  fragment_length gc_content_full gencode_transcript_name gencode_transcript_status   gencode_transcript_tags gencode_transcript_type gene_type   genotype    genotype_quality    germline_risk   havana_transcript   id  mapping_quality matched_norm_sample_barcode multiallelic    n_lod   panel_of_normals    phasing_genotype    phasing_id  phred_scaled_likelihoods    qual    read_depth  read_position   refseq_mrna_id  repeat_times_tandem_repeat_unit secondary_variant_classification    short_tandem_repeat_membership  str_contraction strand_artifact t_lod   t_lod_fstar t_lod_fstar_full    tumor_f
    

    There are 398 fields here, which is substantially more than in the Oncotator spec linked to above.

    Also, the two sets of fields do not always align. For example, the field "ESP_PH", which is field 148 in the original Oncotator format defined above, does not appear in the output of the current Oncotator.

    So, to restate the question: we are trying to figure out which fields in the Oncotator output are defined in the original spec, and, for those that are not, what they mean.
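One way to pin down the mismatch described above is to diff the two sets of column names. The following is a minimal sketch, not part of Oncotator: `read_maf_header` and `compare_fields` are hypothetical helper names, the sample header is made up for illustration, and the code assumes the MAF is tab-separated with `#`-prefixed comment lines before the header row.

```python
import io


def read_maf_header(handle):
    """Return the column names from a MAF-like TSV.

    Assumption: comment lines start with '#' and the first remaining
    non-blank line is the tab-separated header row.
    """
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        return line.rstrip("\r\n").split("\t")
    return []


def compare_fields(spec_fields, output_fields):
    """Split field names into shared, spec-only, and output-only groups."""
    spec, out = set(spec_fields), set(output_fields)
    return {
        "in_both": sorted(spec & out),
        "only_in_spec": sorted(spec - out),
        "only_in_output": sorted(out - spec),
    }


if __name__ == "__main__":
    # Tiny stand-in for a real file; replace with open("sample.maf").
    fake_maf = io.StringIO("#version 2.4\nHugo_Symbol\tChromosome\tt_lod\n")
    output_fields = read_maf_header(fake_maf)
    # Stand-in for the field names copied from the spec page.
    spec_fields = ["Hugo_Symbol", "Chromosome", "ESP_PH"]
    for group, names in compare_fields(spec_fields, output_fields).items():
        print(group, names)
```

The comparison is case-sensitive on purpose: the header above contains both `Entrez_Gene_Id` and `entrez_gene_id`, and both `Annotation_Transcript` and `annotation_transcript`, so lowercasing the names first would hide those pairs.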

    Thank you both for your time.
    -d

  • bshifaw, moon, Member, Broadie, Moderator admin
    edited September 2018

    @dannykwells
    Though the WDL/tool being referenced is on FireCloud, this question is more tool-related and may be better answered in the GATK Forum, so it has been moved here.

    @LeeTL1220 Would you happen to know of any updated documentation for the Oncotator tool used by the Mutect2 workflow, which uses broadinstitute/oncotator:1.9.9.0?

  • dannykwells, San Francisco, Member ✭✭

    Hi @LeeTL1220 Thanks for the response. Not sure where to respond since things are moving back and forth.

    My other question: is there a column with PolyPhen output? I'd like to know whether particular missense mutations are predicted to be deleterious or inactivating, for example.

  • LeeTL1220, Arlington, MA, Member, Broadie, Dev ✭✭✭

    @dannykwells I believe the dbNSFP annotations include PolyPhen predictions for exome variants. Otherwise, no. If the relevant columns are not obvious from the column headers, just post here again.
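To locate those columns in practice, the header can be searched for names containing "polyphen", and missense rows can then be filtered on one of the matches. This is a rough sketch under stated assumptions: `columns_matching` and `rows_with_annotation` are hypothetical helpers, the input is assumed to be a tab-separated MAF with `#` comment lines, and the example row is invented. dbNSFP documents its PolyPhen-2 predictions as letter codes (D, P, B), but how the values appear in a given Oncotator output should be checked against the file itself.

```python
import csv
import io


def columns_matching(fieldnames, keyword):
    """Return the column names that contain keyword, ignoring case."""
    key = keyword.lower()
    return [name for name in fieldnames if key in name.lower()]


def rows_with_annotation(handle, column, classification="Missense_Mutation"):
    """Yield (gene, protein change, value) for rows of the given
    Variant_Classification that have a non-empty value in `column`."""
    # Assumption: lines starting with '#' are comments, not data.
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row.get("Variant_Classification") != classification:
            continue
        value = (row.get(column) or "").strip()
        if value:
            yield row.get("Hugo_Symbol", ""), row.get("Protein_Change", ""), value


if __name__ == "__main__":
    # Invented two-row example; replace with open("sample.maf").
    fake_maf = io.StringIO(
        "#version 2.4\n"
        "Hugo_Symbol\tVariant_Classification\tProtein_Change\t"
        "dbNSFP_Polyphen2_HDIV_pred\n"
        "GENE1\tMissense_Mutation\tp.A1B\tD\n"
        "GENE2\tSilent\tp.C2C\t\n"
    )
    for record in rows_with_annotation(fake_maf, "dbNSFP_Polyphen2_HDIV_pred"):
        print(record)
```

Against the header posted earlier in this thread, `columns_matching(header, "polyphen")` would pick up the six `dbNSFP_Polyphen2_HDIV_*` and `dbNSFP_Polyphen2_HVAR_*` columns.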
