
[GATK 4.1.0.0] Funcotator issues.

Hi, I have questions about using Funcotator in GATK 4.1.0.0. My input VCF uses b37 (GRCh37) as its reference.

1) What are chr1_a_bed and chr1_b_bed in the resource bundle funcotator_dataSources.v1.6.20190124s.tar.gz?
2) Could you guide me on how to localize the gnomAD resources? They are configured to use Google Cloud, which unfortunately is not allowed in my computing environment.
3) Many fields in the Funcotator output are empty, so I feel I am missing something. Out of more than 13,000 variants, not a single one is annotated by Cosmic, SwissProt, GO, TCGAscape, DrugBank, CCLE, CGC, ClinVar, Familial Cancer Genes, HGNC, etc. Could you help me annotate the variants as fully as possible?

Thank you!

Answers

  • ivan108 SFMember

    I also see that most Funcotator fields are empty; only the Gencode fields are well populated. What am I missing?

    Thanks!

  • bshifaw Member, Broadie, Moderator admin

    Hi @dayzcool ,

    Have you already read through the Funcotator Information and Tutorial and the Funcotator Tool Docs? They should have all the info you're looking for, including where to get the resource data, but let me know if they don't.

  • dayzcool Member

    @bshifaw
    Thanks for pointing me to the documentation! Sorry, I should have been clearer. I have downloaded the resource bundle, funcotator_dataSources.v1.6.20190124s.tar.gz, and was able to annotate VCF files using Funcotator.
    What I couldn't figure out is why most resources in the bundle seemingly aren't used for annotating variants. For instance, variants are annotated by dbSNP, but not by COSMIC.
    I wonder why Funcotator can't annotate my VCF properly and how I can find the reason.
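
One way to narrow this down is to list the output columns that never receive a value. Below is a minimal Python sketch, assuming the MAF output is tab-separated with `#` comment lines before the header row. The function name `empty_columns`, the `placeholders` parameter, and the column `Example_Source_Field` are made up for illustration and are not part of Funcotator.

```python
import csv

def empty_columns(maf_text, placeholders=("",)):
    """Return the column names that hold no real value in any record.

    `placeholders` lists the strings to treat as "empty"; extend it if
    your output uses a marker string for unknown values (an assumption).
    """
    # Skip blank lines and the '#' comment lines that precede the header row.
    lines = [ln for ln in maf_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    filled = set()
    for row in reader:
        for name, value in row.items():
            if value is not None and value not in placeholders:
                filled.add(name)
    return [name for name in reader.fieldnames if name not in filled]

# Tiny made-up example; real files would be read with open(path).read().
SAMPLE = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tdbSNP_RS\tExample_Source_Field\n"
    "GENE_A\tIntron\trs0001\t\n"
    "GENE_B\tIGR\t\t\n"
)
print(empty_columns(SAMPLE))  # ['Example_Source_Field']
```

Columns that come back from this check map to data sources that produced no annotation at all, which helps separate "source not loaded" from "source loaded but no variant matched".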

  • bshifaw Member, Broadie, Moderator admin

    @dayzcool

    I'll need to refer this to the dev team. Mind sharing your Funcotator command?

  • dayzcool Member

    @bshifaw Thanks for your help. Here is the command:

        java -Dsamjdk.use_async_io_read_samtools=false \
             -Dsamjdk.use_async_io_write_samtools=true \
             -Dsamjdk.use_async_io_write_tribble=false \
             -Dsamjdk.compression_level=2 \
             -Xmx2000m \
             -jar /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/-1075985506/gatk.jar \
             Funcotator \
             --data-sources-path /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s \
             --ref-version hg19 \
             --output-file-format MAF \
             -R /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/400411378/Homo_sapiens_assembly19.fasta \
             -V /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/117600588/NA12878_sim_tumor-filtered.vcf.gz \
             -O NA12878_sim_tumor-filtered.vcf.gz.maf.annotated \
             --transcript-selection-mode CANONICAL \
             --annotation-default normal_barcode:NA12878 \
             --annotation-default tumor_barcode:NA12878,sim,tumor \
             --annotation-default Center:CPGM-CMH \
             --annotation-default source:Unknown \
             --remove-filtered-variants
    

    stderr:

    Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/tmp.98c8be50
    13:16:33.192 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/data/software/GATK/gatk-4.1.0.0/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    13:16:35.055 INFO  Funcotator - ------------------------------------------------------------
    13:16:35.057 INFO  Funcotator - The Genome Analysis Toolkit (GATK) v4.1.0.0
    13:16:35.057 INFO  Funcotator - For support and documentation go to https://software.broadinstitute.org/gatk/
    13:16:35.058 INFO  Funcotator - Executing as [email protected] on Linux v3.10.0-327.36.3.el7.x86_64 amd64
    13:16:35.059 INFO  Funcotator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_111-b15
    13:16:35.060 INFO  Funcotator - Start Date/Time: March 3, 2019 1:16:33 PM CST
    13:16:35.060 INFO  Funcotator - ------------------------------------------------------------
    13:16:35.060 INFO  Funcotator - ------------------------------------------------------------
    13:16:35.061 INFO  Funcotator - HTSJDK Version: 2.18.2
    13:16:35.061 INFO  Funcotator - Picard Version: 2.18.25
    13:16:35.061 INFO  Funcotator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    13:16:35.062 INFO  Funcotator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    13:16:35.062 INFO  Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    13:16:35.062 INFO  Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    13:16:35.062 INFO  Funcotator - Deflater: IntelDeflater
    13:16:35.063 INFO  Funcotator - Inflater: IntelInflater
    13:16:35.063 INFO  Funcotator - GCS max retries/reopens: 20
    13:16:35.063 INFO  Funcotator - Requester pays: disabled
    13:16:35.063 INFO  Funcotator - Initializing engine
    13:16:35.556 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/117600588/NA12878_sim_tumor-filtered.vcf.gz
    13:16:35.751 INFO  Funcotator - Done initializing engine
    13:16:35.752 INFO  Funcotator - Validating Sequence Dictionaries...
    13:16:35.753 INFO  Funcotator - Processing user transcripts/defaults/overrides...
    13:16:35.754 INFO  Funcotator - Initializing data sources...
    13:16:35.755 INFO  DataSourceUtils - Initializing data sources from directory: /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s
    13:16:35.759 INFO  DataSourceUtils - Data sources version: 1.6.2019124s
    13:16:35.760 INFO  DataSourceUtils - Data sources source: ftp://[email protected]/bundle/funcotator/funcotator_dataSources.v1.6.20190124s.tar.gz
    13:16:35.760 INFO  DataSourceUtils - Data sources alternate source: gs://broad-public-datasets/funcotator/funcotator_dataSources.v1.6.20190124s.tar.gz
    13:16:35.786 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/CancerGeneCensus_Table_1_full_2012-03-15.txt -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cancer_gene_census/hg19/CancerGeneCensus_Table_1_full_2012-03-15.txt
    13:16:35.794 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/cosmic_tissue.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cosmic_tissue/hg19/cosmic_tissue.tsv
    13:16:35.801 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/hgnc_download_Nov302017.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/hgnc/hg19/hgnc_download_Nov302017.tsv
    13:16:35.809 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/oreganno.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/oreganno/hg19/oreganno.tsv
    13:16:35.816 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/achilles_lineage_results.import.txt -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/achilles/hg19/achilles_lineage_results.import.txt
    13:16:35.824 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/clinvar_hgmd.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/clinvar/hg19/clinvar_hgmd.tsv
    13:16:35.831 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode_xrefseq_v75_37.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode_xrefseq/hg19/gencode_xrefseq_v75_37.tsv
    13:16:35.838 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/Familial_Cancer_Genes.no_dupes.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/familial/hg19/Familial_Cancer_Genes.no_dupes.tsv
    13:16:35.845 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode_xhgnc_v75_37.hg19.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode_xhgnc/hg19/gencode_xhgnc_v75_37.hg19.tsv
    13:16:35.853 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/simple_uniprot_Dec012014.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/simple_uniprot/hg19/simple_uniprot_Dec012014.tsv
    13:16:35.860 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/cosmic_fusion.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cosmic_fusion/hg19/cosmic_fusion.tsv
    13:16:35.868 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/Cosmic.db -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cosmic/hg19/Cosmic.db
    13:16:35.875 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/dnaRepairGenes.20180524T145835.csv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dna_repair_genes/hg19/dnaRepairGenes.20180524T145835.csv
    13:16:35.882 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/hg19_All_20170710.vcf.gz -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
    13:16:35.889 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode.v19.annotation.REORDERED.gtf -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode/hg19/gencode.v19.annotation.REORDERED.gtf
    13:16:35.891 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode.v19.pc_transcripts.fa -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode/hg19/gencode.v19.pc_transcripts.fa
    13:16:35.891 INFO  Funcotator - Finalizing data sources (this step can be long if data sources are cloud-based)...
    13:16:35.892 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/CancerGeneCensus_Table_1_full_2012-03-15.txt -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cancer_gene_census/hg19/CancerGeneCensus_Table_1_full_2012-03-15.txt
    13:16:35.901 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/cosmic_tissue.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cosmic_tissue/hg19/cosmic_tissue.tsv
    13:16:35.988 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/hgnc_download_Nov302017.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/hgnc/hg19/hgnc_download_Nov302017.tsv
    13:16:36.169 INFO  DataSourceUtils - Setting lookahead cache for data source: Oreganno : 100000
    13:16:36.175 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/oreganno.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/oreganno/hg19/oreganno.tsv
    13:16:36.200 INFO  FeatureManager - Using codec XsvLocatableTableCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/oreganno/hg19/oreganno.config
    13:16:36.283 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/oreganno.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/oreganno/hg19/oreganno.tsv
    13:16:36.283 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/oreganno.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/oreganno/hg19/oreganno.tsv
    WARNING 2019-03-03 13:16:36     AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
    13:16:36.288 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/achilles_lineage_results.import.txt -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/achilles/hg19/achilles_lineage_results.import.txt
    13:16:36.292 INFO  DataSourceUtils - Setting lookahead cache for data source: ClinVar : 100000
    13:16:36.294 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/clinvar_hgmd.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/clinvar/hg19/clinvar_hgmd.tsv
    13:16:36.321 INFO  FeatureManager - Using codec XsvLocatableTableCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/clinvar/hg19/clinvar_hgmd.config
    13:16:36.519 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/clinvar_hgmd.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/clinvar/hg19/clinvar_hgmd.tsv
    13:16:36.520 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/clinvar_hgmd.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/clinvar/hg19/clinvar_hgmd.tsv
    WARNING 2019-03-03 13:16:36     AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
    13:16:36.521 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode_xrefseq_v75_37.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode_xrefseq/hg19/gencode_xrefseq_v75_37.tsv
    13:16:36.645 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/Familial_Cancer_Genes.no_dupes.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/familial/hg19/Familial_Cancer_Genes.no_dupes.tsv
    13:16:36.648 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode_xhgnc_v75_37.hg19.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode_xhgnc/hg19/gencode_xhgnc_v75_37.hg19.tsv
    13:16:37.429 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/simple_uniprot_Dec012014.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/simple_uniprot/hg19/simple_uniprot_Dec012014.tsv
    13:16:37.560 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/cosmic_fusion.tsv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cosmic_fusion/hg19/cosmic_fusion.tsv
    13:16:37.565 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/Cosmic.db -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/cosmic/hg19/Cosmic.db
    13:16:37.712 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/dnaRepairGenes.20180524T145835.csv -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dna_repair_genes/hg19/dnaRepairGenes.20180524T145835.csv
    13:16:37.714 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/hg19_All_20170710.vcf.gz -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
    13:16:37.714 INFO  DataSourceUtils - Setting lookahead cache for data source: dbSNP : 100000
    13:16:37.741 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
    13:16:37.835 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/hg19_All_20170710.vcf.gz -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
    13:16:37.889 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
    13:16:37.889 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
    13:16:37.944 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode.v19.annotation.REORDERED.gtf -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode/hg19/gencode.v19.annotation.REORDERED.gtf
    13:16:37.944 INFO  DataSourceUtils - Setting lookahead cache for data source: Gencode : 100000
    13:16:37.982 INFO  FeatureManager - Using codec GencodeGtfCodec to read file file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode/hg19/gencode.v19.annotation.REORDERED.gtf
    13:16:37.992 INFO  DataSourceUtils - Resolved data source file path: file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/gencode.v19.pc_transcripts.fa -> file:///gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s/gencode/hg19/gencode.v19.pc_transcripts.fa
    13:16:42.038 INFO  Funcotator - Initializing Funcotator Engine...
    13:16:42.044 WARN  FuncotatorEngine - WARNING: You are using B37 as a reference.  Funcotator will convert your variants to GRCh37, and this will be fine in the vast majority of cases.  There MAY be some errors (e.g. in the Y chromosome, but possibly in other places as well) due to changes between the two references.
    13:16:42.044 INFO  Funcotator - Creating a MAF file for output: file:/gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/execution/NA12878_sim_tumor-filtered.vcf.gz.maf.annotated
    13:16:42.059 INFO  ProgressMeter - Starting traversal
    13:16:42.060 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
    13:18:04.095 INFO  ProgressMeter -        chr6:57550094              1.4                  1000            731.4
    13:19:31.446 INFO  ProgressMeter -      chr14:106539858              2.8                  2000            708.4
    13:20:18.985 INFO  ProgressMeter -      chr14:106539858              3.6                  2626            726.3
    13:20:18.985 INFO  ProgressMeter - Traversal complete. Processed 2626 total variants in 3.6 minutes.
    log4j:WARN No appenders could be found for logger (org.broadinstitute.hellbender.tools.funcotator.dataSources.vcf.VcfFuncotationFactory).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    13:20:18.987 INFO  Funcotator - Shutting down engine
    [March 3, 2019 1:20:18 PM CST] org.broadinstitute.hellbender.tools.funcotator.Funcotator done. Elapsed time: 3.76 minutes.
    Runtime.totalMemory()=2047344640
    Using GATK jar /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/-1075985506/gatk.jar defined in environment variable GATK_LOCAL_JAR
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx2000m -jar /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/-1075985506/gatk.jar Funcotator --data-sources-path /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/1004526586/funcotator_dataSources.v1.6.20190124s --ref-version hg19 --output-file-format MAF -R /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/400411378/Homo_sapiens_assembly19.fasta -V /gpfs/data/software/cromwell/log/cromwell-executions/Mutect2/a96866a5-e003-4732-a4ba-0ddef4455ff3/call-FuncotateMaf/inputs/117600588/NA12878_sim_tumor-filtered.vcf.gz -O NA12878_sim_tumor-filtered.vcf.gz.maf.annotated --transcript-selection-mode CANONICAL --annotation-default normal_barcode:NA12878 --annotation-default tumor_barcode:NA12878,sim,tumor --annotation-default Center:CPGM-CMH --annotation-default source:Unknown --remove-filtered-variants
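
The log above prints a "Resolved data source file path" line for every data source file it finds, so it can be condensed into a list of the sources Funcotator actually located. A minimal Python sketch follows, assuming the directory layout visible in the log (`funcotator_dataSources.<version>/<source>/<reference>/<file>`). The function name `resolved_sources` and the shortened paths in the sample are hypothetical.

```python
import re

# Captures <source> and <reference> from the resolved target path at the end of a log line.
PATTERN = re.compile(
    r"Resolved data source file path: \S+ -> "
    r"\S*/funcotator_dataSources[^/\s]*/([^/\s]+)/([^/\s]+)/[^/\s]+\s*$"
)

def resolved_sources(log_text):
    """Map each data source directory name to the reference folder it was resolved from."""
    found = {}
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            # Keep the first occurrence; the log repeats the same path several times.
            found.setdefault(match.group(1), match.group(2))
    return found

# Shortened, made-up paths that follow the same layout as the log above.
SAMPLE_LOG = """\
13:16:35.786 INFO  DataSourceUtils - Resolved data source file path: file:///work/execution/Cosmic.db -> file:///work/inputs/funcotator_dataSources.v1.6.20190124s/cosmic/hg19/Cosmic.db
13:16:35.882 INFO  DataSourceUtils - Resolved data source file path: file:///work/execution/hg19_All_20170710.vcf.gz -> file:///work/inputs/funcotator_dataSources.v1.6.20190124s/dbsnp/hg19/hg19_All_20170710.vcf.gz
13:16:36.169 INFO  DataSourceUtils - Setting lookahead cache for data source: Oreganno : 100000
"""
print(resolved_sources(SAMPLE_LOG))  # {'cosmic': 'hg19', 'dbsnp': 'hg19'}
```

Comparing this list against the folders in the bundle shows whether a source is missing from the run entirely or was found but contributed no annotations.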
    
  • dayzcool Member
    edited March 5

    @bshifaw Thank you for your help!
    Funcotator does annotate variants with those classifications (e.g. INTRON), and some data sources look fine.
    FYI, here's a table of the variant classifications annotated by Funcotator.

    Variant Classification   Freq
    3'UTR                     105
    5'Flank                   622
    5'UTR                      10
    Frame_Shift_Del             4
    Frame_Shift_Ins             2
    IGR                      5848
    In_Frame_Ins                1
    Intron                   5626
    Missense_Mutation          30
    Nonsense_Mutation           2
    RNA                      1027
    Silent                     12
    Splice_Site                11
    Below is the number of unique values for each annotation that is populated.

    Annotation               # of unique values
    Hugo_Symbol              4609
    Variant_Classification     13
    Variant_Type                6
    dbSNP_RS                 1965
    Annotation_Transcript    4795
    Transcript_Exon            34
    Transcript_Position        54
    cDNA_Change              6712
    Codon_Change               61
    Protein_Change             52
    Other_Transcripts        3849
    OREGANNO_ID               874
    OREGANNO_Values           825
    dbSNP_CAF                 283
    dbSNP_GENEINFO            701
    dbSNP_TOPMED              126
    dbSNP_ID                 1965
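
For anyone who wants to produce the same kind of summary for their own output, here is a minimal Python sketch. It is not necessarily how the tables above were made. It assumes a tab-separated MAF with `#` comment lines before the header row, and the function name `summarize_maf` and the sample records are made up.

```python
import csv
from collections import Counter

def summarize_maf(maf_text, column="Variant_Classification"):
    """Return (frequency of values in `column`, number of unique non-empty values per column)."""
    # Skip blank lines and the '#' comment lines that precede the header row.
    lines = [ln for ln in maf_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    freq = Counter()
    unique = {}
    for row in reader:
        freq[row.get(column) or ""] += 1
        for name, value in row.items():
            # Ignore overflow fields (name is None) and empty values.
            if name is not None and value:
                unique.setdefault(name, set()).add(value)
    return freq, {name: len(values) for name, values in unique.items()}

# Made-up records for illustration only.
SAMPLE = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tdbSNP_RS\n"
    "GENE_A\tIntron\trs0001\n"
    "GENE_B\tIGR\t\n"
    "GENE_A\tIntron\t\n"
)
freq, unique_counts = summarize_maf(SAMPLE)
for value, count in sorted(freq.items()):
    print(value, count)
print(unique_counts)
```

Columns that are empty in every record do not appear in the second result at all, which matches the way the table above lists only the populated annotations.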
  • bshifaw Member, Broadie, Moderator admin

    Would you also send the VCF header and two or three variants in a file that we can take a look at? Variants with Variant_Classification = Nonsense_Mutation, Missense_Mutation, In_Frame_Ins, Frame_Shift_Ins, or Frame_Shift_Del would be best.

  • dayzcool Member

    @bshifaw Here's the MAF output of another sample that is easier to share. It has three variants annotated as Missense_Mutation. Please let me know if it doesn't help.

  • dayzcool Member

    @bshifaw Thank you for the quick help! I tried the master branch and it works a whole lot better. I still don't get UniProt or ClinVar at all for a test sample, though I didn't test it properly. I just found it odd seeing thousands of SwissProt yet no UniProt.

  • bshifaw Member, Broadie, Moderator admin

    Hi @dayzcool ,

    If you tested the same data, could you share your output again so that we can compare?

  • dayzcool Member

    Attached is the output of the Funcotator built from master.
    Thanks for your help, @bshifaw.
