Funcotator: Functional annotation out of beta

jonn (admin; Member, Broadie, Moderator, Dev)
edited March 18 in Announcements

A production-ready tool to predict variant function

For the past year Funcotator has been a beta tool in GATK. With this new 4.1 release, Funcotator is (finally) out of beta and ready for use in a production environment. So... what exactly is Funcotator, and why should you care?

Funcotator is a functional annotator (FUNCtional annOTATOR) that reads in variants and adds useful information about their potential effects. Its uses range from answering the question ‘in which gene (if any) does this variant occur’ to predicting an amino acid change string for variants that result in different protein structures. Accurate functional annotation is critical to turning vast amounts of genomic data into a better understanding of protein function.

Funcotator was created to be a fast, functional, and accurate annotation tool that supports the hg38 genome, and many recent updates have made it more robust and more correct. Of particular note, the protein change string algorithm can now account for protein changes that do not occur at the same amino acid position as a variant (such as when deletions occur in short tandem repeats). If you have a set of variants and wish to identify the genes affected and/or the protein amino acid sequence change, or if you simply wish to cross-reference your variants with a list of variants thought to be implicated in disease, Funcotator is the tool for you.

We publish two sets of data sources to go with Funcotator (including Gencode, ClinVar, gnomAD, and more) so it can be used out of the box and with minimal effort to add annotations to either germline or somatic variants. Best of all, it can be updated by you, the user, to include your favorite annotation data sources when you annotate your VCF files (with some caveats).
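
As a rough sketch of what annotating a VCF against a downloaded data-source folder might look like, the snippet below assembles a Funcotator command line and prints it. The file names (reference.fasta, input.vcf, annotated.vcf) and the funcotator_dataSources/ folder are placeholders chosen for illustration, not files that ship with GATK, and the helper function funcotator_cmd is hypothetical. Check the tool documentation for the arguments supported by your GATK version before running anything.

```shell
# Illustrative sketch only: prints a Funcotator command line, does not run it.
# All file and folder names are placeholders (assumptions); use your own.
GATK="${GATK:-./gatk}"

funcotator_cmd() {
  # $1 = input VCF, $2 = output file, $3 = reference build (e.g. hg19 or hg38)
  echo "$GATK Funcotator" \
    "--variant $1" \
    "--reference reference.fasta" \
    "--ref-version $3" \
    "--data-sources-path funcotator_dataSources/" \
    "--output $2" \
    "--output-file-format VCF"
}

# Print the command for an hg38 call set.
funcotator_cmd input.vcf annotated.vcf hg38
```

Printing the command first makes it easy to review the paths and the reference build before committing to a long annotation run; the printed line can then be pasted into a terminal.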


“Fun” means improved user experience and data output

A huge number of bug fixes and accuracy improvements mean Funcotator's output is now much better and more correct than Oncotator's. As an example of improved user experience, the new FuncotatorDataSourceDownloader tool lets you download the annotation data sources directly from the command line. It is as simple as running ./gatk FuncotatorDataSourceDownloader --somatic to get the somatic data sources (though there are more options for the tool as well).
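
To make the download step a little more concrete, here is a minimal sketch that builds the downloader command for either data-source set and prints it. Only --somatic appears in the text above; --germline, --validate-integrity, and --extract-after-download are taken from the tool documentation as we understand it for the 4.1 release, so confirm them with the tool's --help output on your version. The downloader_cmd helper is a hypothetical name used only for this example.

```shell
# Illustrative sketch only: prints a FuncotatorDataSourceDownloader command.
# Flags other than --somatic are assumptions based on the tool documentation.
GATK="${GATK:-./gatk}"

downloader_cmd() {
  # $1 = which pre-packaged set to fetch: "somatic" or "germline"
  case "$1" in
    somatic|germline) ;;
    *) echo "expected 'somatic' or 'germline', got '$1'" >&2; return 1 ;;
  esac
  echo "$GATK FuncotatorDataSourceDownloader --$1 --validate-integrity --extract-after-download"
}

# Print the command for each of the two published data-source sets.
downloader_cmd somatic
downloader_cmd germline
```

Rejecting anything other than the two known set names keeps a typo from turning into an unrecognized flag on the real command line.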

“Funcotator” versus “Oncotator” - very different annotator tools

A savvy user may want to compare Funcotator to the Broad’s previous functional annotation tool Oncotator. Despite similar names and purpose, they are VERY different pieces of software and a direct comparison cannot really be made. Funcotator is not Oncotator. The forum post below details some of the differences between the two tools.

Future of “Fun”

There are many features on the horizon for Funcotator (in addition to normal support and bug fixes). In the long term, we would like to greatly increase performance with a Spark version of Funcotator. Adding even more supported data formats for data sources will offer users additional options to add in annotations. Since it is in active development, there are always small features being added and bugs being fixed.

Check for current progress on the GATK GitHub page here:
https://github.com/broadinstitute/gatk/labels/Funcotator

A forum post with a tutorial and some additional data can be found here:
https://gatkforums.broadinstitute.org/dsde/discussion/11193/funcotator-information-and-tutorial

The tool documentation for Funcotator can be found here:
https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_hellbender_tools_funcotator_Funcotator.php


Comments

  • jmedarom (Member)
    Hi,

    I am working with exome sequencing to find variants related to tuberculosis. What I understand from the tutorials is that Funcotator is used in the cancer field. So, I would like to clarify whether it is correct to use Funcotator in my case.

    Also, I have used Variant Effect Predictor (VEP) for functional annotation before, because I like how it presents the results. Do you think that it is a good idea to use VEP with GATK workflows instead of Funcotator? Would I be losing something important if I do?

    Thank you so much for your support,

    Jenn
  • jmedarom (Member)
    Hi @Sheila,

    I have been waiting for this answer for a long time. Could someone from the GATK team answer this post, please?

    Thank you for your help.
  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Hi @jmedarom, Sheila is no longer working here.

    Funcotator can be used for cancer but can also be used for many other applications that involve annotating variants based on external data sources. It should be appropriate for your use case as well.

    We developed Funcotator because there were some things we wanted to do that VEP could not do, so we naturally prefer to use Funcotator. However, we do not maintain a comparison list, so I'm not able to give you specifics. I recommend you do some testing to evaluate which tool produces the results you need in the format you prefer. Good luck!

  • joseph7e (New Hampshire; Member)
    Hello, I am working on constructing a data source from scratch. Are there any specific resources available for this? Specifically, I am trying to index a GTF file that was produced from RefSeq for the gencode directory, and I am getting many errors.