How to use GATK for RNA-seq analysis?

Hi all,
Among all the GATK workflows listed at
http://www.broadinstitute.org/gatk/guide/topic?name=methods-and-workflows
I can't find any for RNA-seq analysis.
I understand that GATK mainly focuses on variant calling. Can anyone tell me how to use GATK for RNA-seq analysis?

Thanks,
daniel

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi Daniel,

    We have indeed not yet formulated any best practices specific to calling variants from RNAseq data. The basic workflow should be the same as the generic Best Practices workflow, but there are probably some adaptations that need to be made at specific steps. We do not have the expertise to identify these steps, but we know that some of our users have used the GATK successfully on RNAseq data. Hopefully some of them will have the time and inclination to share their experience with you here.

  • danielyin (Member), edited June 2013

    Hi Geraldine,
    Thanks very much for your prompt reply.

    Actually, I didn't mean calling variants from RNAseq data.
    We are surveying the tools available for RNAseq analysis, by which I mean gene expression analysis rather than variant calling, such as TopHat or GenePattern (which the Broad Institute developed).

    My question is how we can use GATK in a gene expression analysis workflow, which typically includes the following steps:

    Align RNA-seq data to a reference genome
    Estimate known gene and transcript expression
    Perform differential expression analysis
    Detect expressed gene fusions
    Discover novel isoforms
    Visualize and summarize the output of RNA-seq analyses

    Thanks,
    daniel


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Oh, I see -- then I'm afraid I have to disappoint you; we don't have any tools for expression analysis in the GATK at this time.

  • Thanks for your reply!


  • I had confused gene expression analysis with RNAseq analysis. Is it possible to use GATK for RNAseq analysis, similar to the RNAseq analysis functions in GenePattern?

    Thanks.
    Daniel


  • drchriscole

    I think you're missing the point of GATK. It's for analysing genomic data (i.e. sequencing data generated from DNA), and specifically for calling variants between genomes or exomes.

    Performing gene expression (i.e. RNA transcript) data analysis is completely beyond the scope of its main purpose.

  • Thanks very much for your reply.
    Daniel

  • JahnDavik (Bioforsk; Member)

    Do I understand correctly that SNP identification in de novo assemblies from RNAseq data would be feasible? I have a de novo transcriptome assembly generated by Trinity and would like to identify SNPs in the genotypes that this assembly is based on. Would GATK be suitable for this job?
    Thanks.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi @JahnDavik,

    The current GATK is not designed to handle RNAseq data, so while it can in theory be done to some degree, there are various pitfalls involved and we don't provide support for it. However, we have been working on some new tools and methods to build in support for RNAseq data analysis, which we hope to make available to the public in the next release of GATK.

  • JahnDavik (Bioforsk; Member)

    OK. Thanks for the reply.
    When is the next release due?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    We don't have a set schedule, but I think it'll be at least three to four weeks.

  • HaoWang (Shanghai; Member)

    not suitable for RNA-seq??

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @HaoWang

    Hi,

    It is suitable for RNA-seq now. Please refer to the updated Best Practices page.

    -Sheila

  • HaoWang (Shanghai; Member)


    Could you give me the link, please? Thanks, Sheila!
