# How to use GATK for RNA-seq analysis?

Member Posts: 7

Hi all,
I see that none of the GATK workflows covers RNA-seq analysis.
I understand that GATK mainly focuses on variant calling. Can anyone tell me how to use GATK for RNA-seq analysis?

thanks
daniel

Hi Daniel,

We have indeed not yet formulated any Best Practices specific to calling variants from RNAseq data. The basic workflow should be the same as the generic Best Practices workflow, but some adaptations probably need to be made at specific steps. We do not have the expertise to identify these points, but we know that some of our users have used the GATK successfully on RNAseq data. Hopefully some of them will have the time and inclination to share their experience with you here.

Geraldine Van der Auwera, PhD

• Member Posts: 7
edited June 2013

Hi Geraldine:

Actually, I don't mean calling variants from RNAseq data.
We are surveying the tools available for RNAseq analysis, meaning gene expression analysis rather than variant calling, such as TopHat, or GenePattern, which the Broad Institute developed.

My question is how we can use GATK in a gene expression analysis workflow, which typically includes the following steps:

1. Align RNA-seq data to a reference genome
2. Estimate known gene and transcript expression
3. Perform differential expression analysis
4. Detect expressed gene fusions
5. Discover novel isoforms
6. Visualize and summarize the output of RNA-seq analyses

Thanks
daniel

Oh, I see -- then I'm afraid I have to disappoint you; we don't have any tools for expression analysis in the GATK at this time.

Geraldine Van der Auwera, PhD

• Member Posts: 7

I confused genome expression analysis with RNAseq analysis. Is it possible to use GATK for RNAseq analysis, like the RNAseq analysis functionality in GenePattern?

Thanks.
Daniel

• Member Posts: 16

I think you're missing the point of GATK. It's for analysing genomic data (i.e. sequencing data generated from DNA) and specifically for calling variants between genomes or exomes.

Performing gene expression (i.e. RNA transcripts) data analysis is completely beyond the scope of its main purpose.

• Member Posts: 7

Daniel

• Bioforsk Member Posts: 3

Do I understand correctly that SNP identification in de novo assemblies from RNAseq data would be feasible? I have a de novo transcriptome assembly generated by Trinity and would like to identify SNPs in the genotypes that this assembly is based on. Would GATK be suitable for this job?
Thanks.

Hi @JahnDavik,

The current GATK is not designed to handle RNAseq data, so while it can in theory be done to some degree, there are various pitfalls involved and we don't provide support for it. However, we have been working on new tools and methods to build in support for RNAseq data analysis, which we hope to make available to the public in the next release of GATK.

Geraldine Van der Auwera, PhD

• Bioforsk Member Posts: 3

When is the next release due?

We don't have a set schedule, but I think it'll be at least three to four weeks.

Geraldine Van der Auwera, PhD

• Shanghai Member Posts: 8

Is GATK not suitable for RNA-seq?

@HaoWang‌

Hi,

It is suitable for RNA-seq now. Please refer to the updated Best Practices page.

-Sheila

• Shanghai Member Posts: 8

Could you give me the link to that page, please? Thanks, Sheila!