Calling Transcriptome SNPs with GATK

edge_diners Posts: 9 Member
edited January 2013 in Ask the team

Hi,

As far as I know, GATK works well for genome SNP calling. Does it work well for transcriptome SNP calling too? I can't find much information on transcriptome SNP calling.

If so, are the steps for transcriptome SNP calling with GATK the same as for genome SNP calling? E.g. aligning raw reads to the reference transcriptome, marking/removing PCR duplicates, local realignment around indels, and quality score recalibration (if a known-sites resource such as dbSNP is available).
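
For readers following along, the post-alignment steps listed above can be sketched as a list of command lines. This is only a minimal sketch of the usual genome workflow with GATK 3.x-style tools and Picard, not a validated transcriptome recipe: the function name, the jar locations, and the output file names are all hypothetical, and the input BAM is assumed to be coordinate-sorted and to carry read groups already. Nothing is executed; the function only builds the argument lists.

```python
def gatk_preprocessing_commands(ref, bam, known_sites=None, remove_duplicates=False,
                                gatk_jar="GenomeAnalysisTK.jar", picard_jar="picard.jar"):
    """Build (but do not run) the duplicate, realignment and recalibration commands.

    All file names derived from `bam` are illustrative placeholders.
    """
    stem = bam[:-4] if bam.endswith(".bam") else bam
    dedup = stem + ".dedup.bam"
    intervals = stem + ".intervals"
    realigned = stem + ".realigned.bam"
    recal_table = stem + ".recal.table"
    gatk = ["java", "-jar", gatk_jar]

    commands = [
        # Mark duplicates (or remove them when remove_duplicates=True) with Picard.
        ["java", "-jar", picard_jar, "MarkDuplicates",
         "INPUT=" + bam, "OUTPUT=" + dedup,
         "METRICS_FILE=" + stem + ".dup_metrics.txt",
         "REMOVE_DUPLICATES=" + ("true" if remove_duplicates else "false"),
         "CREATE_INDEX=true"],
        # Local realignment around indels: find target intervals, then realign.
        gatk + ["-T", "RealignerTargetCreator", "-R", ref, "-I", dedup,
                "-o", intervals],
        gatk + ["-T", "IndelRealigner", "-R", ref, "-I", dedup,
                "-targetIntervals", intervals, "-o", realigned],
    ]
    if known_sites:
        # Base quality score recalibration needs known variant sites (e.g. dbSNP).
        commands.append(gatk + ["-T", "BaseRecalibrator", "-R", ref, "-I", realigned,
                                "-knownSites", known_sites, "-o", recal_table])
        commands.append(gatk + ["-T", "PrintReads", "-R", ref, "-I", realigned,
                                "-BQSR", recal_table, "-o", stem + ".recal.bam"])
    return commands


if __name__ == "__main__":
    for cmd in gatk_preprocessing_commands("ref.fa", "sample.bam", known_sites="dbsnp.vcf"):
        print(" ".join(cmd))
```

Whether duplicates should be marked or removed at all for transcriptome data is exactly the open question in this thread, so the `remove_duplicates` switch is left to the reader.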

Thanks, and I look forward to hearing from you.

Answers

  • Geraldine_VdAuwera Posts: 5,268 Administrator, GSA Member

    Hi there,

    I've transferred your question to the Community section, as we don't have the experience to address it. Hopefully someone from our user community will be able to contribute their expertise on this topic...

    Geraldine Van der Auwera, PhD

  • edge_diners Posts: 9 Member

    Thanks, Geraldine. I really appreciate your advice. As far as I know, most research focuses on SNP calling in genome data sets. I just wonder whether GATK works well for transcriptome data sets too. Thanks, and I look forward to hearing from your community team.

  • edge_diners Posts: 9 Member

    Hi Geraldine,

    Do you have any idea whether we should remove or mark duplicates (using Picard) after BWA alignment if the data set is a transcriptome?

    I still haven't had any feedback from the community team :( Thanks for your advice.

  • Geraldine_VdAuwera Posts: 5,268 Administrator, GSA Member

    I'm sorry, but we really have no experience with this. If you can't find anyone to answer this for you, you'll need to try both ways and see how the resulting callsets differ.

    Geraldine Van der Auwera, PhD
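
One way to run the "try both ways" comparison suggested above is to count the sites shared by, and unique to, each callset. The following is a minimal sketch, assuming two plain-text (uncompressed) VCF files, one per run; the function names are hypothetical, and sites are matched naively on chromosome, position, REF and ALT without any normalization of indel representation.

```python
def load_sites(vcf_path):
    """Collect (chrom, pos, ref, alt) for every record in a plain-text VCF."""
    sites = set()
    with open(vcf_path) as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            # Multi-allelic records list ALT alleles separated by commas.
            for allele in alt.split(","):
                sites.add((chrom, int(pos), ref, allele))
    return sites


def compare_callsets(vcf_a, vcf_b):
    """Count the calls shared by both files and those unique to each."""
    a, b = load_sites(vcf_a), load_sites(vcf_b)
    return {"shared": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}
```

For example, `compare_callsets("marked.vcf", "removed.vcf")` (file names are placeholders) would summarize how much the duplicate-handling choice changes the calls.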

  • Geraldine_VdAuwera Posts: 5,268 Administrator, GSA Member

    Since the "Ask the Community" setup hasn't been working for this type of question, we're going to try a different approach. We are going to set up a Special Interest Group on using GATK with RNA-seq / transcriptome data. This will aim to concentrate expertise from users in the community who have experience with RNA-seq data. Stay tuned for details...

    Geraldine Van der Auwera, PhD

  • haojam Posts: 3 Member

    I have RNA-seq transcriptome data for 20 rice samples. I have aligned the reads with BWA and removed PCR duplicates using Picard. Could you please send me optimized filtering parameters for calling SNPs with GATK? I would greatly appreciate your kind support.

    With regards, Rocky Singh
