BQSR for RNA-seq


I am performing BQSR on RNA-seq data for the purpose of SNP calling, and I have two questions:

  1. My organism does not have a known set of SNPs. I asked about this on the forum before and, following that advice, I am using a filtered set of SNPs as the knownSites input for BQSR. After filtering, some SNPs carry the 'SNPcluster' tag. Should I use this file as is, or should I filter it further so that only the PASS SNPs are retained?

  2. I performed SNP calling on only a subset of my BAM file because I was interested in certain chromosomes. Would it be fine to use only this subset of SNPs and recalibrate only that subset of the BAM file rather than the entire BAM? I ask in case this could bias the final results.

Thanks in advance for your help!
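
For question 1, if a PASS-only known-sites file is wanted, the filtering step itself is small. Below is a minimal sketch in plain Python, assuming an uncompressed VCF in which FILTER is the seventh tab-separated column (as in the VCF specification). The function name and the example records are hypothetical, and the sketch does not settle whether PASS-only is the right choice for BQSR.

```python
def keep_pass_records(vcf_lines):
    """Yield header lines plus records whose FILTER field is exactly PASS.

    Records tagged with any filter name (e.g. SNPcluster) are dropped, and so
    are records whose FILTER is "." (meaning no filters were applied).
    """
    for line in vcf_lines:
        if line.startswith("#"):
            # Meta-information and column header lines are kept unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t102\t.\tC\tT\t45\tSNPcluster\t.\n",
]
kept = list(keep_pass_records(example))
print("".join(kept), end="")
```

On a real file, the same generator can be fed an open file handle and its output written with `writelines` to a new VCF.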


Best Answer


  • swong (Phoenix, AZ), Member

    Hello, I am also performing BQSR on RNA-seq data for SNP calling. In the BQSR tutorial, two passes to analyze covariation are performed, generating a recal_data.table and a post_recal_data.table. Which one should I use to apply the recalibration to my RNA-seq data? I am assuming I should use the post_recal_data.table. Is that correct? Thank you so much!

  • swong (Phoenix, AZ), Member

    Actually, let me be more specific. Is the post_recal_data.table only used for generating the before-and-after plots? The BQSR tutorial does not mention it again. Thanks!

  • Geraldine_VdAuwera (Cambridge, MA), Member, Administrator, Broadie admin
    edited June 2015

    @swong That's right, the post recal table is only for plotting/QC purposes. The first recal table is what you should use to do the recalibration.
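
To make the roles of the two tables concrete, here is a minimal sketch that only assembles the command lines and prints them, without running anything. It assumes GATK 3.x-style tools and flags (BaseRecalibrator, AnalyzeCovariates, and PrintReads with -BQSR), as in the tutorial discussed above; GATK4 applies the table with a different tool (ApplyBQSR) and different flags. The jar name, the input and output file names, and the helper function are placeholders, not part of the original thread.

```python
import shlex

# Assumed GATK 3.x invocation; adjust the jar path for your installation.
GATK = ["java", "-jar", "GenomeAnalysisTK.jar"]


def bqsr_commands(ref, bam, known_sites):
    """Build (but do not run) the four command lines of the two-pass workflow."""
    first_table = "recal_data.table"        # used to recalibrate the reads
    second_table = "post_recal_data.table"  # used only for plotting/QC
    base = GATK + ["-T", "BaseRecalibrator", "-R", ref, "-I", bam,
                   "-knownSites", known_sites]
    return {
        # Pass 1: model the covariation in the original base qualities.
        "first_pass": base + ["-o", first_table],
        # Pass 2: model what remains after applying the first table on the fly.
        "second_pass": base + ["-BQSR", first_table, "-o", second_table],
        # QC: before/after plots compare the two tables.
        "plots": GATK + ["-T", "AnalyzeCovariates", "-R", ref,
                         "-before", first_table, "-after", second_table,
                         "-plots", "recalibration_plots.pdf"],
        # Apply: the FIRST table is the one passed to the recalibration step.
        "apply": GATK + ["-T", "PrintReads", "-R", ref, "-I", bam,
                         "-BQSR", first_table, "-o", "recal_reads.bam"],
    }


commands = bqsr_commands("reference.fasta", "input.bam", "filtered_snps.vcf")
for step, argv in commands.items():
    print(step + ": " + shlex.join(argv))
```

The post-recalibration table appears only in the plotting command, which matches the answer above.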
