Call Variants from RNA-seq for RNA editing detection

Dear GATK staff

I would like to use GATK to detect possible RNA editing events. I followed the RNA-seq Best Practices up to the variant calling step itself. There I hesitate to use HaplotypeCaller because I would not assume that editing sites follow any kind of allelic ratio. Would it therefore be better to use MuTect2 at this stage? I would call it like this:

java -jar GenomeAnalysisTK.jar -T MuTect2 -R reference.fasta -I:tumor normal1.bam -dontUseSoftClippedBases -stand_call_conf 20.0 -stand_emit_conf 20.0 --dbsnp dbSNP.vcf --artifact_detection_mode -o output.normal1.vcf
java -jar GenomeAnalysisTK.jar -T CombineVariants -R reference.fasta -V output.normal1.vcf -V output.normal2.vcf -minN 2 --setKey "null" --filteredAreUncalled --filteredrecordsmergetype KEEP_IF_ANY_UNFILTERED -o MuTect2_PON.vcf

Could you comment on whether this is a suitable modification of the Best Practices for calling RNA editing events?



  • Thank you for your response. I'll stick to HC.

  • arkanion, Singapore, Member
    edited May 2017

    Following up on this, is HC optimized to call RNA editing events? What are the drawbacks of using HC for RNA editing variants compared with DNA variants? I would appreciate it if you could clarify the key points I should be aware of, since I am not very familiar with the statistical model behind HC.

  • johnma, Member

    I think RNA editing is usually not identified with GATK tools alone. The role of the GATK tools is only to provide a set of positions to evaluate in downstream tools that are specifically designed for that purpose.

    However, if your downstream program requires a BAM and you use HC or MT, keep in mind that both HC and MT perform indel realignment. As a result, the original input BAM should not be used as the input for those downstream tools. There's a discussion on how to generate the appropriate bamout here.
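
A minimal sketch of how the "set of positions" mentioned above might be pulled out of a caller's VCF before handing it to a downstream editing tool. The function name `candidate_positions` and the choice to keep only unfiltered single-base substitutions are assumptions made for illustration; they are not part of GATK or of any of the tools discussed in this thread.

```python
def candidate_positions(vcf_lines):
    """Yield (chrom, pos, ref, alt) for unfiltered single-base substitutions.

    Assumes plain-text, tab-separated VCF records; header lines start with '#'.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and column header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        if filt not in ("PASS", "."):
            continue  # drop records that failed a filter
        for allele in alt.split(","):
            # keep SNV-like records only; indels and symbolic alleles are skipped
            if len(ref) == 1 and len(allele) == 1 and allele in "ACGT":
                yield chrom, int(pos), ref, allele
```

With a real file this could be used as `list(candidate_positions(open("output.vcf")))`, where `output.vcf` is a placeholder name.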

  • arkanion, Singapore, Member

    So is that "set of positions" good enough to start with? Since the statistical model behind HC is primarily designed to detect genomic variants rather than RNA editing events, how trustworthy are the variants it finds? If they are biased towards genomic variants from the beginning, it does not matter much which tools you use downstream.

  • fabian-naibaf, Vienna, Member
    edited June 2017

    Just a quick follow-up on this topic: when I first asked this question, I thought I would have to come up with my own dedicated RNA editing finder. Meanwhile, some tools have been published specifically for that purpose. Please consider one of the following:

    • Wang, Jinkai, et al. "rMATS-DVR: rMATS discovery of differential variants in RNA." Bioinformatics (2017).
      * this one uses the GATK toolbox

    • Picardi, Ernesto, et al. "Using REDItools to detect RNA editing events in NGS datasets." Current protocols in bioinformatics (2015): 12-12.

    • Kim, Min-su, Benjamin Hur, and Sun Kim. "RDDpred: a condition-specific RNA-editing prediction model from RNA-seq data." BMC genomics 17.1 (2016): 5.
      * the one that has performed best in my hands so far

  • johnma, Member

    @fabian-naibaf said:

    • Wang, Jinkai, et al. "rMATS-DVR: rMATS discovery of differential variants in RNA." Bioinformatics (2017).
      * this one uses the GATK toolbox

    Please note that this one uses UG (UnifiedGenotyper), which avoids the realignment issue I mentioned above.

    @arkanion: in my opinion those positions are reliable, since the RNA-seq Best Practices workflow detects the union of germline-caused and editing-caused variants. In theory MT may be a better choice than HC because RNA editing events do not occur at canonical allelic ratios, but that's a math issue that I'd rather leave to the people at Broad to answer.
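
To make the "canonical ratios" point concrete, here is a rough sketch that compares the observed alternate-allele fraction at a site with the 0.5 expected for a heterozygous germline variant, using an exact two-sided binomial test. This is only an illustration under stated assumptions: the function names and the 0.5 expectation are chosen for the example, and this is not the statistical model used by HC or MT.

```python
from math import comb

def allele_fraction(ref_reads, alt_reads):
    """Fraction of reads supporting the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total

def binomial_two_sided_p(alt_reads, depth, expected=0.5):
    """Exact two-sided binomial p-value for seeing alt_reads out of depth.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the assumed expected allele fraction.
    """
    pmf = [comb(depth, k) * expected**k * (1 - expected)**(depth - k)
           for k in range(depth + 1)]
    observed = pmf[alt_reads]
    # small tolerance so ties are not lost to floating-point rounding
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

# Hypothetical site: 90 reference reads and 10 edited reads
af = allele_fraction(90, 10)
p = binomial_two_sided_p(10, 100)
```

A site with a low allele fraction and a very small p-value is far from the 50/50 split expected of a heterozygous germline variant. Partial editing can produce exactly that pattern, which is why a caller that expects diploid ratios may handle such sites poorly.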

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @arkanion @johnma

    If you look in the Methods and Algorithms section, you will find a more detailed explanation of the statistics used in HaplotypeCaller.
