

MethylationTypeCaller for analyzing Bisulphite sequencing data

System (Administrator)
This discussion was created from comments split from: New to the forum? Ask your questions here!.


  • guoxy, boston (Member)


    I am trying to use MethylationTypeCaller to analyze bisulphite sequencing data. How is UNCONVERTED_BASE_COV calculated in the output VCF file? Also, is the context written in the conventional CpG/CHG/CHH format, meaning the first C is the methylated cytosine? Finally, REF is not limited to C. Why is that? Does it mean that methylation and bisulphite conversion also happen on other bases?


  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)
    edited June 10

    Hi @guoxy

    This is a separate question from the thread you posted in, so I have moved it to a new thread. Please start a new thread for each new question whenever possible.
    I am looking into your question and will get back to you shortly.

  • bcarlin, Broad Institute (Member)
    This tool mirrors the logic used to create ALLC files. Please see here for more information....

    For each site, UNCONVERTED_BASE_COV is calculated by counting the number of bases that were not converted by the bisulphite treatment, that is, the bases that still match the reference and therefore support methylation. For example, if the reference base is a C, it is the number of bases that are C's, and if the reference base is a G, it is the number of bases that are G's.
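
    The counting rule above can be sketched in a few lines of Python. This is only an illustration of the description in this post, not the tool's actual code: the function name and the plain string of pileup bases are hypothetical, and any read filtering, base-quality or strand handling that the tool may apply is ignored.

```python
def unconverted_base_cov(ref_base, pileup_bases):
    """Count pileup bases that still match a C or G reference base.

    Hypothetical helper: `pileup_bases` is simply the read bases observed
    at one site, e.g. "CCTTC". Returns None for sites whose reference base
    is neither C nor G (assumed here not to be methylation sites).
    """
    ref = ref_base.upper()
    if ref not in ("C", "G"):
        return None
    # A base equal to the reference was not converted by bisulphite.
    return sum(1 for base in pileup_bases if base.upper() == ref)


if __name__ == "__main__":
    print(unconverted_base_cov("C", "CCTTC"))  # reference C: count the C's
    print(unconverted_base_cov("G", "GAAG"))   # reference G: count the G's
```

    Under these assumptions, a reference C covered by the bases CCTTC gives a count of 3, and a reference G covered by GAAG gives a count of 2.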

    The context is written from the reference base and its 2 neighbouring reference bases, so the first C in the context is the cytosine being assessed. If the reference base is a C, the context is that base followed by the next 2 reference bases. If the reference base is a G, the cytosine lies on the opposite strand, so the context is the reverse complement of that base and its 2 adjacent reference bases, which again reads 5' to 3' starting at the cytosine.
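
    As a rough sketch of the context rule, the Python below builds the 3-base context for a C or G site and labels it CG, CHG or CHH. It is not the tool's implementation. The function names are hypothetical, positions are 0-based, and it assumes that for a reference G the two bases used are the ones immediately before it on the forward strand (the ones that follow the cytosine on the opposite strand).

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def methylation_context(reference, pos):
    """Return the 3-base context for the C or G at 0-based `pos`, else None."""
    reference = reference.upper()
    base = reference[pos]
    if base == "C":
        # Forward strand: the C plus the next two reference bases.
        window = reference[pos:pos + 3]
    elif base == "G":
        # Opposite strand (assumption): reverse-complement the G and the
        # two bases before it, so the context starts at the cytosine.
        if pos < 2:
            return None
        window = reverse_complement(reference[pos - 2:pos + 1])
    else:
        return None
    return window if len(window) == 3 else None


def classify(context):
    """Label a 3-base context as CG, CHG or CHH (H = A, C or T)."""
    if "N" in context:
        return "unknown"
    if context[1] == "G":
        return "CG"
    if context[2] == "G":
        return "CHG"
    return "CHH"


if __name__ == "__main__":
    ref = "ACGTTCAGG"
    for i in range(len(ref)):
        ctx = methylation_context(ref, i)
        if ctx is not None:
            print(i, ref[i], ctx, classify(ctx))
```

    With this sketch, a G site always yields a context that begins with C, which is why the first base of the context is the cytosine regardless of whether REF is C or G.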

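    As for the reference bases, methylation is reported at both G's and C's (a G corresponds to a C on the opposite strand). Bisulphite conversion shows up as G's being converted to A's and C's being converted to T's, respectively.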