Could I run ASEReadCounter on homozygous SNPs?

sirian, US, Member
edited January 24 in Ask the GATK team

The ASEReadCounter documentation states that the tool is designed for heterozygous SNPs. Could I still use it to count ref and alt allele read depth at homozygous SNPs? My goal is to check consistency between RNA-Seq and DNA-Seq samples from the same individuals, in order to identify potential contamination in the RNA samples.

Answers

  • sirian, US, Member

    Any help? Thanks.

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @sirian
    Hi,

    The tool is indeed used for het sites. Is there a reason you cannot use the depth from HaplotypeCaller or other coverage tools for hom var sites?

    -Sheila

  • sirian, US, Member
    edited February 6

    Two reasons: (1) I want to use the "COUNT_FRAGMENTS_REQUIRE_SAME_BASE" option. I also tried bam-readcount, but it has no equivalent option. (2) I'm counting depth in RNA-Seq alignments at SNPs called from DNA-Seq samples. HaplotypeCaller is not used for this.

    When ASEReadCounter counts read depth, does it take into account whether the site is hom or het in any way? What would go wrong if I used it on hom sites?

    I compared the counts at a few hom sites using bam-readcount and ASEReadCounter. They are quite similar, so I assume the small differences come from how the two tools handle mapping and base quality scores, from the "COUNT_FRAGMENTS_REQUIRE_SAME_BASE" option that only ASEReadCounter has, and from whether downsampling is applied.

    So I wonder why the ASEReadCounter documentation emphasizes that it is for het sites only. I didn't see anything wrong with the counts at hom sites myself, but if I'm missing something important, please let me know. Thanks.
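For readers unfamiliar with the overlap handling mentioned in this post, here is a minimal sketch of the idea behind "COUNT_FRAGMENTS_REQUIRE_SAME_BASE": each fragment is counted once, and only if all of its reads covering the site agree on the base. This is an illustration in plain Python, not GATK's code; the function name, the input format (fragment name plus observed base), and the output keys are assumptions made for the example.

```python
from collections import defaultdict


def count_fragments_require_same_base(observations, ref, alt):
    """Count fragments at one site, one count per fragment.

    observations: iterable of (fragment_name, base) pairs, one per read
    covering the site (hypothetical input format for this sketch).
    """
    # Collect the set of bases each fragment shows at this site.
    bases_by_fragment = defaultdict(set)
    for fragment_name, base in observations:
        bases_by_fragment[fragment_name].add(base)

    counts = {"ref": 0, "alt": 0, "other": 0, "discordant_fragments": 0}
    for bases in bases_by_fragment.values():
        if len(bases) > 1:
            # Mates disagree at this position: the fragment is not counted.
            counts["discordant_fragments"] += 1
            continue
        (base,) = bases
        if base == ref:
            counts["ref"] += 1
        elif base == alt:
            counts["alt"] += 1
        else:
            counts["other"] += 1
    return counts


example = [
    ("frag1", "A"), ("frag1", "A"),  # overlapping mates agree: counted once
    ("frag2", "G"),                  # single read supporting alt
    ("frag3", "A"), ("frag3", "G"),  # overlapping mates disagree: dropped
    ("frag4", "T"),                  # neither ref nor alt
]
result = count_fragments_require_same_base(example, ref="A", alt="G")
print(result)
```

A per-read counter would report three "A" reads and two "G" reads for the same input, which is one reason counts from two tools can differ slightly at the same site.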

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @sirian
    Hi,

    The tool is meant to detect allele-specific expression differences. It cannot do that when the genotype contains only one allele. That is why it runs only on het sites.

    So, you want to determine the coverage at each of the variant sites (both het and hom var)? If so, perhaps you can use one of the other coverage analysis tools. There are a few tools in GATK4 that may interest you.

    If not, you will have to check outside of GATK to find some tools that do what you want.

    -Sheila
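Whichever tool produces the per-site counts, the consistency check described in the question could be sketched roughly as follows: at sites that are homozygous in the DNA-Seq genotypes, sum the RNA-Seq reads that support the unexpected allele and divide by total depth. This is a minimal sketch under stated assumptions; the record layout (`genotype`, `ref_count`, `alt_count`), the genotype labels, and the `min_depth` cutoff are hypothetical, and sequencing error, mapping bias, or RNA editing can also produce a nonzero value.

```python
def discordant_fraction(sites, min_depth=10):
    """Fraction of RNA-Seq reads that contradict a homozygous DNA genotype.

    sites: list of dicts with keys 'genotype' ('hom_ref', 'hom_alt', or
    anything else for sites to skip), 'ref_count', and 'alt_count'.
    Returns None if no site passes the depth filter.
    """
    unexpected = 0
    total = 0
    for site in sites:
        depth = site["ref_count"] + site["alt_count"]
        if depth < min_depth:
            continue  # too few reads to be informative
        if site["genotype"] == "hom_alt":
            unexpected += site["ref_count"]  # ref reads are unexpected here
        elif site["genotype"] == "hom_ref":
            unexpected += site["alt_count"]  # alt reads are unexpected here
        else:
            continue  # het or unknown genotype: not used for this check
        total += depth
    return unexpected / total if total else None


# Made-up counts, for illustration only.
sites = [
    {"genotype": "hom_alt", "ref_count": 0, "alt_count": 40},
    {"genotype": "hom_alt", "ref_count": 2, "alt_count": 38},
    {"genotype": "hom_ref", "ref_count": 20, "alt_count": 0},
    {"genotype": "het", "ref_count": 10, "alt_count": 10},   # skipped
    {"genotype": "hom_alt", "ref_count": 1, "alt_count": 3}, # below min_depth
]
fraction = discordant_fraction(sites)
print(fraction)
```

Comparing this fraction across samples from the same batch is likely more informative than any fixed threshold, since the background error rate depends on the library and aligner.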
