How does MAPQ affect HaplotypeCaller during germline calling?

I understand that, during germline calling, the pair-HMM calculates the likelihood of each haplotype, taking base quality into consideration.

My question is: what role does MAPQ play during germline calling?

For example, in regions of low mappability, we may get alignments with multiple mismatches within a short window, or reads that align to multiple locations. HaplotypeCaller should take such low-MAPQ alignments into account, right? Can anyone share some insight on this?


  • UniCorn, US, Member

    Can anyone share some insights on this? According to the HC documentation, base quality is taken into consideration during the pair-HMM step. I suppose a region with multiple mismatches gets a low likelihood for every possible genotype (say AA, AC and CC at a given site). However, I didn't find a minimum likelihood threshold. So, after normalization, is the genotype with the highest likelihood still emitted? Or did I miss something that prevents this from happening?
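    To make the normalization concern concrete, here is a minimal sketch of Phred-scaled genotype likelihoods (PLs) normalized so that the best genotype sits at 0. It is not GATK's code: the function name and the input numbers are made up for illustration, and it only assumes the usual convention PL = -10 * log10(likelihood), shifted so that the minimum is 0.

```python
def normalized_pls(log10_likelihoods):
    """Convert raw log10 genotype likelihoods to Phred-scaled PLs.

    The best genotype is shifted to PL = 0, so only the differences
    between genotypes survive; the absolute scale is discarded.
    """
    best = max(log10_likelihoods.values())
    return {gt: round(-10 * (ll - best)) for gt, ll in log10_likelihoods.items()}


# Hypothetical values: every genotype fits the reads poorly in absolute terms.
raw = {"AA": -52.0, "AC": -50.0, "CC": -58.5}
pls = normalized_pls(raw)
print(pls)
```

    With these made-up inputs, AC ends up at PL 0 even though its raw likelihood is tiny, which is exactly the situation described above: after normalization, only the relative ranking of the genotypes is visible.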

  • Tiffany_at_Broad, Cambridge, MA; Member, Administrator, Broadie, Moderator

    Hi @UniCorn ,

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, see this announcement and check out our support policy.

  • UniCorn, US, Member

    Sorry about that. Actually, this is HaplotypeCaller-specific, so please allow me to rephrase my question. Also, please let me know if I should open this question in a new post.

    As far as I understand, HaplotypeCaller uses base quality during pair-HMM construction. I have a limited understanding of HMMs and am confused about how base quality is integrated into the HMM. My guess is that base quality affects either the initial sequence of hidden states or the emission probability of each allele. But again, I may be wrong.
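    For reference, the "emission probability" guess can be sketched as follows. This is a simplified illustration, not HaplotypeCaller's implementation: it assumes the commonly described parameterization in which a base quality Q gives an error probability 10^(-Q/10), a matching base is emitted with probability 1 - error, and a mismatching base with error / 3. Gaps, which a full pair-HMM handles through its insertion and deletion states, are ignored here, and all function names are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred-scaled base quality to an error probability."""
    return 10 ** (-q / 10)


def match_emission(read_base, hap_base, base_qual):
    """Emission probability of a read base aligned to a haplotype base.

    Assumption: an erroneous base is equally likely to be any of the
    three other nucleotides, hence the division by 3.
    """
    err = phred_to_error(base_qual)
    return 1 - err if read_base == hap_base else err / 3


def ungapped_log10_likelihood(read, quals, hap_window):
    """log10 P(read | haplotype) for a gap-free alignment (simplification)."""
    return sum(
        math.log10(match_emission(r, h, q))
        for r, h, q in zip(read, hap_window, quals)
    )


# Same single mismatch against the haplotype, at low vs. high base quality.
low_q = ungapped_log10_likelihood("ACGT", [30, 30, 10, 30], "ACCT")
high_q = ungapped_log10_likelihood("ACGT", [30, 30, 30, 30], "ACCT")
print(low_q, high_q)
```

    Under these assumptions, a mismatch at a low-quality base costs the haplotype much less likelihood than the same mismatch at a high-quality base, which is the sense in which base quality enters the model.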

