The frontline support team will be unavailable to answer questions until May27th 2019. We will be back soon after. Thank you for your patience and we apologize for any inconvenience!

Question about the alignment performed in the Haplotypecaller (pairHMM)

rwkrwk Member
edited December 2018 in Ask the GATK team

After looking at the implementation of the pair-HMM algorithm used in the haplotype caller I was left with a question.
From the source code comments, it seems that local alignment is supposed to be used.
link to the comment on github : (note : the links are missing the 'h' of https because I am not allowed to post links for now).

This cites Figure 4.3 in Durbin 1998 book. This is the figure of the Finite State Automaton (FSA) for local alignment but when reading the implementation (of the forward algorithm) it seems that the FSA of figure 4.2 is used instead. (The model with M, X, and Y).

The forward algorithm used seems to be the one presented in section 4.2 of the book which implements Figure 4.2 (and not 4.3). This algorithm is for global alignment and not local alignment.

My question is : which alignment is actually computed ? (score of the best alignment) local or global ?

I cannot see the random model RX/RY states (and associated transition probabilities) present in the local alignment FSA of figure 4.3 in the implementation. I may simply have missed them, if so could you please point them out to me ?

I'd like to be sure, because the difference between the two FSAs is as important as the difference between running the Needleman–Wunsch and Smith-Waterman algorithm (global vs local).

Thank you for your insights.
Regards. Rick
Post edited by shlee on

Best Answer


  • SChaluvadiSChaluvadi Member, Broadie, Moderator admin

    I am not positive about the answer so I am going to confer with the team and get back to you!

  • rwkrwk Member
    Hello, thank you for your reply.
    The algorithm is clear to me, from what I read of the code it is effectively the right of figure 4.1 (or 4.2 without the start and end states for simplicity) that is used and not 4.3

    Therefore my concern, the comments in the source code clearly state that 4.3 is used and that it is local alignment, while the code in fact does global alignment.

    It makes more sense to do global alignment (sequence to sequence, like Needleman-Wunsch) and this is what the codes seems to do (does).

    Thank you for your answer.

    Issue · Github
    by shlee

    Issue Number
    Last Updated
  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi @rwk, I have filed a request for clarification at, but I see you have also issued a pull request on the same at I have asked the developers to follow up. Thank you for your contributions.

  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    @rwk @shlee It seems to me like a very understandable confusion: it's local in the sense that reads are aligned to artificial haplotypes only within a small window defined by the assembly region, but within that small window it is, in fact, global alignment as you have both concluded.

    Also, we have docs:

Sign In or Register to comment.