Heads up:
We’re moving the GATK website, docs and forum to a new platform. Read the full story and breakdown of key changes on this blog.
Notice:
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!

Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Attention:
We will be out of the office for a Broad Institute event from Dec 10th to Dec 11th 2019. We will be back to monitor the GATK forum on Dec 12th 2019. In the meantime we encourage you to help out other community members with their queries.
Thank you for your patience!

Question about the alignment performed in the Haplotypecaller (pairHMM)

rwkrwk Member
edited December 2018 in Ask the GATK team
Hello.

After looking at the implementation of the pair-HMM algorithm used in the haplotype caller I was left with a question.
From the source code comments, it seems that local alignment is supposed to be used.
link to the comment on github : (note : the links are missing the 'h' of https because I am not allowed to post links for now).
* https://github.com/broadinstitute/gatk/blob/67f0f0f2e59185b721398b17c24eba487a2ac76c/src/main/java/org/broadinstitute/hellbender/utils/pairhmm/PairHMM.java#L23
* https://github.com/broadinstitute/gatk/blob/67f0f0f2e59185b721398b17c24eba487a2ac76c/src/main/java/org/broadinstitute/hellbender/utils/pairhmm/Log10PairHMM.java#L11

This cites Figure 4.3 in Durbin 1998 book. This is the figure of the Finite State Automaton (FSA) for local alignment but when reading the implementation (of the forward algorithm) it seems that the FSA of figure 4.2 is used instead. (The model with M, X, and Y).
* https://github.com/broadinstitute/gatk/blob/67f0f0f2e59185b721398b17c24eba487a2ac76c/src/main/java/org/broadinstitute/hellbender/utils/pairhmm/PairHMMModel.java

The forward algorithm used seems to be the one presented in section 4.2 of the book which implements Figure 4.2 (and not 4.3). This algorithm is for global alignment and not local alignment.

My question is : which alignment is actually computed ? (score of the best alignment) local or global ?

I cannot see the random model RX/RY states (and associated transition probabilities) present in the local alignment FSA of figure 4.3 in the implementation. I may simply have missed them, if so could you please point them out to me ?

I'd like to be sure, because the difference between the two FSAs is as important as the difference between running the Needleman–Wunsch and Smith-Waterman algorithm (global vs local).


Thank you for your insights.
Regards. Rick
Post edited by shlee on

Best Answer

Answers

  • SChaluvadiSChaluvadi Member, Broadie, Moderator admin

    @rwk
    I am not positive about the answer so I am going to confer with the team and get back to you!

  • rwkrwk Member
    Hello, thank you for your reply.
    The algorithm is clear to me, from what I read of the code it is effectively the right of figure 4.1 (or 4.2 without the start and end states for simplicity) that is used and not 4.3

    Therefore my concern, the comments in the source code clearly state that 4.3 is used and that it is local alignment, while the code in fact does global alignment.

    It makes more sense to do global alignment (sequence to sequence, like Needleman-Wunsch) and this is what the codes seems to do (does).

    Thank you for your answer.

    Issue · Github
    by shlee

    Issue Number
    5529
    State
    closed
    Last Updated
    Assignee
    Array
    Closed By
    droazen
  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi @rwk, I have filed a request for clarification at https://github.com/broadinstitute/gatk/issues/5529, but I see you have also issued a pull request on the same at https://github.com/broadinstitute/gatk/pull/5528. I have asked the developers to follow up. Thank you for your contributions.

  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    @rwk @shlee It seems to me like a very understandable confusion: it's local in the sense that reads are aligned to artificial haplotypes only within a small window defined by the assembly region, but within that small window it is, in fact, global alignment as you have both concluded.

    Also, we have docs: https://github.com/broadinstitute/gatk/blob/master/docs/pair_hmm.pdf.

Sign In or Register to comment.