Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

about pairHMM transition probabilities calculate

chengjunwenhhhchengjunwenhhh chinaMember
edited April 2017 in Ask the GATK team

i have a doubt about the pairHMM transition probabilities calculating in common_data_structure.h (167 line), as I known , the quality value is less than 128, MAX_QUAL is 254 , so why your codes judge about MAX_QUAL<maxQual ,is there any outlier that the quality will be large than 254

#define SET_MATCH_TO_MATCH_PROB(output, insQual, delQual)                       \
{                                                                               \
  int minQual = delQual;                                                        \
  int maxQual = insQual;                                                        \
  if (insQual <= delQual)                                                       \
  {                                                                             \
    minQual = insQual;                                                          \
    maxQual = delQual;                                                          \
  }                                                                             \
  (output) = (MAX_QUAL < maxQual) ?                                             \
  ((NUMBER)1.0) - ctx.POW(((NUMBER)10), ctx.approximateLog10SumLog10(((NUMBER)-0.1)*minQual, ((NUMBER)-0.1)*maxQual))       \
  : ctx.matchToMatchProb[((maxQual * (maxQual + 1)) >> 1) + minQual];           \
}
Tagged:

Issue · Github
by Sheila

Issue Number
1960
State
closed
Last Updated
Assignee
Array
Milestone
Array
Closed By
chandrans

Answers

  • vruanovruano Member ✭✭
    edited April 2017

    As you probably know, the typical BAM cannot have mapping quality scores larger than 255 as it uses a byte to store them. 255 (-1) is used in fact to indicate the lack of mapping quality per the SAM format spec. For base quals and indels they cannot be larger than 255 - 33. So yes, based on that you would not encounter any quality value larger that 254.

    However that is not to say that it is actually impossible that the PairHMM would ever be fed a larger quality score... for example an alternative alignment format/DB read-store whose author has decided that it would support 64-bits quality scores, ... or perhaps the qualities scores fed into the pair-hmm are result of applying some mathematical model or recalibration that could incidentally probe up those values.

    So I guess that the reason for including this inlined "if-else" is a safeguard in case that happens.

Sign In or Register to comment.