Holiday Notice:
The Frontline Support team will be offline February 18 for President's Day but will be back February 19th. Thank you for your patience as we get to all of your questions!

about pairHMM transition probabilities calculate

chengjunwenhhhchengjunwenhhh chinaMember
edited April 2017 in Ask the GATK team

i have a doubt about the pairHMM transition probabilities calculating in common_data_structure.h (167 line), as I known , the quality value is less than 128, MAX_QUAL is 254 , so why your codes judge about MAX_QUAL<maxQual ,is there any outlier that the quality will be large than 254

#define SET_MATCH_TO_MATCH_PROB(output, insQual, delQual)                       \
{                                                                               \
  int minQual = delQual;                                                        \
  int maxQual = insQual;                                                        \
  if (insQual <= delQual)                                                       \
  {                                                                             \
    minQual = insQual;                                                          \
    maxQual = delQual;                                                          \
  }                                                                             \
  (output) = (MAX_QUAL < maxQual) ?                                             \
  ((NUMBER)1.0) - ctx.POW(((NUMBER)10), ctx.approximateLog10SumLog10(((NUMBER)-0.1)*minQual, ((NUMBER)-0.1)*maxQual))       \
  : ctx.matchToMatchProb[((maxQual * (maxQual + 1)) >> 1) + minQual];           \
}
Tagged:

Issue · Github
by Sheila

Issue Number
1960
State
closed
Last Updated
Assignee
Array
Milestone
Array
Closed By
chandrans

Answers

  • vruanovruano Member ✭✭
    edited April 2017

    As you probably know, the typical BAM cannot have mapping quality scores larger than 255 as it uses a byte to store them. 255 (-1) is used in fact to indicate the lack of mapping quality per the SAM format spec. For base quals and indels they cannot be larger than 255 - 33. So yes, based on that you would not encounter any quality value larger that 254.

    However that is not to say that it is actually impossible that the PairHMM would ever be fed a larger quality score... for example an alternative alignment format/DB read-store whose author has decided that it would support 64-bits quality scores, ... or perhaps the qualities scores fed into the pair-hmm are result of applying some mathematical model or recalibration that could incidentally probe up those values.

    So I guess that the reason for including this inlined "if-else" is a safeguard in case that happens.

Sign In or Register to comment.