How to calculate threshold for normal_LOD
I am trying to understand the thresholds MuTect uses to call variants in the tumor and normal samples. I read the Cibulskis et al. paper, which helped me understand how the tumor threshold was derived, but I am still struggling to understand the cut-off for detecting germline variants.
As I understand it, the two models MuTect considers for the normal sample are:
(1) A variant (either in dbSNP or not) exists at the site in the germline, or
(2) No variant exists in the germline, and all non-reference alleles are sequencing errors.
For model (1), the prior probability of a germline variant at the site is the rate of SNPs in a typical genome. The paper says that SNPs occur at ~1000/Mb; of these, 5% (50/Mb) are not in dbSNP, and the remaining 950/Mb are in dbSNP.
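To show how I turn those rates into per-site priors, here is a minimal sketch. The function name `per_site_priors` and the assumption of 1,000,000 sites per Mb are mine, not from the paper or MuTect, so please correct me if this conversion is itself the mistake.

```python
# Sketch only: converts the per-megabase SNP rates quoted above into
# per-site prior probabilities. Names here are my own, not MuTect's.
SITES_PER_MB = 1_000_000  # assumption: 1 Mb = 1,000,000 sites


def per_site_priors(snp_rate_per_mb, non_dbsnp_fraction):
    """Return per-site priors for each category of site."""
    p_variant = snp_rate_per_mb / SITES_PER_MB
    p_non_dbsnp = p_variant * non_dbsnp_fraction
    p_dbsnp = p_variant - p_non_dbsnp
    return {
        "germline_non_dbsnp": p_non_dbsnp,
        "germline_dbsnp": p_dbsnp,
        "no_variant": 1.0 - p_variant,
    }


priors = per_site_priors(snp_rate_per_mb=1000.0, non_dbsnp_fraction=0.05)
for name, value in priors.items():
    print(f"{name}: {value:.6g}")
```

This gives roughly 0.00005 for a non-dbSNP germline variant, 0.00095 for a dbSNP germline variant, and 0.999 for no variant.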
The default normal_LOD threshold MuTect uses to call a site non-variant in the normal sample is 2.2 (when the alternate allele is at a non-dbSNP site). However, I get a negative value when I plug the following into the expression for normal_LOD: 0.00005 for P(germline variant, non-dbSNP site), 0.999 for P(no variant), and 10 for delta_N (the factor by which the posterior probability of no variant must exceed that of a germline variant):
log10(delta_N) - log10( P(no variant) / P(germline variant) )
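For reference, this is the arithmetic I am doing, written out as a small sketch. The function name `normal_lod_threshold` is my own, and the expression is the one I wrote above, which may not be the one MuTect actually uses.

```python
import math


def normal_lod_threshold(delta_n, p_no_variant, p_germline):
    """Evaluate my expression: log10(delta_N) - log10(P(no variant) / P(germline variant))."""
    return math.log10(delta_n) - math.log10(p_no_variant / p_germline)


# Values from my question: delta_N = 10, P(no variant) = 0.999,
# P(germline variant, non-dbSNP site) = 0.00005.
theta_n = normal_lod_threshold(delta_n=10, p_no_variant=0.999, p_germline=0.00005)
print(round(theta_n, 2))  # -3.3
```

So I end up with about -3.3 rather than the documented 2.2.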
Can you explain where I'm going wrong?
Thanks a lot.