VQSR sample size for truth and raw variants

timh (brissy, AU) · Member

I am wondering about the following technical detail of VQSR:
Say I have a truth dataset in which an annotation (e.g. SOR) varies from a minimum value A to a maximum value B.
In my raw dataset, the true variants (i.e. all genuine variants in the dataset) would span a broader range of values than A->B simply because there are many more of them (larger sample size = broader observed range).
Now my question: is this taken into account by VQSR? E.g. if I VQSR-filter a dataset so that I retain 99% of the truth-set variants, would the retained variants be limited to the same min/max annotation values as the truth set, or are they allowed to show a broader distribution in line with their larger sample size? If not, one would lose the true variants at the extremes of the annotation distribution, a problem that would get worse with a smaller truth set and a larger raw dataset.
Hope that makes sense.
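
To illustrate the premise, here is a minimal sketch in Python. It says nothing about how VQSR works internally; it only shows the sample-size effect I am worried about. The standard normal "annotation", the sample sizes, and the function names are all made up for illustration, and the truth set is assumed to be a random subset of the true variants in the raw data.

```python
import random


def observed_range(values):
    """Return the (min, max) of a list of annotation values."""
    return min(values), max(values)


def simulate(n_raw=100_000, n_truth=1_000, seed=0):
    """Compare the annotation range of a small truth set with that of all
    true variants, when both come from the same underlying distribution.

    Assumptions (hypothetical, for illustration only):
    - the annotation follows a standard normal distribution;
    - the truth set is a random subset of the true variants.
    """
    rng = random.Random(seed)
    # Annotation values for every genuine variant in the raw dataset.
    raw = [rng.gauss(0.0, 1.0) for _ in range(n_raw)]
    # The truth set covers only a small subset of those variants.
    truth = rng.sample(raw, n_truth)

    lo, hi = observed_range(truth)
    # Genuine variants that a hard cut at the truth-set min/max would discard.
    outside = sum(1 for v in raw if v < lo or v > hi)
    return (lo, hi), observed_range(raw), outside / n_raw


truth_range, raw_range, frac_outside = simulate()
print("truth-set range:", truth_range)
print("raw true-variant range:", raw_range)
print("fraction of true variants outside the truth-set range:", frac_outside)
```

If filtering behaved like a hard cut at the truth-set min/max, the last number is the fraction of genuine variants that would be lost purely because the truth set is smaller.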


