Bootstrapping high confidence variants for VQSR
I was wondering about the current Best Practices recommendation for refining variant calls made by the HaplotypeCaller when no prior known variants are available (i.e. in non-model species). I can see that for base recalibration you recommend bootstrapping a set of high-confidence variants: first do an initial round of SNP calling on the original, unrecalibrated data, then use a high-confidence subset of the called SNPs as the "known SNPs" for the base recalibration step.
Do you recommend a similar approach for variant recalibration? I have seen some people implement it, but I can't find any mention of this option in your description of VQSR in the current Best Practices. Does its absence there mean that you recommend simply hard-filtering the called variants when no database of known variants is available, or would you suggest that it may be worthwhile to bootstrap a set of "known variants" for the VQSR step as well?
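To make concrete what I mean by taking a "high confidence subset" of an initial call set, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function name `select_high_confidence_snps` and the `min_qual=100.0` cutoff are hypothetical and not a GATK recommendation, and it only looks at the QUAL column of plain-text VCF records rather than the annotations one would probably want to filter on in practice.

```python
def select_high_confidence_snps(vcf_lines, min_qual=100.0):
    """Return header lines plus biallelic SNP records with QUAL >= min_qual.

    Assumption: min_qual is an arbitrary illustrative cutoff, not a
    recommended value.
    """
    kept = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Keep meta-information and column header lines unchanged.
            kept.append(line)
            continue
        fields = line.split("\t")
        ref, alt, qual = fields[3], fields[4], fields[5]
        if qual == ".":
            # Missing QUAL: cannot judge confidence, so skip the record.
            continue
        # Biallelic SNP: single-base REF and a single single-base ALT allele.
        is_snp = len(ref) == 1 and len(alt) == 1 and alt in "ACGT"
        if is_snp and float(qual) >= min_qual:
            kept.append(line)
    return kept


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t250.0\t.\tDP=40",
    "chr1\t200\t.\tC\tT\t35.5\t.\tDP=12",
    "chr1\t300\t.\tG\tGA\t900.0\t.\tDP=55",
]
result = select_high_confidence_snps(example)
```

With these made-up records, only the SNP at position 100 survives: the one at position 200 falls below the cutoff and the one at position 300 is an insertion. My question is whether a subset obtained along these lines would be suitable as a training resource for VQSR.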
Thanks very much for any advice you can share.