A way to come up with a "truth set" to use with VQSR
Dear GATK Team,
I have a question about finding cutoffs for hard filtering. I am working with yeast, for which we do not have a good set of known true variants. I am following the Best Practices and have done joint genotyping of my samples. For context, my samples are yeast clones isolated from a population at different time points. I was wondering whether I could select the subset of variants that are shared among more than 2 samples (and are thus more likely to be correct), use that subset as my "truth set", and run the VQSR pipeline instead of hard filtering. Is there anything obviously wrong with this approach?
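A minimal sketch of the selection described above, to make the idea concrete. It assumes an uncompressed, multi-sample VCF from joint genotyping, and it treats a sample as "sharing" a variant when its GT field contains at least one non-reference allele; that definition, the function names, and the `min_carriers` parameter are illustrative assumptions, not part of GATK. Whether the resulting subset is a sound training resource for VQSR is exactly the open question.

```python
def count_carriers(genotypes):
    """Count samples whose GT string has at least one non-reference allele.

    Assumption: '0' is the reference allele and '.' is a missing call.
    """
    carriers = 0
    for gt in genotypes:
        # Treat phased (|) and unphased (/) genotypes the same way.
        alleles = gt.replace("|", "/").split("/")
        if any(allele not in ("0", ".") for allele in alleles):
            carriers += 1
    return carriers


def select_shared_variants(vcf_lines, min_carriers=3):
    """Yield header lines plus records carried by at least min_carriers samples.

    min_carriers=3 corresponds to "shared among more than 2 samples".
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep the header so the output is still a VCF
            continue
        fields = line.rstrip("\n").split("\t")
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue  # no genotypes to count for this record
        gt_index = format_keys.index("GT")
        genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
        if count_carriers(genotypes) >= min_carriers:
            yield line


if __name__ == "__main__":
    # Tiny in-memory example with four hypothetical samples.
    example = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3\ts4\n",
        "chrI\t100\t.\tA\tG\t50\t.\t.\tGT\t0/1\t1/1\t0/1\t0/0\n",
        "chrI\t200\t.\tC\tT\t50\t.\t.\tGT\t0/1\t0/0\t./.\t0/0\n",
    ]
    for record in select_shared_variants(example):
        print(record, end="")
```

For a real callset the same functions could be fed an open file handle in place of the in-memory list, with the kept lines written to a new VCF.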