A way to come up with a "truth set" for use with VQSR
Dear GATK Team,
I have a question about finding cutoffs for hard filtering. I am working with yeast, for which we do not have a good set of known true variants. I am following the Best Practices and have joint-genotyped my samples. For context, my samples are yeast clones isolated from a population at different time points. I was wondering whether I could select the subset of variants shared by more than two samples (and thus more likely to be correct) to use as my "truth set", and then use the VQSR pipeline instead of hard filtering. Is there anything obviously wrong with this approach?
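To make the selection concrete, here is a minimal sketch of the filtering step I have in mind. It is plain Python rather than a GATK tool, and the function names (`count_carriers`, `select_shared_variants`), the `min_carriers` parameter, and the example records are all made up for illustration. It assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT key, and it counts a sample as carrying a variant if its genotype contains any non-reference allele.

```python
def count_carriers(sample_fields):
    """Count samples whose GT holds at least one non-reference allele."""
    carriers = 0
    for sample_field in sample_fields:
        # GT is assumed to be the first colon-separated FORMAT value.
        gt = sample_field.split(":")[0]
        # Treat phased (|) and unphased (/) genotypes alike; haploid
        # calls such as "1" simply yield a single allele.
        alleles = gt.replace("|", "/").split("/")
        if any(allele not in ("0", ".") for allele in alleles):
            carriers += 1
    return carriers


def select_shared_variants(vcf_lines, min_carriers=3):
    """Yield header lines plus records carried by >= min_carriers samples.

    min_carriers=3 corresponds to "shared by more than two samples".
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        # Columns 0-8 are the fixed VCF columns; samples start at index 9.
        if count_carriers(fields[9:]) >= min_carriers:
            yield line


# Made-up records for illustration only (four hypothetical samples).
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
    "chrI\t100\t.\tA\tG\t50\t.\t.\tGT\t1\t1\t1\t0",
    "chrI\t200\t.\tC\tT\t50\t.\t.\tGT\t1\t0\t0\t.",
    "chrI\t300\t.\tG\tA\t50\t.\t.\tGT:DP\t0/1:12\t1/1:9\t0|1:15\t0/0:11",
]

if __name__ == "__main__":
    for record in select_shared_variants(EXAMPLE_VCF):
        print(record)
```

The sketch only performs the selection; whether the resulting subset is suitable as a training resource for VQSR is exactly what I am asking about.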