Feedback on approach to create a custom truth set for VQSR
I would like to ask you for feedback on my approach to construct a truth set, since there is no such resource for my species.
My current approach is to:
1/ call variants with GATK best practices by joint calling with
2/ call variants with another caller (
3/ Filter each set by retaining sites in which all samples have a depth of at least 10 (DP>=10) and a genotype quality of at least 30 (GQ>=30) in the
4/ Use retained sites common between both callers as truth set for VQSR
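To make steps 3 and 4 concrete, here is a minimal sketch in plain Python. It assumes both call sets are available as uncompressed VCF text with per-sample `DP` and `GQ` in the FORMAT column, and that both callers represent each variant with the same CHROM, POS, REF and ALT (for example after normalization), which may not hold in practice. The function names `passing_sites` and `truth_candidates` are made up for illustration and are not part of GATK or any other tool.

```python
from typing import Iterable, Set, Tuple

SiteKey = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def _to_int(value: str) -> int:
    """Parse a FORMAT value; missing ('.') or non-integer values become -1."""
    try:
        return int(value)
    except ValueError:
        return -1


def passing_sites(vcf_lines: Iterable[str],
                  min_dp: int = 10,
                  min_gq: int = 30) -> Set[SiteKey]:
    """Return the sites where every sample has DP >= min_dp and GQ >= min_gq."""
    kept: Set[SiteKey] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        samples = fields[9:]
        if not samples:
            continue  # no genotype columns, nothing to check
        chrom, pos, _id, ref, alt = fields[:5]
        fmt_keys = fields[8].split(":")
        if "DP" not in fmt_keys or "GQ" not in fmt_keys:
            continue  # cannot evaluate the thresholds at this site
        dp_i, gq_i = fmt_keys.index("DP"), fmt_keys.index("GQ")
        all_pass = True
        for sample in samples:
            values = sample.split(":")
            # Trailing FORMAT fields may be dropped; treat them as missing.
            dp = _to_int(values[dp_i]) if dp_i < len(values) else -1
            gq = _to_int(values[gq_i]) if gq_i < len(values) else -1
            if dp < min_dp or gq < min_gq:
                all_pass = False
                break
        if all_pass:
            kept.add((chrom, int(pos), ref, alt))
    return kept


def truth_candidates(caller_a_lines: Iterable[str],
                     caller_b_lines: Iterable[str]) -> Set[SiteKey]:
    """Sites that pass the filter in both call sets (step 4)."""
    return passing_sites(caller_a_lines) & passing_sites(caller_b_lines)
```

Matching on the exact (CHROM, POS, REF, ALT) tuple is the simplest choice here; multiallelic sites or differently left-aligned indels would need extra handling before the intersection.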
My reasoning is that sites called by two different algorithms and having DP>=10 in all samples of the cohort are very likely to be true variants, so their annotations can be used to learn what a good variant looks like.
I would like to know whether my reasoning makes sense to you and, if so, what you would suggest I change, add, or remove. For example, I am not fully convinced about retaining a site only when all samples meet the minimum GQ and DP: what if only one sample passes the condition?
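To illustrate the choice I am unsure about, here is a small sketch that compares the "all samples" rule with a looser "at least N samples" rule. The function `site_passes` and its `min_passing` parameter are hypothetical names for illustration only; each sample is represented as a (DP, GQ) pair.

```python
from typing import Optional, Sequence, Tuple


def site_passes(samples: Sequence[Tuple[int, int]],
                min_dp: int = 10,
                min_gq: int = 30,
                min_passing: Optional[int] = None) -> bool:
    """Decide whether a site is retained.

    samples: one (DP, GQ) pair per sample at this site.
    min_passing: how many samples must meet both thresholds;
                 None means every sample must pass.
    """
    if not samples:
        return False  # no genotypes, so nothing supports the site
    n_pass = sum(1 for dp, gq in samples if dp >= min_dp and gq >= min_gq)
    required = len(samples) if min_passing is None else min_passing
    return n_pass >= required


# Three samples, one of them with low depth (DP=8).
site = [(12, 40), (8, 40), (20, 35)]
print(site_passes(site))                 # strict rule: all samples must pass
print(site_passes(site, min_passing=1))  # loose rule: one passing sample is enough
```

With the strict rule a single low-depth sample removes the site, while the loose rule keeps it as soon as one sample is well supported.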
I greatly appreciate your feedback. Thanks in advance!