Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Filtering VCF files
I have used the UnifiedGenotyper to call variants on a set of ~2400 genes (TruSeq Illumina data) from 28 different samples mapped against a preliminary draft genome. I do not have a defined set of SNPs or INDELs to use in recalibration via VQSR.
While the raw VCF has plenty of QUAL scores that are very high, not a single call has a PASS associated with it in the Filter field- all are "." If I use SelectVaraints to filter the VCF based on high QUAL or DP values, or combination, the Filter field remains "." for the returned variants.
Am I doing something wrong, or is the raw file telling me that none of the variant calls are meaningful, in spite of their high QUAL values?
Is there a "best practices" way to go about filtering such a dataset when VQSR can't be employed? If so, I haven't found it.