Most Variants Called Have Quality by Depth (QD) <2.0 in VCF Files
I have to hard-filter my VCF files. Before filtering, ~99% of my variants have a QD of <2.0. When I plot the distributions in ggplot, they do not follow the pattern shown in http://gatkforums.broadinstitute.org/gatk/discussion/6925/understanding-and-adapting-the-generic-hard-filtering-recommendations. I have 24 samples, and their distributions are not at all similar.
The other annotations I checked, FS and MQ, are both within the recommendations. I wasn't sure where to find the values for ReadPosRankSum and MQRankSum, so I couldn't plot those.
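For reference, per-site annotations in a VCF are stored as semicolon-separated `key=value` pairs in the INFO column (column 8), so QD, FS, MQ, MQRankSum and ReadPosRankSum can all be pulled from there when the caller emitted them. Below is a minimal sketch of one way to collect them for plotting. The function names `parse_info` and `collect_annotations` are hypothetical helpers written for this illustration, not part of GATK, and the sketch assumes an uncompressed, tab-delimited VCF read line by line. Records that lack a given key are simply skipped for that key, which is worth checking if the rank-sum annotations appear to be missing.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'QD=1.5;FS=0.0;DB' into a dict.

    Flag-style entries without '=' (for example 'DB') are stored as True.
    """
    annotations = {}
    for entry in info_field.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        annotations[key] = value if sep else True
    return annotations


def collect_annotations(vcf_lines,
                        keys=("QD", "FS", "MQ", "MQRankSum", "ReadPosRankSum")):
    """Collect numeric values of the requested INFO keys, one list per key.

    Hypothetical helper: header lines are ignored, and a record contributes
    nothing for a key that is absent or not a single number.
    """
    values = {key: [] for key in keys}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # meta-information and column header lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 8:
            continue  # not a complete variant record
        info = parse_info(columns[7])  # INFO is the 8th column
        for key in keys:
            raw = info.get(key)
            if raw is None or raw is True:
                continue
            try:
                values[key].append(float(raw))
            except ValueError:
                continue  # e.g. comma-separated multi-value entries
    return values
```

With a real file this could be called as `collect_annotations(open("calls.vcf"))`, where `calls.vcf` is a placeholder file name, and the resulting lists written out for plotting.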
I have used a ploidy of 20, and I'm looking at a population of bacteria. Does anyone know why the QD is so low?
I'm planning to lower the recommended QD filtering cutoff to 0.8, which would keep roughly the top ~10% of variants by QD. Does that seem sensible?
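The arithmetic behind a cutoff like this can be checked directly from the list of QD values. The following is only a sketch of that check, not a recommendation for any particular threshold: `fraction_below` and `cutoff_for_top_fraction` are hypothetical helper names, and the QD values in the usage note are made up for illustration.

```python
def fraction_below(values, threshold):
    """Return the fraction of values strictly below the threshold."""
    if not values:
        raise ValueError("no values supplied")
    return sum(1 for v in values if v < threshold) / len(values)


def cutoff_for_top_fraction(values, top_fraction):
    """Return the value at which keeping 'values >= cutoff' retains roughly
    the requested top fraction of the data (at least one value is kept)."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values, reverse=True)
    # Number of values to keep, rounded to the nearest whole record.
    keep = max(1, round(len(ordered) * top_fraction))
    return ordered[keep - 1]
```

For example, with the made-up list `[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 2.5]`, `fraction_below(values, 2.0)` reports how much of the callset a QD < 2.0 filter would remove, and `cutoff_for_top_fraction(values, 0.2)` gives the QD value that would keep the top two records. Ties at the cutoff mean the retained fraction can be slightly larger than requested.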