VariantRecalibrator parameter question
For the VariantRecalibrator program, there is an option "--trust-all-polymorphic". The documentation says
"Trust that all the input training sets' unfiltered records contain only polymorphic sites to drastically speed up the computation."
What I'm trying to figure out is whether "polymorphic" here means polymorphic within the training set itself or polymorphic in the test set. For example, I have a dataset (not human data) that I'm using for training. I've filtered it down to sites I am confident in, and within this set all of those sites are polymorphic.
I also have a test set, made up of different individuals, which I would like to filter. In this test set, some of the sites from the training set will be polymorphic and some will not. In this case, should I set --trust-all-polymorphic to TRUE?
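To make the scenario above concrete, here is a toy Python sketch. It is not GATK code. The site names and genotypes are made up, and I am assuming "polymorphic" means that at least one called genotype in the sample set carries a non-reference allele.

```python
# Toy illustration only: hypothetical sites and genotypes, not real data.
# Genotypes are tuples of allele indices (0 = reference, None = no-call).

def is_polymorphic(genotypes):
    """Assumed definition: at least one called non-reference allele in the set."""
    return any(a is not None and a > 0 for gt in genotypes for a in gt)

# Training set: every site is polymorphic among the training individuals.
training = {
    "chr1:100": [(0, 1), (1, 1)],
    "chr1:200": [(0, 1), (0, 0)],
}

# Test set: different individuals, same sites.
test = {
    "chr1:100": [(0, 0), (0, 1)],  # still polymorphic here
    "chr1:200": [(0, 0), (0, 0)],  # monomorphic here
}

def summarize(training_sites, test_sites):
    """Map each training site to (polymorphic in training, polymorphic in test)."""
    return {
        site: (is_polymorphic(gts), is_polymorphic(test_sites.get(site, [])))
        for site, gts in training_sites.items()
    }

status = summarize(training, test)
for site, (in_training, in_test) in status.items():
    print(f"{site}: training={in_training} test={in_test}")
```

Every training site is polymorphic in the training set, but chr1:200 is monomorphic in the test set. My question is which of those two columns the option is referring to.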