VariantRecalibrator parameter question

The VariantRecalibrator program has an option, "--trust-all-polymorphic". The documentation says:

"Trust that all the input training sets' unfiltered records contain only polymorphic sites to drastically speed up the computation."

What I'm trying to figure out is whether this means that the sites in the training dataset are polymorphic in the training set or in the test set. For example, I have a set of data I'm using as my training dataset (not human data). I've filtered it to a set of sites that I am confident in, and would like to use as my training set. Within this set, all those sites are polymorphic.
I have a test set of data, with different individuals, which I would like to filter. In this test set, some of the sites identified in the training set will be polymorphic, but some will not be. In this case, should I set --trust-all-polymorphic to TRUE?

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator admin)

    @Greg_Owens
    Hi,

    "What I'm trying to figure out is whether this means that the sites in the training dataset are polymorphic in the training set or in the test set."

    Polymorphic in the training data (the resource files).

    "In this test set, some of the sites identified in the training set will be polymorphic, but some will not be. In this case, should I set --trust-all-polymorphic to TRUE?"

    If some sites in your training set (resource file) are not polymorphic, you should not set the argument to TRUE.

    -Sheila
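
One way to sanity-check a training (resource) VCF before setting the argument is to scan it for unfiltered records in which no sample carries a non-reference allele. The Python sketch below is only an illustration and is not part of GATK. It assumes that "polymorphic" means at least one sample genotype contains a non-reference allele, that the input is plain tab-delimited VCF text, and that records without genotype columns have nothing to check. The function name non_polymorphic_records is hypothetical.

```python
def non_polymorphic_records(vcf_lines):
    """Return (chrom, pos) for unfiltered records where no sample has a non-reference allele.

    Assumption: "polymorphic" means at least one genotype carries an ALT allele.
    Sites-only records (no genotype columns) are skipped because there is nothing to check.
    """
    offenders = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns
        chrom, pos, filt, fmt = fields[0], fields[1], fields[6], fields[8]
        if filt not in ("PASS", "."):
            continue  # only unfiltered records are of interest here
        keys = fmt.split(":")
        if "GT" not in keys:
            continue  # no genotype field to inspect
        gt_index = keys.index("GT")
        has_alt = False
        for sample in fields[9:]:
            parts = sample.split(":")
            if gt_index >= len(parts):
                continue
            # Treat phased and unphased separators the same way.
            alleles = parts[gt_index].replace("|", "/").split("/")
            if any(a not in ("0", ".") for a in alleles):
                has_alt = True
                break
        if not has_alt:
            offenders.append((chrom, int(pos)))
    return offenders
```

If the function returns an empty list for every resource file, the training records look polymorphic under the assumption above. Any returned positions are records that would argue against setting --trust-all-polymorphic.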
