VariantRecalibrator parameter question

The VariantRecalibrator program has an option, "--trust-all-polymorphic". The documentation says:

"Trust that all the input training sets' unfiltered records contain only polymorphic sites to drastically speed up the computation."

What I'm trying to figure out is whether this means that the sites in the training dataset are polymorphic in the training set or in the test set. For example, I have a set of data I'm using as my training dataset (not human data). I've filtered it to a set of sites that I am confident in, and would like to use as my training set. Within this set, all those sites are polymorphic.
I have a test set of data, with different individuals, which I would like to filter. In this test set, some of the sites identified in the training set will be polymorphic, but some will not be. In this case, should I set --trust-all-polymorphic to TRUE?

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @Greg_Owens
    Hi,

    "What I'm trying to figure out is whether this means that the sites in the training dataset are polymorphic in the training set or in the test set."

    Polymorphic in the training data (the resource files).

    "In this test set, some of the sites identified in the training set will be polymorphic, but some will not be. In this case, should I set --trust-all-polymorphic to TRUE?"

    If some sites in your training set (resource file) are not polymorphic, you should not set the argument to TRUE.

    -Sheila
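
Since the flag depends only on the training (resource) VCF, one way to check before enabling it is to scan that file for unfiltered records where no sample carries a non-reference allele. The following is a minimal sketch, not GATK's own code. It assumes that "polymorphic" means at least one genotype in the record has an allele index greater than 0, and that FILTER values of PASS or "." count as unfiltered. The function name monomorphic_unfiltered_sites and the example records are hypothetical.

```python
def monomorphic_unfiltered_sites(vcf_lines):
    """Return (chrom, pos) for unfiltered records with no non-reference genotype.

    Assumption: a record is "polymorphic" if any sample genotype contains an
    allele index > 0. Records without genotype columns are skipped, because
    there is nothing to check.
    """
    flagged = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # sites-only record: no FORMAT/sample columns
        chrom, pos, filt, fmt = fields[0], fields[1], fields[6], fields[8]
        if filt not in ("PASS", "."):
            continue  # filtered record: ignored in this sketch
        keys = fmt.split(":")
        if "GT" not in keys:
            continue  # no genotype field to inspect
        gt_index = keys.index("GT")
        has_alt = False
        for sample in fields[9:]:
            parts = sample.split(":")
            if gt_index >= len(parts):
                continue  # trailing fields may be dropped in a sample column
            # Treat phased ("|") and unphased ("/") genotypes the same way.
            alleles = parts[gt_index].replace("|", "/").split("/")
            if any(a.isdigit() and int(a) > 0 for a in alleles):
                has_alt = True
                break
        if not has_alt:
            flagged.append((chrom, int(pos)))
    return flagged


# Hypothetical training records: one polymorphic, one monomorphic, one filtered.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/0\t./.",
    "chr1\t300\t.\tG\tA\t50\tLowQual\t.\tGT\t0/0\t0/0",
]
print(monomorphic_unfiltered_sites(example))  # [('chr1', 200)]
```

To run it on a real file, pass an open file handle for an uncompressed VCF, for example open("training.vcf"), where the file name is a placeholder. An empty result suggests that, under the assumption above, every unfiltered training record is polymorphic. Any sites listed are the ones that would make the flag unsafe according to the answer above.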
