VariantRecalibrator parameter setting

Jiwoong_KIM Posts: 3 Member

I have HiSeq exome data and am using GATK v2.5. While running variant recalibration, I got an error with the default values for -percentBad and --maxGaussians. Following a forum tip that suggested loosening them, I either increased the first to 0.05 or decreased the second to 4, and the walker ran successfully. For one of my data sets, either change alone was enough. For another, it only worked once both were adjusted. A third case is still being tested with even looser settings. The options I have adjusted are: -minNumBad, -percentBad, --maxGaussians

My questions about these options are:

1. Can appropriate values legitimately differ from sample to sample? (Is it normal to have to try different values and adjust?)
2. Are there known values at the loosest end that still give reasonable performance? (How far is it safe to loosen the values?)

Any comments would be much appreciated. Let me know if I have left out any information. KIM

Comments

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi there,

    1. For samples that you compare with each other, you need to use the same values -- in fact, you should call variants on them together, then recalibrate the variants together. But if you're dealing with different cohorts of samples, then yes, it's OK to adapt settings.

    2. Our Best Practices recommendations represent the optimal tradeoff, and each degree of loosening weakens the power of the model. Depending on your data the model will be more or less robust to this. You'll need to experiment to find the right settings for your data.

    Geraldine Van der Auwera, PhD
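
The "experiment to find the right settings" advice can be organized as a small ladder that starts from the defaults and loosens one option at a time, stopping at the first combination that works for a given cohort. The following is only a minimal sketch under stated assumptions: the flag spellings and the values 0.05 and 4 come from the question above, while `tuning_args`, `first_working`, `LADDER`, and the `run` callable are hypothetical names introduced for illustration. `run` is assumed to wrap the rest of your own VariantRecalibrator command line and return True on success.

```python
from typing import Callable, Dict, List, Optional

def tuning_args(percent_bad: Optional[float] = None,
                max_gaussians: Optional[int] = None) -> List[str]:
    """Return only the flags that override a default.

    Passing None leaves the tool's own default in place, so no default
    value is hard-coded here.
    """
    args: List[str] = []
    if percent_bad is not None:
        args += ["-percentBad", str(percent_bad)]
    if max_gaussians is not None:
        args += ["--maxGaussians", str(max_gaussians)]
    return args

# Ordered from the defaults to the loosest combination mentioned in the
# question. Any further rungs would be your own choice, not advice from
# this thread.
LADDER: List[Dict] = [
    {},                                          # defaults
    {"percent_bad": 0.05},                       # loosen one option
    {"max_gaussians": 4},                        # loosen the other
    {"percent_bad": 0.05, "max_gaussians": 4},   # loosen both
]

def first_working(run: Callable[[List[str]], bool],
                  ladder: List[Dict] = LADDER) -> Optional[Dict]:
    """Try each rung in order and return the first one that succeeds.

    `run` is a hypothetical callable that receives the extra arguments,
    launches the recalibration, and returns True if it completed.
    """
    for settings in ladder:
        if run(tuning_args(**settings)):
            return settings
    return None  # nothing on the ladder worked
```

Stopping at the first rung that succeeds keeps the loosening as small as possible, in line with the point that each degree of loosening weakens the model. Whatever rung is selected should then be applied to the whole cohort that is recalibrated together, not chosen per sample.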
