
INDEL + VariantRecalibrator "Training with very few variant sites"

VariantRecalibrator seems to fail on my HaloPlex dataset because, as far as I understand, there are not enough indels in it.

WARN  16:06:52,414 VariantDataManager - WARNING: Training with very few variant sites! Please check the model reporting PDF to ensure the quality of the model is reliable. 
INFO  16:06:52,420 GaussianMixtureModel - Initializing model with 100 k-means iterations... 
INFO  16:06:52,482 VariantRecalibratorEngine - Finished iteration 0. 
INFO  16:06:52,506 VariantRecalibratorEngine - Finished iteration 5.    Current change in mixture coefficients = 0.01625 
INFO  16:06:52,514 VariantRecalibratorEngine - Finished iteration 10.   Current change in mixture coefficients = 0.00691 
INFO  16:06:52,522 VariantRecalibratorEngine - Finished iteration 15.   Current change in mixture coefficients = 0.02285 
INFO  16:06:52,529 VariantRecalibratorEngine - Finished iteration 20.   Current change in mixture coefficients = 0.00935 
INFO  16:06:52,534 VariantRecalibratorEngine - Convergence after 24 iterations! 
INFO  16:06:52,539 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000. 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
org.broadinstitute.gatk.utils.exceptions.ReviewedGATKException: Unable to retrieve result
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(

Of course, this breaks my workflow :-)

Would it be possible to generate a 'mock' recalFile that would tell ApplyRecalibration:

"there is no data to recalibrate but write a VCF anyway".

What would the recalFile look like?
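
A possible workaround that avoids a mock recalFile, offered only as a minimal sketch: count the indel records in the input VCF before calling VariantRecalibrator, and have the workflow skip the recalibration step when there are too few. The function names and the `min_indels` threshold below are hypothetical; they are not GATK parameters, and the right cutoff for a given dataset is an assumption to tune.

```python
import gzip


def is_indel(ref, alts):
    """Return True if any ALT allele differs in length from REF.

    This is a deliberately simple check: symbolic alleles (e.g. <NON_REF>),
    spanning deletions ('*') and missing ALTs ('.') are ignored.
    """
    for alt in alts.split(","):
        if alt.startswith("<") or alt in ("*", "."):
            continue
        if len(alt) != len(ref):
            return True
    return False


def count_indels(lines):
    """Count VCF records that carry at least one indel allele."""
    n = 0
    for line in lines:
        # Skip header lines and blank lines
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record, ignore it
        if is_indel(fields[3], fields[4]):  # REF and ALT columns
            n += 1
    return n


def should_run_vqsr(vcf_path, min_indels=1000):
    """Decide whether to run indel recalibration on this VCF.

    min_indels is a hypothetical threshold chosen for illustration only.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return count_indels(handle) >= min_indels
```

A workflow could call `should_run_vqsr("input.vcf")` and, when it returns `False`, pass the unrecalibrated VCF straight to the next step (or apply hard filters instead) rather than running VariantRecalibrator and ApplyRecalibration.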


