ERROR: GATK VQSR fails to identify top worst variants and terminates

michalkc, Basel, Member
edited April 2017 in Ask the GATK team

Hi,

I've been using GATK's VQSR to my satisfaction in multiple projects. Today, however, I encountered the same failure on multiple seemingly normal exomes.

Here's the log.

INFO 16:01:37,398 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:01:37,401 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
INFO 16:01:37,401 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO 16:01:37,402 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
INFO 16:01:37,402 HelpFormatter - [Thu Apr 13 16:01:37 CEST 2017] Executing on Mac OS X 10.11.6 x86_64
INFO 16:01:37,402 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_66-b17
INFO 16:01:37,409 HelpFormatter - Program Args: -T VariantRecalibrator -R /Users/michalkovac/Documents/Data/Genomes/hs37/hs37d5.fa -input temp.inputForVQSR.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /Users/michalkovac/Documents/Data/GATK.3.7/Resources/hapmap_3.3.b37.vcf -resource:omni,known=false,training=true,truth=false,prior=12.0 /Users/michalkovac/Documents/Data/GATK.3.7/Resources/1000G_omni2.5.b37.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 /Users/michalkovac/Documents/Data/GATK.3.7/Resources/1000G_phase1.snps.high_confidence.b37.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /Users/michalkovac/Documents/Data/GATK.3.7/Resources/dbsnp_138.b37.vcf -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -mode SNP -recalFile temp.output.recal -tranchesFile temp.output.tranches -rscriptFile temp.output.plots.R
INFO 16:01:37,421 HelpFormatter - Executing as [email protected] on Mac OS X 10.11.6 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_66-b17.
INFO 16:01:37,422 HelpFormatter - Date/Time: 2017/04/13 16:01:37
INFO 16:01:37,423 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:01:37,424 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:01:37,488 GenomeAnalysisEngine - Strictness is SILENT
INFO 16:01:37,758 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 16:01:38,413 GenomeAnalysisEngine - Preparing for traversal
INFO 16:01:38,424 GenomeAnalysisEngine - Done preparing for traversal
INFO 16:01:38,425 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 16:01:38,426 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 16:01:38,426 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 16:01:38,435 TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0
INFO 16:01:38,436 TrainingSet - Found omni track: Known = false Training = true Truth = false Prior = Q12.0
INFO 16:01:38,437 TrainingSet - Found 1000G track: Known = false Training = true Truth = false Prior = Q10.0
INFO 16:01:38,437 TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q2.0
INFO 16:02:08,436 ProgressMeter - 1:84540181 1966180.0 30.0 s 15.0 s 2.7% 18.6 m 18.1 m
INFO 16:02:38,444 ProgressMeter - 1:195143610 3960094.0 60.0 s 15.0 s 6.2% 16.1 m 15.1 m
INFO 16:03:08,448 ProgressMeter - 2:26868505 5909693.0 90.0 s 15.0 s 8.8% 17.0 m 15.5 m
INFO 16:03:38,458 ProgressMeter - 2:115213366 7904066.0 120.0 s 15.0 s 11.6% 17.2 m 15.2 m
INFO 16:04:08,464 ProgressMeter - 2:199266705 9810439.0 2.5 m 15.0 s 14.3% 17.5 m 15.0 m
INFO 16:04:38,471 ProgressMeter - 3:16790184 1.1301028E7 3.0 m 15.0 s 16.2% 18.5 m 15.5 m
INFO 16:05:08,491 ProgressMeter - 3:115065082 1.3552096E7 3.5 m 15.0 s 19.4% 18.1 m 14.6 m
INFO 16:05:38,501 ProgressMeter - 4:28737000 1.6247471E7 4.0 m 14.0 s 22.9% 17.4 m 13.4 m
INFO 16:06:08,511 ProgressMeter - 4:146958954 1.8927968E7 4.5 m 14.0 s 26.7% 16.9 m 12.4 m
INFO 16:06:38,524 ProgressMeter - 5:67243590 2.1600539E7 5.0 m 13.0 s 30.2% 16.5 m 11.5 m
INFO 16:07:08,536 ProgressMeter - 6:2795421 2.4236732E7 5.5 m 13.0 s 34.0% 16.2 m 10.7 m
INFO 16:07:38,542 ProgressMeter - 6:115845548 2.6971222E7 6.0 m 13.0 s 37.6% 16.0 m 10.0 m
INFO 16:08:08,551 ProgressMeter - 7:51085109 2.9635051E7 6.5 m 13.0 s 40.9% 15.9 m 9.4 m
INFO 16:08:38,558 ProgressMeter - 8:8032017 3.2452611E7 7.0 m 12.0 s 44.6% 15.7 m 8.7 m
INFO 16:09:08,563 ProgressMeter - 8:118075708 3.4993577E7 7.5 m 12.0 s 48.2% 15.6 m 8.1 m
INFO 16:09:38,570 ProgressMeter - 9:85403943 3.7213416E7 8.0 m 12.0 s 51.8% 15.5 m 7.5 m
INFO 16:10:08,577 ProgressMeter - 10:25212068 3.9237538E7 8.5 m 13.0 s 54.4% 15.6 m 7.1 m
INFO 16:10:38,582 ProgressMeter - 10:111330790 4.116506E7 9.0 m 13.0 s 57.1% 15.8 m 6.8 m
INFO 16:11:08,594 ProgressMeter - 11:55139454 4.309365E7 9.5 m 13.0 s 59.6% 15.9 m 6.4 m
INFO 16:11:38,601 ProgressMeter - 11:132482836 4.4946494E7 10.0 m 13.0 s 62.1% 16.1 m 6.1 m
INFO 16:12:08,608 ProgressMeter - 12:89179005 4.7089397E7 10.5 m 13.0 s 65.0% 16.1 m 5.6 m
INFO 16:12:38,616 ProgressMeter - 13:75456448 4.9500602E7 11.0 m 13.0 s 68.9% 16.0 m 5.0 m
INFO 16:13:08,622 ProgressMeter - 14:85460308 5.2019543E7 11.5 m 13.0 s 72.8% 15.8 m 4.3 m
INFO 16:13:38,637 ProgressMeter - 16:4779460 5.4670212E7 12.0 m 13.0 s 77.0% 15.6 m 3.6 m
INFO 16:14:08,643 ProgressMeter - 17:30999810 5.7418502E7 12.5 m 13.0 s 80.7% 15.5 m 3.0 m
INFO 16:14:38,646 ProgressMeter - 18:51319544 5.9790991E7 13.0 m 13.0 s 83.9% 15.5 m 2.5 m
INFO 16:15:08,649 ProgressMeter - 20:22167098 6.2610802E7 13.5 m 12.0 s 87.4% 15.5 m 117.0 s
INFO 16:15:38,655 ProgressMeter - 22:48024220 6.5424223E7 14.0 m 12.0 s 91.7% 15.3 m 75.0 s
INFO 16:16:08,661 ProgressMeter - Y:13306201 6.8127229E7 14.5 m 12.0 s 97.2% 14.9 m 25.0 s
INFO 16:16:09,283 VariantDataManager - QD: mean = 21.04 standard deviation = 9.18
INFO 16:16:09,293 VariantDataManager - MQ: mean = 59.76 standard deviation = 1.74
INFO 16:16:09,303 VariantDataManager - MQRankSum: mean = -0.13 standard deviation = 1.27
INFO 16:16:09,311 VariantDataManager - ReadPosRankSum: mean = -0.01 standard deviation = 1.04
INFO 16:16:09,316 VariantDataManager - FS: mean = 1.69 standard deviation = 4.10
INFO 16:16:09,320 VariantDataManager - SOR: mean = 0.99 standard deviation = 0.64
INFO 16:16:09,372 VariantDataManager - Annotations are now ordered by their information content: [MQ, QD, MQRankSum, SOR, FS, ReadPosRankSum]
INFO 16:16:09,377 VariantDataManager - Training with 18382 variants after standard deviation thresholding.
INFO 16:16:09,384 GaussianMixtureModel - Initializing model with 100 k-means iterations...
INFO 16:16:10,281 VariantRecalibratorEngine - Finished iteration 0.
INFO 16:16:10,986 VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.35625
INFO 16:16:11,557 VariantRecalibratorEngine - Finished iteration 10. Current change in mixture coefficients = 0.23760
INFO 16:16:12,178 VariantRecalibratorEngine - Finished iteration 15. Current change in mixture coefficients = 0.04780
INFO 16:16:12,778 VariantRecalibratorEngine - Finished iteration 20. Current change in mixture coefficients = 0.01785
INFO 16:16:13,385 VariantRecalibratorEngine - Finished iteration 25. Current change in mixture coefficients = 0.05647
INFO 16:16:13,988 VariantRecalibratorEngine - Finished iteration 30. Current change in mixture coefficients = 0.03181
INFO 16:16:14,601 VariantRecalibratorEngine - Finished iteration 35. Current change in mixture coefficients = 0.01707
INFO 16:16:15,199 VariantRecalibratorEngine - Finished iteration 40. Current change in mixture coefficients = 0.00936
INFO 16:16:15,791 VariantRecalibratorEngine - Finished iteration 45. Current change in mixture coefficients = 0.00826
INFO 16:16:16,307 VariantRecalibratorEngine - Finished iteration 50. Current change in mixture coefficients = 0.00939
INFO 16:16:16,838 VariantRecalibratorEngine - Finished iteration 55. Current change in mixture coefficients = 0.01479
INFO 16:16:17,366 VariantRecalibratorEngine - Finished iteration 60. Current change in mixture coefficients = 0.00670
INFO 16:16:17,917 VariantRecalibratorEngine - Finished iteration 65. Current change in mixture coefficients = 0.01879
INFO 16:16:18,446 VariantRecalibratorEngine - Finished iteration 70. Current change in mixture coefficients = 0.00641
INFO 16:16:19,035 VariantRecalibratorEngine - Finished iteration 75. Current change in mixture coefficients = 0.01789
INFO 16:16:19,586 VariantRecalibratorEngine - Finished iteration 80. Current change in mixture coefficients = 0.01082
INFO 16:16:20,145 VariantRecalibratorEngine - Finished iteration 85. Current change in mixture coefficients = 0.06470
INFO 16:16:20,697 VariantRecalibratorEngine - Finished iteration 90. Current change in mixture coefficients = 0.00295
INFO 16:16:20,814 VariantRecalibratorEngine - Convergence after 91 iterations!
INFO 16:16:20,936 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.

ERROR --
ERROR stack trace

java.lang.IllegalArgumentException: No data found.
at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:88)
at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:489)
at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:185)
at org.broadinstitute.gatk.engine.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:115)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://software.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: No data found.
ERROR ------------------------------------------------------------------------------------------
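As far as the log shows, the last INFO line before the stack trace reports that 0 variants fell at or below the LOD cutoff of -5.0, and the exception "No data found." is then raised from generateModel. For checking several runs for the same symptom, a minimal Python sketch could look like the following. The function names are hypothetical, and the pattern is copied from the wording of this one 3.7 log, so it may not match other GATK versions.

```python
import re

# Wording taken from the VariantDataManager line in the log above;
# assumed (not verified) to be stable across runs of GATK 3.7.
WORST_RE = re.compile(
    r"Training with worst (\d+) scoring variants "
    r"--> variants with LOD <= (-?\d+(?:\.\d+)?)"
)


def worst_variant_count(log_text):
    """Return (count, lod_cutoff) from the first matching line, or None."""
    match = WORST_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def failed_with_empty_worst_set(log_text):
    """True if the log reports 0 worst-scoring variants and 'No data found'."""
    result = worst_variant_count(log_text)
    return result is not None and result[0] == 0 and "No data found" in log_text


# Two lines copied from the log in this post, used as sample input.
sample = (
    "INFO 16:16:20,936 VariantDataManager - Training with worst 0 scoring "
    "variants --> variants with LOD <= -5.0000.\n"
    "ERROR MESSAGE: No data found.\n"
)

if __name__ == "__main__":
    print(worst_variant_count(sample))
    print(failed_with_empty_worst_set(sample))
```

To check a real run, read the saved log file into a string and pass it to failed_with_empty_worst_set in place of the sample text.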

Any idea what went wrong? To my knowledge this is not a known problem.
Cheers,
Michal
