Error with VariantRecalibrator: java.lang.IllegalArgumentException: No data found.

Dear GATK Team,

I have one whole-genome sample whose variants were called with HaplotypeCaller. I would like to run VariantRecalibrator to recalibrate my variant set, but I get the following error:

INFO 22:05:17,683 ProgressMeter - chrY:59361069 6.7818693E7 68.2 m 60.0 s 98.7% 69.1 m 54.0 s
INFO 22:05:21,996 VariantRecalibratorEngine - Finished iteration 50. Current change in mixture coefficients = 0.00201
INFO 22:05:47,684 ProgressMeter - chrY:59361069 6.7818693E7 68.7 m 60.0 s 98.7% 69.6 m 55.0 s
INFO 22:06:01,899 VariantRecalibratorEngine - Convergence after 51 iterations!
INFO 22:06:18,103 ProgressMeter - chrY:59361069 6.7818693E7 69.2 m 61.0 s 98.7% 70.1 m 55.0 s
INFO 22:06:24,494 VariantRecalibratorEngine - Evaluating full set of 3869624 variants...
INFO 22:06:24,658 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.

ERROR --
ERROR stack trace

java.lang.IllegalArgumentException: No data found.
at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:88)
at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:489)
at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:185)
at org.broadinstitute.gatk.engine.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:115)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://software.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: No data found.
ERROR ----------------------------------------------------

The commands I used were as follows:
java -Xmx5G -jar GenomeAnalysisTK3.7.jar -T SelectVariants -R hg19.fasta -V NA12878_1.vcf.gz -selectType SNP --excludeNonVariants -o NA12878_1.raw.snp.vcf.gz && \
java -Xmx5G -jar GenomeAnalysisTK3.7.jar -T VariantRecalibrator -R hg19.fasta -input NA12878_1.raw.snp.vcf.gz \
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 ./gatk/hapmap_3.3.hg19.vcf \
-resource:omni,known=false,training=true,truth=true,prior=12.0 ./gatk/1000G_omni2.5.hg19.vcf \
-resource:1000G,known=false,training=true,truth=false,prior=10.0 ./gatk/1000G_phase1.snps.high_confidence.hg19.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=2.0 ./gatk/dbsnp_138.hg19.vcf \
-an DP -an QD -an FS -an SOR -an ReadPosRankSum -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
-recalFile NA12878_1.recalibrate_SNP.recal -tranchesFile NA12878_1.recalibrate_SNP.tranches -rscriptFile NA12878_1.recalibrate_SNP_plots.R

What do I do next?

PS: I get correct results for two other WGS samples using the same commands.
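
Since the same commands work on the other samples, one possible first check is whether this VCF carries all of the annotations passed with -an. The sketch below counts how many records have each of them in the INFO column. It is only a diagnostic idea, not a confirmed cause of the "No data found" error, and the function names (count_annotations, count_in_file) are made up for illustration.

```python
import gzip
import sys
from collections import Counter

# Annotations passed to VariantRecalibrator with -an in the command above.
ANNOTATIONS = ["DP", "QD", "FS", "SOR", "ReadPosRankSum"]


def count_annotations(lines, annotations=ANNOTATIONS):
    """Return (total_records, Counter of records carrying each annotation).

    `lines` is any iterable of VCF text lines; header lines are skipped.
    """
    total = 0
    present = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record (INFO is column 8)
        total += 1
        # INFO entries look like KEY=VALUE or a bare FLAG, separated by ';'
        keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        for name in annotations:
            if name in keys:
                present[name] += 1
    return total, present


def count_in_file(path):
    """Hypothetical helper: read a plain or gzip/bgzip-compressed VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_annotations(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    total, present = count_in_file(sys.argv[1])
    print(f"records: {total}")
    for name in ANNOTATIONS:
        print(f"{name}: {present[name]} records")
```

Saved under any file name and run with NA12878_1.raw.snp.vcf.gz as its only argument, it prints one count per annotation. Comparing these counts with those from a sample that recalibrates correctly may show whether the inputs differ.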

Thank you for your help in advance,
Kind regards,
