
Error while running VariantRecalibrator

xiaojunzhao Posts: 6 Member
edited November 2012 in Ask the GATK team

Experiment: target enrichment and sequencing on the Illumina platform.

Raw VCF file from UnifiedGenotyper -> VariantRecalibrator

I got the following error. Is there any potential explanation? Thanks.

##### **ERROR MESSAGE: Bad input: Error during negative model training. Minimum number of variants to use in training is larger than the whole call set. One can attempt to lower the --minNumBadVariants arugment but this is unsafe.**

I tried lowering this number, but a different error message came up.

Script used:

java -Xmx4g -jar GenomeAnalysisTK.jar -T VariantRecalibrator \
-mode BOTH -nt 4 \
-R hg19_all_MT.fasta \
-input two.final.vcf \
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.fy_left.vcf \
-resourcemni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.site.fy.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=8.0 dbsnp137_sort_fy_left.vcf \
-recalFile two.final.vcf.reca \
-tranchesFile two.final.vcf.tranches \
-rscriptFile two.final.vcf.R \
-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -tranche 85.0 -tranche 80.0 \
-an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an MQ -an FS -an HRun 
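
The first error compares the minimum training-set size against the size of the whole call set, so a quick sanity check is to count the records in the input VCF. A minimal sketch, assuming an uncompressed VCF file; `count_variants` is a hypothetical helper name, not part of GATK:

```shell
# Hypothetical helper (not a GATK tool): count the variant records in an
# uncompressed VCF by counting the lines that do not start with '#'.
# Header and column lines start with '#', so they are excluded.
count_variants() {
    grep -vc '^#' "$1"
}
```

For example, `count_variants two.final.vcf` prints the number of records in the call set passed to `-input` above. A small count would be consistent with the error, although the thread does not state what size is sufficient.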

Best Answer


  • Geraldine_VdAuwera Posts: 6,224 Administrator, GATK Developer

    Your first error means what it says: you don't have enough variants to train the recalibration model. This is explained in the documentation on variant recalibration.

    What is the second error message?

    By the way, if the command you posted here is copied exactly from the one you ran, there's a problem on the line that starts with -resourcemni: the resource name is not separated from the flag, so it is not a valid -resource:omni argument.

    Geraldine Van der Auwera, PhD

  • xiaojunzhao Posts: 6 Member

    Hi, Geraldine,

    Sorry, that was a typo I made when posting the code. In my actual command, it was "resource:omni,"

    After I got the first error, I added "--minNumBadVariants 1500"

    I then got the following error:

    ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)

    Is it due to my incorrect "resource" file?

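
The second error message names two adjustments, `--percentBadVariants` and `--maxGaussians`. The sketch below shows where they would go in the command posted in the question, with the `-resource:omni` typo corrected and the `--minNumBadVariants 1500` setting the asker reported adding. The values 4 and 0.05 are the examples from the error message itself, not tuned recommendations, and the thread does not confirm that they resolve the problem for this call set. `run_vqsr` and the `GATK_CMD` override are hypothetical names introduced here for illustration only.

```shell
# Sketch only: the command from the question, wrapped in a function.
# GATK_CMD is a hypothetical override; it defaults to the launcher the asker used.
run_vqsr() {
    ${GATK_CMD:-java -Xmx4g -jar GenomeAnalysisTK.jar} -T VariantRecalibrator \
        -mode BOTH -nt 4 \
        -R hg19_all_MT.fasta \
        -input two.final.vcf \
        -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.fy_left.vcf \
        -resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.site.fy.vcf \
        -resource:dbsnp,known=true,training=false,truth=false,prior=8.0 dbsnp137_sort_fy_left.vcf \
        -recalFile two.final.vcf.reca \
        -tranchesFile two.final.vcf.tranches \
        -rscriptFile two.final.vcf.R \
        -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -tranche 85.0 -tranche 80.0 \
        -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an MQ -an FS -an HRun \
        --minNumBadVariants 1500 \
        --maxGaussians 4 \
        --percentBadVariants 0.05
}
```

Running `GATK_CMD=echo run_vqsr` prints the assembled arguments without launching GATK, which makes it easy to check for typos such as the missing colon before running the real command with `run_vqsr`.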