The current GATK version is 3.4-46


error while running VariantRecalibrator

edited November 2012

Experiment: target enrichment and sequencing on the Illumina platform.

Workflow: raw VCF file from UnifiedGenotyper -> VariantRecalibrator

I got the following error. Is there a likely explanation? Thanks.

##### **ERROR MESSAGE: Bad input: Error during negative model training. Minimum number of variants to use in training is larger than the whole call set. One can attempt to lower the --minNumBadVariants arugment but this is unsafe.**


I tried lowering this number, but then a different error message came up.

script used:

java -Xmx4g -jar GenomeAnalysisTK.jar -T VariantRecalibrator \
-mode BOTH -nt 4 \
-R hg19_all_MT.fasta \
-input two.final.vcf \
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.fy_left.vcf \
-resourcemni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.site.fy.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=8.0 dbsnp137_sort_fy_left.vcf \
-recalFile two.final.vcf.reca \
-tranchesFile two.final.vcf.tranches \
-rscriptFile two.final.vcf.R \
-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -tranche 85.0 -tranche 80.0 \
-an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an MQ -an FS -an HRun


Your first error means what it says: you don't have enough variants to train the recalibration model. This is explained in the documentation on variant recalibration.
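
As a quick sanity check on the size of the call set, you can count the records in the input VCF. This is only a minimal sketch: it assumes an uncompressed VCF, and `count_variants` is an illustrative helper name, not a GATK tool.

```shell
# Count variant records in an uncompressed VCF.
# count_variants is a hypothetical helper name, not part of GATK.
# Header lines start with '#'; every other non-empty line is one record.
count_variants() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Example usage with the input file from the command above:
# count_variants two.final.vcf
```

If the count is small, as it often is for a targeted panel, there may simply be too few variants to train the model.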

What is the second error message?

By the way, if the command you posted here is exactly what you ran, there's a problem on the line that starts with -resourcemni: the ":o" is missing, so it should read -resource:omni.

Geraldine Van der Auwera, PhD


Hi, Geraldine,

Sorry, that was a typo I made when posting the code. In my actual script it is "resource:omni,"

After I get this error, I added

I now get the following error:
"

ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)

"
Could this be caused by an incorrect "resource" file?