Indels Recalibration error message

rcholicrcholic DenverPosts: 68Member

I am trying to recalibrate my VCF files for Indels calling using the below command lines:

java -Xmx2G -jar ../GenomeAnalysisTK.jar -T VariantRecalibrator \ -R ../GATK_ref/hg19.fasta \ -input ./Variants/gcat_set_053_2.raw.snps.indels.vcf \ -nt 4 \ -resource:mills,known=false,training=true,truth=true,prior=12.0 ../GATK_ref/Mills_and_1000G_gold_standard.indels.hg19.vcf \ -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 ../GATK_ref/dbsnp_137.hg19.vcf \ -an DP -an FS -an ReadPosRankSum -an MQRankSum \ --maxGaussians 4 \ -percentBad 0.05 \ -minNumBad 1000 \ -mode INDEL \ -recalFile ./Variants/VQSR/gcat_set_053_2.indels.vcf.recal \ -tranchesFile ./Variants/VQSR/gcat_set_053_2.indels.tranches \ -rscriptFile ./Variants/VQSR/gcat_set_053_2.indels.recal.plots.R > ./Variants/VQSR/IndelRecal2-noAnnot.log

I got this error message, even after taking the recommendation (e.g. maxGaussians 4, --percentBad 0.05). What does this error message mean? my files have too few variants? It's exome-seq.

##### ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)


Sign In or Register to comment.