VQSR error: NaN LOD value assigned

songsy | University of Michigan | Posts: 5 | Member
edited February 26 in Ask the GATK team
INFO  17:05:50,124 GenomeAnalysisEngine - Preparing for traversal 
INFO  17:05:50,144 GenomeAnalysisEngine - Done preparing for traversal 
INFO  17:05:50,144 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  17:05:50,145 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining 
INFO  17:05:50,166 TrainingSet - Found hapmap track:    Known = false   Training = true     Truth = true    Prior = Q15.0 
INFO  17:05:50,166 TrainingSet - Found omni track:  Known = false   Training = true     Truth = false   Prior = Q12.0 
INFO  17:05:50,167 TrainingSet - Found dbsnp track:     Known = true    Training = false    Truth = false   Prior = Q6.0 
INFO  17:06:20,149 ProgressMeter -     1:216404576        2.04e+06   30.0 s       14.0 s      7.0%         7.2 m     6.7 m 
INFO  17:06:50,151 ProgressMeter -     2:223579089        4.70e+06   60.0 s       12.0 s     15.2%         6.6 m     5.6 m 
INFO  17:07:20,159 ProgressMeter -      4:33091662        7.43e+06   90.0 s       12.0 s     23.3%         6.4 m     4.9 m 
INFO  17:07:50,161 ProgressMeter -      5:92527959        1.00e+07  120.0 s       11.0 s     31.4%         6.4 m     4.4 m 
INFO  17:08:20,162 ProgressMeter -       7:1649969        1.30e+07    2.5 m       11.0 s     39.8%         6.3 m     3.8 m 
INFO  17:08:50,168 ProgressMeter -     8:106975025        1.58e+07    3.0 m       11.0 s     48.4%         6.2 m     3.2 m 
INFO  17:09:20,169 ProgressMeter -    10:101433561        1.87e+07    3.5 m       11.0 s     57.4%         6.1 m     2.6 m 
INFO  17:09:50,170 ProgressMeter -     12:99334147        2.16e+07    4.0 m       11.0 s     66.1%         6.1 m     2.1 m 
INFO  17:10:20,171 ProgressMeter -     15:30577012        2.41e+07    4.5 m       11.0 s     75.4%         6.0 m    88.0 s 
INFO  17:10:52,409 ProgressMeter -      18:8763648        2.68e+07    5.0 m       11.0 s     83.5%         6.0 m    59.0 s 
INFO  17:11:22,410 ProgressMeter -     22:31598896        2.97e+07    5.5 m       11.0 s     92.2%         6.0 m    27.0 s 
INFO  17:11:33,135 VariantDataManager - QD:      mean = 17.48    standard deviation = 9.03 
INFO  17:11:33,516 VariantDataManager - HaplotypeScore:      mean = 3.03     standard deviation = 2.62 
INFO  17:11:33,882 VariantDataManager - MQ:      mean = 52.40    standard deviation = 2.98 
INFO  17:11:34,253 VariantDataManager - MQRankSum:   mean = 0.31     standard deviation = 1.02 
INFO  17:11:37,973 VariantDataManager - Training with 1024360 variants after standard deviation thresholding. 
INFO  17:11:37,977 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
INFO  17:11:53,065 ProgressMeter - GL000202.1:10465        3.08e+07    6.0 m       11.0 s     99.8%         6.0 m     0.0 s 
INFO  17:12:09,041 VariantRecalibratorEngine - Finished iteration 0. 
INFO  17:12:23,066 ProgressMeter - GL000202.1:10465        3.08e+07    6.5 m       12.0 s     99.8%         6.5 m     0.0 s 
INFO  17:12:30,492 VariantRecalibratorEngine - Finished iteration 5.    Current change in mixture coefficients = 0.08178 
INFO  17:12:51,054 VariantRecalibratorEngine - Finished iteration 10.   Current change in mixture coefficients = 0.05869 
INFO  17:12:53,072 ProgressMeter - GL000202.1:10465        3.08e+07    7.0 m       13.0 s     99.8%         7.0 m     0.0 s 
INFO  17:13:11,207 VariantRecalibratorEngine - Finished iteration 15.   Current change in mixture coefficients = 0.15237 
INFO  17:13:23,073 ProgressMeter - GL000202.1:10465        3.08e+07    7.5 m       14.0 s     99.8%         7.5 m     0.0 s 
INFO  17:13:31,503 VariantRecalibratorEngine - Finished iteration 20.   Current change in mixture coefficients = 0.13505 
INFO  17:13:51,768 VariantRecalibratorEngine - Finished iteration 25.   Current change in mixture coefficients = 0.05729 
INFO  17:13:53,080 ProgressMeter - GL000202.1:10465        3.08e+07    8.0 m       15.0 s     99.8%         8.0 m     0.0 s 
INFO  17:14:11,372 VariantRecalibratorEngine - Finished iteration 30.   Current change in mixture coefficients = 0.02607 
INFO  17:14:23,081 ProgressMeter - GL000202.1:10465        3.08e+07    8.5 m       16.0 s     99.8%         8.5 m     0.0 s 
INFO  17:14:24,730 VariantRecalibratorEngine - Convergence after 33 iterations! 
INFO  17:14:27,037 VariantRecalibratorEngine - Evaluating full set of 3860460 variants... 
INFO  17:14:51,111 VariantDataManager - Found 0 variants overlapping bad sites training tracks. 
INFO  17:14:55,071 VariantDataManager - Additionally training with worst 1000 scoring variants --> 1000 variants with LOD <= -30.5662. 
INFO  17:14:55,071 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
INFO  17:14:55,082 VariantRecalibratorEngine - Finished iteration 0. 
INFO  17:14:55,095 VariantRecalibratorEngine - Convergence after 4 iterations! 
INFO  17:14:55,096 VariantRecalibratorEngine - Evaluating full set of 3860460 variants... 
INFO  17:15:02,071 GATKRunReport - Uploaded run statistics report to AWS S3 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.7-2-g6bda569): 
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --numBad 3000, for example).
##### ERROR ------------------------------------------------------------------------------------------

My command is:

java -jar -Xmx4g GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar -T VariantRecalibrator -R human_g1k_v37.fasta -input NA12878_snp.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf -resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 dbsnp_132.b37.vcf -an QD -an HaplotypeScore -an MQ -an MQRankSum --maxGaussians 4 -mode SNP -recalFile NA12878_recal.vcf -tranchesFile NA12878_tranches -rscriptFile NA12878.plots.R

Originally I did not use --maxGaussians 4. I added it after an earlier error suggested it, but I still get this error message. I also think that --numBad is already deprecated. I don't understand why this error occurs. I am running the GATK UnifiedGenotyper on a 1000 Genomes high-coverage BAM file and then using VQSR to filter the SNPs.
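
For reference, here is a minimal sketch of what the error message's own suggestion would look like when applied to the command above. The log shows the negative model being trained on the worst 1000 scoring variants, and the message proposes raising that number via --numBad. The value 3000 is just the example from the message, and whether version 2.7-2 still accepts --numBad is an assumption that has not been verified here. The script only prints the command, it does not run GATK:

```shell
#!/bin/sh
# Dry run: build the command from the post with one extra flag and print it.
# Nothing is executed here, so GATK does not need to be installed.

# 3000 is only the example value quoted in the error message, not a tuned setting.
NUM_BAD=3000

# Inside double quotes, backslash-newline is a line continuation,
# so CMD ends up as a single line.
CMD="java -jar -Xmx4g GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar \
-T VariantRecalibrator \
-R human_g1k_v37.fasta \
-input NA12878_snp.vcf \
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf \
-resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=6.0 dbsnp_132.b37.vcf \
-an QD -an HaplotypeScore -an MQ -an MQRankSum \
--maxGaussians 4 \
--numBad ${NUM_BAD} \
-mode SNP \
-recalFile NA12878_recal.vcf \
-tranchesFile NA12878_tranches \
-rscriptFile NA12878.plots.R"

# Print the assembled command for inspection.
echo "$CMD"
```

To actually run it, the final echo could be replaced with eval "$CMD". All file names and other arguments are copied unchanged from the original command.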

