NaN LOD in VQSR

dcittaro Posts: 31 Member

Hi all, I'm running VariantRecalibrator on a SNP set (47 exomes) and I get this error:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.2-3-gde33222): 
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)
##### ERROR ------------------------------------------------------------------------------------------

This is the command line:

    java -Djava.io.tmpdir=/lustre2/scratch/  -Xmx32g -jar /lustre1/tools/bin/GenomeAnalysisTK-2.2-3.jar \
    -T VariantRecalibrator \
    -R /lustre1/genomes/hg19/fa/hg19.fa \
    -input /lustre1/workspace/Ferrari/Carrera/Analysis/UG/bpd_ug.SNP.vcf \
    -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 /lustre1/genomes/hg19/annotation/hapmap_3.3.hg19.sites.vcf.gz \
    -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 /lustre1/genomes/hg19/annotation/1000G_omni2.5.hg19.sites.vcf.gz \
    -resource:dbsnp,VCF,known=true,training=false,truth=false,prior=6.0 /lustre1/genomes/hg19/annotation/dbSNP-137.chr.vcf -an QD \
    -an HaplotypeScore \
    -an MQRankSum \
    -an ReadPosRankSum \
    -an FS \
    -an MQ \
    -an DP \
    -an QD \
    -an InbreedingCoeff \
    -mode SNP \
    -recalFile /lustre2/scratch/Carrera/Analysis2/snp.ug.recal.csv \
    -tranchesFile /lustre2/scratch/Carrera/Analysis2/snp.ug.tranches \
    -rscriptFile /lustre2/scratch/Carrera/Analysis2/snp.ug.plot.R \
    -U ALLOW_SEQ_DICT_INCOMPATIBILITY \
    --maxGaussians 6

I've already tried lowering --maxGaussians to 4, and I've also added the --percentBad option (raising it to 0.12, as for INDELs), but I still get the error.
I tried adding the -debug option to see what's happening, but apparently it has been removed in GATK 2.2.
Any help is appreciated.
Thanks

Answers

  • dcittaro Posts: 31 Member

    Apparently the error vanishes if I remove -an QD

  • Geraldine_VdAuwera Posts: 7,567 Administrator, GATK Developer
    Answer ✓

    Hmm, that might be a bug in the error reporting, we'll check it out. Thanks for the update.

    Geraldine Van der Auwera, PhD

  • rpoplin Posts: 122 GATK Developer
    edited November 2012

    Hi there,

    Can you please post the full log output from this run? It contains useful information about the annotation distributions, etc., that will help us narrow down any issues.

    Cheers,

  • dcittaro Posts: 31 Member

    I realized that my original command line had two entries for -an QD, which I believe was the cause of the whole thing...

  • Geraldine_VdAuwera Posts: 7,567 Administrator, GATK Developer

    Good catch, I didn't notice that. We still need to fix the error handling though, since the program clearly isn't handling the problem correctly.

    Geraldine Van der Auwera, PhD

  • Lavinia Posts: 37 Member

    This error also appears if you include annotations with -an that VQSR doesn't like. For example, I ran the command with lots of annotations and it failed with the above error; re-running with just -an BaseQRankSum -an DP -an FS -an HaplotypeScore -an MQ -an MQRankSum -an QD -an ReadPosRankSum succeeded.

  • Geraldine_VdAuwera Posts: 7,567 Administrator, GATK Developer

    Hmm, can you be more specific about which annotations were involved in the failure?

    Geraldine Van der Auwera, PhD

  • Lavinia Posts: 37 Member

    I put most of them in (-an AC, -an AF, etc.). It was a bit of a case of 'suck it and see' to get a better feel for the results, so probably not the ideal thing to do, but it did give me the same error as above. Cheers.
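
Since the root cause in this thread turned out to be a repeated -an QD, one way to catch that before launching a run is to scan the command line for annotation names that appear more than once. The following is only a minimal sketch in Python: the function name find_duplicate_annotations is hypothetical (it is not part of GATK), and it assumes annotations are always passed as separate "-an NAME" pairs, as in the command above.

```python
import shlex
from collections import Counter


def find_duplicate_annotations(command, flag="-an"):
    """Return names that follow `flag` more than once, in first-seen order.

    Hypothetical helper for illustration; not part of GATK.
    """
    # Treat backslash-newline continuations as plain whitespace before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    # Collect the token that follows each occurrence of the flag.
    names = [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == flag]
    counts = Counter(names)
    # dict.fromkeys de-duplicates while preserving insertion order.
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


if __name__ == "__main__":
    # Shortened version of the annotation arguments from the original post.
    cmd = "-T VariantRecalibrator -an QD -an HaplotypeScore -an FS -an MQ -an QD -mode SNP"
    print(find_duplicate_annotations(cmd))  # prints ['QD']
```

An empty list means no annotation was requested twice. This only checks for repeats; it does not say anything about whether a given annotation is suitable for VQSR.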
