
Any way to get past "Clustering with this few variants and these annotations is unsafe."?

pierredewit (Member, Posts: 1)

Hi team, thanks for a great job developing this software!

I am planning to use the GATK in a class as a demo of how to do SNP detection and the VQSR in a non-model organism, but due to time constraints I have a very small dataset (12 samples of 100K reads each).

I run an initial round of SNP detection with a threshold of Q>20 and use the resulting calls as the "true" training set for the VQSR. I then use a call set with Q>3 as my variants of interest.
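For anyone reproducing this setup, the two-threshold split could be sketched roughly as below. This is a minimal illustration, assuming plain-text VCF input with QUAL in the sixth column. The function name `filter_vcf_by_qual` and the example records are hypothetical and are not part of the GATK.

```python
import io


def filter_vcf_by_qual(lines, min_qual):
    """Yield header lines plus records whose QUAL (6th column) exceeds min_qual."""
    for line in lines:
        if line.startswith("#"):
            # Keep all header lines so the output is still a valid VCF.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]
        # A "." means QUAL is missing; such records are skipped here.
        if qual != "." and float(qual) > min_qual:
            yield line


# Made-up records for illustration only (not real data).
EXAMPLE_VCF = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "contig1\t101\t.\tA\tG\t45.2\t.\t.\n"
    "contig1\t250\t.\tC\tT\t12.0\t.\t.\n"
    "contig2\t77\t.\tG\tA\t2.5\t.\t.\n"
)

# Q>20 calls serve as the "true" training set; Q>3 calls are the variants of interest.
training = list(filter_vcf_by_qual(io.StringIO(EXAMPLE_VCF), 20))
callset = list(filter_vcf_by_qual(io.StringIO(EXAMPLE_VCF), 3))

training_records = [l for l in training if not l.startswith("#")]
callset_records = [l for l in callset if not l.startswith("#")]
print(len(training_records), len(callset_records))
```

With real data, the same function could read from an open VCF file handle instead of the in-memory example.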

I keep getting the error message "NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)"

This is not surprising given how small the dataset is, but the error persists even though I have already set --maxGaussians 2 -percentBad 0.01 -minNumBad 50.

To reiterate, this is for educational purposes. Is there a way to move past this error message and still get an output file?

Thanks!

/Pierre De Wit
