
# Interpreting the Gaussian mixture model plot

Boston, Posts: 37, Member

I got this plot after running VariantRecalibrator on 42 samples in a VCF file. As can be seen in the plot, no "known" variants are detected. What is the problem? Which walker do you recommend to solve this issue? Thanks.

Geraldine Van der Auwera, PhD

• Boston, Posts: 37, Member

I used the following commands:

    bsub -q short -W 12:0 -R "rusage[mem=32000]" -N -o /hms/scratch1/mahyar/error.log java -jar GenomeAnalysisTK-2.8-1-g932cd3a/GenomeAnalysisTK.jar \
        -T VariantRecalibrator \
        -R /hms/scratch1/mahyar/ucsc.hg19.fasta \
        --input /hms/scratch1/mahyar/Overal-42post-RGSM-allsites.vcf \
        --resource:dbsnp,VCF,known=false,training=true,truth=true,prior=6.0 /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/dbsnp_137.hg19.vcf \
        -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ \
        --mode SNP \
        -rf BadCigar \
        --recal_file /hms/scratch1/mahyar/All42_post_VQRS.recal \
        --tranches_file /hms/scratch1/mahyar/All42_post_VQRS.tranches \
        --rscript_file /hms/scratch1/mahyar/All42_post_VQRS_plots.R

You have no known variants in the plots because you are not providing any set of knowns: the one resource you provide is marked known=false.

The bigger problem here is that you're not following our Best Practices for variant recalibration. This command will give you very poor results. Please read the documentation on the Best Practices to learn how you should do this.
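For illustration only, here is a rough sketch of what a SNP-mode command with separate training, truth and known resources might look like. It is a dry run: it prints the command instead of launching GATK. The resource file names under `BUNDLE` and the helper name `build_vqsr_cmd` are assumptions, and the known/training/truth/prior values are the ones commonly quoted for SNP recalibration in this GATK generation, so check them against the Best Practices documentation for your version before using them.

```shell
#!/bin/sh
# Dry-run sketch: prints a VariantRecalibrator command, does not execute it.
# ASSUMPTIONS: the resource file names below are hypothetical placeholders for
# files in your own bundle; the priors are illustrative, verify them in the docs.
BUNDLE="/groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19"

build_vqsr_cmd() {
  # hapmap, omni, 1000G: training (and truth) sets used to build the model.
  # dbsnp: known=true only, so it labels variants as "known" in the plots
  # without being used for training.
  printf '%s ' \
    java -jar GenomeAnalysisTK.jar \
    -T VariantRecalibrator \
    -R /hms/scratch1/mahyar/ucsc.hg19.fasta \
    --input /hms/scratch1/mahyar/Overal-42post-RGSM-allsites.vcf \
    --resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 "${BUNDLE}/hapmap_3.3.hg19.vcf" \
    --resource:omni,VCF,known=false,training=true,truth=true,prior=12.0 "${BUNDLE}/1000G_omni2.5.hg19.vcf" \
    --resource:1000G,VCF,known=false,training=true,truth=false,prior=10.0 "${BUNDLE}/1000G_phase1.snps.high_confidence.hg19.vcf" \
    --resource:dbsnp,VCF,known=true,training=false,truth=false,prior=2.0 "${BUNDLE}/dbsnp_137.hg19.vcf" \
    -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ \
    --mode SNP \
    --recal_file /hms/scratch1/mahyar/All42_post_VQRS.recal \
    --tranches_file /hms/scratch1/mahyar/All42_post_VQRS.tranches \
    --rscript_file /hms/scratch1/mahyar/All42_post_VQRS_plots.R
  printf '\n'
}

# Print the command for inspection; paste it after your bsub options once checked.
build_vqsr_cmd
```

The point of the sketch is the split of roles: at least one resource carries known=true so that the plots can distinguish known from novel variants, while the training and truth sets are separate, higher-confidence call sets.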

• Boston, Posts: 37, Member

Do I need to use all four resources (hapmap, omni, 1000G, dbsnp) for VariantRecalibrator, or is one resource enough? I used only the "dbsnp" resource because that is what I used when calling variants with UnifiedGenotyper.

• Boston, Posts: 37, Member

I ran VariantRecalibrator again for only one sample and got the following error: What is the issue?