# VariantRecalibrator - java.lang.NumberFormatException: For input string: "."

Hello, I am just trying VariantRecalibrator on my 4 samples:

java -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R gatk.ucsc.mm10.fa -input UnifiedGenotyper.output.snps.raw.vcf -nt 14 -recalFile file_for_ApplyRecalibration.recal -tranchesFile file_for_ApplyRecalibration.tranches -resource:sanger,known=false,training=true,truth=true mgp.v2.snps.annot.reformat.vcf -resource:dbnsp,known=true,training=false,truth=false,prior=6.0 mm10_dbsnp.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an InbreedingCoeff -mG 4 -percentBad 0.05

which starts running, then gives me this error:

INFO 08:26:57,741 GATKRunReport - Uploaded run statistics report to AWS S3

##### ERROR ------------------------------------------------------------------------------------------

I'll try the VariantRecalibrator tool on some other data and have a closer look at the VCFs; maybe there is some quirk in them that is causing issues. Thanks.

Hi, sorry, I just discovered ValidateVariants (!). This gives me:

File mgp.v2.snps.annot.reformat.vcf fails strict validation: the REF allele is incorrect for the record at position chr1:3000054, fasta says G vs. VCF says C
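For anyone who wants to see how widespread such REF mismatches are, here is a minimal Python sketch of the same kind of check. It is not how ValidateVariants works internally; `read_fasta` and `ref_mismatches` are hypothetical helper names, and the sketch assumes the reference fits in memory (for a full mouse genome an indexed FASTA reader would be the better choice).

```python
def read_fasta(lines):
    """Parse FASTA lines into {contig_name: sequence} (whole file in memory)."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Contig name is the first word after '>'
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def ref_mismatches(vcf_lines, reference, limit=10):
    """Return up to `limit` tuples (chrom, pos, vcf_ref, fasta_ref).

    fasta_ref is None when the contig is absent from the reference.
    """
    mismatches = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref = line.rstrip("\n").split("\t")[:4]
        pos = int(pos)
        seq = reference.get(chrom)
        if seq is None:
            mismatches.append((chrom, pos, ref, None))
        else:
            # VCF positions are 1-based; Python slices are 0-based
            fasta_ref = seq[pos - 1:pos - 1 + len(ref)].upper()
            if fasta_ref != ref.upper():
                mismatches.append((chrom, pos, ref, fasta_ref))
        if len(mismatches) >= limit:
            break
    return mismatches
```

With the files from this thread it could be called roughly as `ref_mismatches(open("mgp.v2.snps.annot.reformat.vcf"), read_fasta(open("gatk.ucsc.mm10.fa")))`. Many mismatches, or contigs reported as missing, would point to a different reference build rather than a few bad records.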

I'll check this and update, thanks.

Hi Geraldine, thanks for that. I think it must be something within one of the VCFs, and exactly as you said, some information is missing. I'll try and chase this up and post my findings, many thanks.
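A quick way to look for missing values of this kind is to scan the INFO column for annotations whose value is a bare ".". The sketch below assumes the "." from the exception sits in one of the INFO annotations passed with -an, which I have not confirmed; `dot_valued_annotations` is a hypothetical helper name.

```python
from collections import Counter


def dot_valued_annotations(vcf_lines, keys):
    """Count records where an INFO annotation in `keys` has the value '.'."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a full VCF record, INFO is column 8
        for entry in fields[7].split(";"):
            key, sep, value = entry.partition("=")
            # Flag entries like 'QD=.' only; flags without '=' are ignored
            if sep and key in keys and value == ".":
                counts[key] += 1
    return counts
```

For the command above, `keys` would be `{"QD", "HaplotypeScore", "MQRankSum", "ReadPosRankSum", "FS", "MQ", "InbreedingCoeff"}`, and it is worth running over the input VCF as well as both resource VCFs.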

Ok, so I think your suggestions have helped me find the problem. The VCF causing issues is from ftp://ftp-mouse.sanger.ac.uk/REL-1211-SNPs_Indels/README, so it is a valid VCF, but there must be differences between the reference sequence used by Sanger and the reference sequence that I have used. I'll download their reference, re-run my pipeline with it and then retry VariantRecalibrator. If that doesn't fix it I'll re-post; otherwise, consider the problem solved! Thanks.
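Before re-running everything, a cheap first comparison of the two references is to look at their contig names and lengths in the FASTA index (.fai) files, where the first two tab-separated columns are the contig name and its length. This is only a sketch with hypothetical helper names (`read_fai`, `compare_contigs`), and matching lengths do not prove the bases are identical.

```python
def read_fai(lines):
    """Parse .fai lines into {contig_name: length}."""
    contigs = {}
    for line in lines:
        if line.strip():
            name, length = line.rstrip("\n").split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_contigs(mine, theirs):
    """Return (names only in mine, names only in theirs, length differences)."""
    only_mine = sorted(set(mine) - set(theirs))
    only_theirs = sorted(set(theirs) - set(mine))
    length_diff = {
        name: (mine[name], theirs[name])
        for name in mine
        if name in theirs and mine[name] != theirs[name]
    }
    return only_mine, only_theirs, length_diff
```

Names that appear on only one side (for example "chr1" versus "1") or differing lengths would both suggest the two files are not the same build or use different naming conventions.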