The current GATK version is 3.2-2



# VariantRecalibrator - java.lang.NumberFormatException: For input string: "."


Hello, I am just trying VariantRecalibrator on my 4 samples:

java -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R gatk.ucsc.mm10.fa -input UnifiedGenotyper.output.snps.raw.vcf -nt 14 -recalFile file_for_ApplyRecalibration.recal -tranchesFile file_for_ApplyRecalibration.tranches -resource:sanger,known=false,training=true,truth=true mgp.v2.snps.annot.reformat.vcf -resource:dbnsp,known=true,training=false,truth=false,prior=6.0 mm10_dbsnp.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an InbreedingCoeff -mG 4 -percentBad 0.05

which starts running then gives me this error:

INFO 08:26:57,741 GATKRunReport - Uploaded run statistics report to AWS S3

##### ERROR ------------------------------------------------------------------------------------------

I'll try the VariantRecalibrator tool on some other data and have a closer look at the VCFs; maybe there is some quirk in them that is causing issues. Thanks.


Hi, sorry, I just discovered ValidateVariants (!). This gives me:

File mgp.v2.snps.annot.reformat.vcf fails strict validation: the REF allele is incorrect for the record at position chr1:3000054, fasta says G vs. VCF says C

I'll check this and update, thanks.
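For anyone wanting to see how widespread this kind of mismatch is, here is a rough sketch of the same REF check in plain Python. It is not how ValidateVariants is implemented; `load_fasta` and `ref_mismatches` are hypothetical helper names, and the whole reference is held in memory, so it is only sensible for a single chromosome or a small test extract rather than a full genome.

```python
def load_fasta(lines):
    """Parse FASTA lines into {contig_name: sequence} (hypothetical helper)."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous contig before starting a new one.
            if name is not None:
                seqs[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def ref_mismatches(vcf_lines, reference):
    """Return (chrom, pos, vcf_ref, fasta_ref) for records whose REF
    allele disagrees with the reference; fasta_ref is None when the
    contig is absent from the reference."""
    mismatches = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        seq = reference.get(chrom)
        if seq is None:
            mismatches.append((chrom, pos, ref, None))
            continue
        # VCF positions are 1-based; Python slices are 0-based.
        expected = seq[pos - 1 : pos - 1 + len(ref)]
        if expected.upper() != ref.upper():
            mismatches.append((chrom, pos, ref, expected))
    return mismatches


if __name__ == "__main__":
    # Toy data only, not taken from the files discussed in this thread.
    reference = load_fasta([">chr1 toy", "ACGT", "GGCC"])
    vcf = [
        "##fileformat=VCFv4.1",
        "chr1\t5\t.\tG\tA\t.\t.\t.",
        "chr1\t6\t.\tC\tT\t.\t.\t.",
    ]
    for chrom, pos, vcf_ref, fasta_ref in ref_mismatches(vcf, reference):
        print(f"{chrom}:{pos} fasta says {fasta_ref} vs. VCF says {vcf_ref}")
```

A large number of mismatches spread along a chromosome would point towards a different reference build or assembly rather than a handful of bad records.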


Hi Geraldine, thanks for that. I think it must be something within one of the VCFs, and exactly as you said, some information is missing. I'll try and chase this up and post my findings. Many thanks.
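One way to chase up missing information is to scan a VCF for INFO annotations whose value is the missing-value placeholder `.`. The sketch below assumes that the `.` in the NumberFormatException comes from one of the annotations passed with `-an`; the thread does not confirm which file or field is responsible, and `missing_annotations` is a hypothetical helper name, not part of GATK.

```python
def missing_annotations(vcf_lines, annotations):
    """Return (chrom, pos, key) for each requested INFO key whose value
    (or one of its comma-separated values) is the placeholder '.'."""
    wanted = set(annotations)
    hits = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record, skip it
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            # Flag entries (no '=') carry no value, so they are ignored.
            if key in wanted and sep and "." in value.split(","):
                hits.append((chrom, pos, key))
    return hits


if __name__ == "__main__":
    # Toy records only; annotation names mirror the -an options above.
    vcf = [
        "##fileformat=VCFv4.1",
        "chr1\t100\t.\tA\tG\t50\t.\tQD=.;MQ=50.0",
        "chr1\t200\t.\tC\tT\t60\t.\tQD=12.5;MQ=60.0;DB",
    ]
    for chrom, pos, key in missing_annotations(vcf, ["QD", "MQ", "FS"]):
        print(f"{chrom}:{pos} has {key}=.")
```

Running this over each input and resource VCF in turn would narrow down which file carries the `.` values, if that is indeed the cause.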


OK, so I think your suggestions have helped me find the problem. The VCF causing issues is from ftp://ftp-mouse.sanger.ac.uk/REL-1211-SNPs_Indels/README, so it is a valid VCF, but there must be differences between the reference sequence used by Sanger and the reference sequence that I have used. I'll download their reference, re-run my pipeline with it, and then retry VariantRecalibrator. If that doesn't fix it I'll re-post; otherwise, consider the problem solved! Thanks.