VariantRecalibrator - java.lang.NumberFormatException: For input string: "."

Lavinia Member Posts: 37

Hello, I am just trying VariantRecalibrator on my 4 samples:

java -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R gatk.ucsc.mm10.fa -input UnifiedGenotyper.output.snps.raw.vcf -nt 14 -recalFile file_for_ApplyRecalibration.recal -tranchesFile file_for_ApplyRecalibration.tranches -resource:sanger,known=false,training=true,truth=true mgp.v2.snps.annot.reformat.vcf -resource:dbnsp,known=true,training=false,truth=false,prior=6.0 mm10_dbsnp.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an InbreedingCoeff -mG 4 -percentBad 0.05

which starts running then gives me this error:
INFO 08:26:57,741 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NumberFormatException: For input string: "."
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.valueOf(Integer.java:582)
at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec.decodeInts(AbstractVCFCodec.java:680)
at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec.createGenotypeMap(AbstractVCFCodec.java:641)
at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec$LazyVCFGenotypesParser.parse(AbstractVCFCodec.java:92)
at org.broadinstitute.sting.utils.variantcontext.LazyGenotypesContext.decode(LazyGenotypesContext.java:130)
at org.broadinstitute.sting.utils.variantcontext.LazyGenotypesContext.getGenotypes(LazyGenotypesContext.java:120)
at org.broadinstitute.sting.utils.variantcontext.GenotypesContext.iterator(GenotypesContext.java:461)
at org.broadinstitute.sting.utils.variantcontext.VariantContext.getCalledChrCount(VariantContext.java:922)
at org.broadinstitute.sting.utils.variantcontext.VariantContext.getCalledChrCount(VariantContext.java:908)
at org.broadinstitute.sting.utils.variantcontext.VariantContext.isMonomorphicInSamples(VariantContext.java:937)
at org.broadinstitute.sting.utils.variantcontext.VariantContext.isPolymorphicInSamples(VariantContext.java:948)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.isValidVariant(VariantDataManager.java:278)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.parseTrainingSets(VariantDataManager.java:263)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.map(VariantRecalibrator.java:259)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.map(VariantRecalibrator.java:107)
at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:248)
at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:73)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-9-ge5ebf34):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: For input string: "."
ERROR ------------------------------------------------------------------------------------------

I've used all three VCFs in other GATK tools without issues.
Any help greatly appreciated! Many thanks, Lavinia.
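
The stack trace above fails in `decodeInts`, called from `createGenotypeMap` during `parseTrainingSets`, so a `.` was read where an integer was expected in a per-sample (genotype) column of one of the VCFs. A minimal Python sketch for locating such records follows. It assumes that `AD` and `PL` are the integer-list FORMAT keys worth checking; that choice, and the function names, are illustrative and not taken from GATK.

```python
# Assumption: AD and PL are the integer-list FORMAT keys to check.
# Adjust this set for the FORMAT keys present in your own files.
INT_LIST_KEYS = {"AD", "PL"}


def is_int(text):
    """Return True if text parses as a base-10 integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def find_bad_int_fields(lines, keys=INT_LIST_KEYS):
    """Yield (chrom, pos, sample, key, value) for every per-sample field
    whose comma-separated elements are not all integers."""
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names follow the FORMAT column
            continue
        if len(cols) < 10:
            continue  # sites-only record: no genotype columns to check
        fmt = cols[8].split(":")
        for index, col in enumerate(cols[9:]):
            # Fall back to a positional name if the header line was missing.
            name = samples[index] if index < len(samples) else f"sample{index}"
            for key, value in zip(fmt, col.split(":")):
                if key in keys and not all(is_int(x) for x in value.split(",")):
                    yield cols[0], int(cols[1]), name, key, value
```

To use it, pass an open file handle, for example `for hit in find_bad_int_fields(open("mgp.v2.snps.annot.reformat.vcf")): print(hit)`. A field that is entirely `.` is also reported, so the reader can decide whether that counts as a problem for the GATK version in use.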

Answers

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,139 admin

    Can you validate your vcf, just to make sure there's nothing wrong with the file itself?

    You should also try running with the latest version.

    Geraldine Van der Auwera, PhD

  • Lavinia Member Posts: 37

    Hi Geraldine, thanks so much for the prompt response. We are holding off on upgrading so that we don't run into any issues around academic versus commercial use (99.9% of our work is academic, but there is a chance some of it could cross over). Using vcftools validation, two VCFs validate immediately with no issues; the third gives a warning:
    The header tag 'contig' not present for CHROM=chr1. (Not required but highly recommended.)
    (for every chromosome).
    I'll add these headers in and repeat, and comment back on any progress, thank you.
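
A minimal sketch of generating the missing `##contig` header lines in Python. It assumes a FASTA index (`.fai`) for the reference is available, for instance one produced by `samtools faidx gatk.ucsc.mm10.fa`; the function name is illustrative. Only the first two tab-separated columns of the index (sequence name and length) are used.

```python
def contig_header_lines(fai_lines):
    """Build VCF ##contig header lines from the lines of a FASTA index."""
    headers = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed index lines
        name, length = fields[0], int(fields[1])
        headers.append(f"##contig=<ID={name},length={length}>")
    return headers
```

The resulting lines belong in the VCF header, before the `#CHROM` line. The lengths come from whichever reference the index was built from, so they are only meaningful if the VCF was actually called against that reference.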

  • Lavinia Member Posts: 37

    Hi Geraldine, I have corrected the one VCF with the header issues but unfortunately am still getting exactly the same error:

    INFO 14:31:23,744 ProgressMeter - chr1:18999977 8.42e+05 36.0 s 42.7 s 0.7% 86.2 m 85.6 m
    WARN 14:31:23,901 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately 3888 seconds. Retrying connection.
    INFO 14:31:24,600 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.NumberFormatException: For input string: "."
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:481)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec.decodeInts(AbstractVCFCodec.java:680)
    at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec.createGenotypeMap(AbstractVCFCodec.java:641)
    at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec$LazyVCFGenotypesParser.parse(AbstractVCFCodec.java:92)
    at org.broadinstitute.sting.utils.variantcontext.LazyGenotypesContext.decode(LazyGenotypesContext.java:130)
    at org.broadinstitute.sting.utils.variantcontext.LazyGenotypesContext.getGenotypes(LazyGenotypesContext.java:120)
    at org.broadinstitute.sting.utils.variantcontext.GenotypesContext.iterator(GenotypesContext.java:461)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.getCalledChrCount(VariantContext.java:922)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.getCalledChrCount(VariantContext.java:908)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.isMonomorphicInSamples(VariantContext.java:937)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.isPolymorphicInSamples(VariantContext.java:948)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.isValidVariant(VariantDataManager.java:278)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.parseTrainingSets(VariantDataManager.java:263)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.map(VariantRecalibrator.java:259)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.map(VariantRecalibrator.java:107)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:248)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
    at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:73)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.3-9-ge5ebf34):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: For input string: "."
    ERROR ------------------------------------------------------------------------------------------

    I'll try the VariantRecalibrator tool on some other data and have a closer look at the VCFs, maybe there is some quirk in them that is causing issues, thanks.

  • Lavinia Member Posts: 37

    Hi, sorry, I just discovered ValidateVariants (!). This gives me:

    File mgp.v2.snps.annot.reformat.vcf fails strict validation: the REF allele is incorrect for the record at position chr1:3000054, fasta says G vs. VCF says C

    I'll check this and update, thanks.
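
The kind of check reported above, comparing each record's REF allele with the reference FASTA, can be sketched in a few lines of Python. This is an illustration of the idea and not how ValidateVariants is implemented; the function names are hypothetical. It loads the whole FASTA into memory, which is fine for a small test but heavy for a full mouse genome, where an indexed reader would be a better fit.

```python
def read_fasta(lines):
    """Read FASTA lines into a dict of {sequence name: sequence string}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def ref_mismatches(vcf_lines, seqs):
    """Yield (chrom, pos, vcf_ref, fasta_ref) for records whose REF allele
    differs from the reference; fasta_ref is None for an unknown contig."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref = line.rstrip("\n").split("\t")[:4]
        start = int(pos) - 1  # VCF positions are 1-based
        seq = seqs.get(chrom)
        found = None if seq is None else seq[start:start + len(ref)]
        if found is None or found.upper() != ref.upper():
            yield chrom, int(pos), ref, found
```

Many mismatches spread across a file, as opposed to a handful, would suggest that the VCF was produced against a different reference build or sequence than the FASTA being checked.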

  • Lavinia Member Posts: 37

    Hi Geraldine, thanks for that. I think it must be something within one of the VCFs, and exactly as you said, some information is missing. I'll try and chase this up and post my findings, many thanks.

  • Lavinia Member Posts: 37

    OK, so I think your suggestions have helped me find the problem. The VCF causing issues is from ftp://ftp-mouse.sanger.ac.uk/REL-1211-SNPs_Indels/README, so it is a valid VCF, but there must be differences between the reference sequence used by Sanger and the reference sequence that I have used. I'll download their reference, re-run my pipeline with it, and then retry VariantRecalibrator. If that doesn't fix it I'll re-post; otherwise, consider the problem solved! Thanks.
