ERROR Stack trace: java.lang.NumberFormatException: For input string: "2520503224"

bmlett

I am working with GenomeSTRiP v2.0 on some bovine samples. Java is version 1.8.0_201. I created my own reference ploidy map as discussed on the Reference Genome Metadata page using the locations from my reference index:

X 139009144 2520503224 F 2
X 139009144 2520503224 M 1
Y 43300181 2661249986 F 0
Y 43300181 2661249986 M 1
* * * * 2

However, I am running into this error:

##### ERROR stack trace
java.lang.NumberFormatException: For input string: "2520503224"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:583)
at java.lang.Integer.parseInt(Integer.java:615)
at org.broadinstitute.sv.metadata.ploidy.PloidyMap.parsePloidyMapFile(PloidyMap.java:252)
at org.broadinstitute.sv.metadata.ploidy.PloidyMap.open(PloidyMap.java:55)
at org.broadinstitute.sv.metadata.depth.ComputeReadDepthCoverageWalker.initialize(ComputeReadDepthCoverageWalker.java:131)
at org.broadinstitute.sv.metadata.ComputeMetadataWalker.initialize(ComputeMetadataWalker.java:202)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:141)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:91)
at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:65)
##### ERROR ------------------------------------------------------------------------------------------

I am not sure whether this exception is caused by the contents of my ploidy map or by a limit in Java on the size of the integer.
Any insight would be greatly appreciated!

(I apologize for any format issues)


  bhandsaker

    I'm pretty sure it is the magnitude of the integer. Those are pretty big chromosomes!
    Are those sizes correct? I see that the Bos taurus genome is about 2.6 gigabases in total.

    While it would be relatively easy to fix this particular case, I believe there are many places in Genome STRiP where correctly handling individual chromosomes larger than 2Gb would require code changes.
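
    To illustrate the integer-size point: the stack trace shows the value going through Integer.parseInt, and a Java int tops out at 2147483647, which is smaller than 2520503224. The following is only a minimal sketch of that behavior, not Genome STRiP code; the class and method names are made up for illustration.

```java
public class IntRangeDemo {
    // Hypothetical helper: true if the text parses as a 32-bit signed int.
    static boolean fitsInInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            // Integer.parseInt throws this for values outside the int range.
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "2520503224"; // the coordinate from the posted ploidy map
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("fits in int: " + fitsInInt(value));
        // A 64-bit long can hold the same value without trouble.
        System.out.println("as long: " + Long.parseLong(value));
    }
}
```

    So any coordinate above roughly 2.1 billion will fail at this parsing step, whatever the rest of the file looks like.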

  bmlett

    You are most likely correct that there was an error in my chromosome sizes! I am not 100% sure where I found the end values, but I believe the issue was that the start should be 1 and the end should be the value I had listed as the start. Changing this seems to have allowed that step to start without errors.

    Thank you!
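
    For anyone hitting the same error, here is a rough pre-check for ploidy map lines. It assumes the field order is sequence, start, end, sex, ploidy, as implied by the fix described above, and that coordinates must fit in a 32-bit int, as the stack trace suggests. The class name and the checks are illustrative only and are not part of Genome STRiP.

```java
public class PloidyLineCheck {
    // Returns null if the line looks usable, otherwise a short description of the problem.
    // Assumed field order: sequence, start, end, sex, ploidy.
    static String check(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 5) {
            return "expected 5 fields, found " + f.length;
        }
        if (f[1].equals("*") && f[2].equals("*")) {
            return null; // wildcard default line, nothing to range-check
        }
        try {
            int start = Integer.parseInt(f[1]);
            int end = Integer.parseInt(f[2]);
            if (start < 1 || start > end) {
                return "bad interval " + start + "-" + end;
            }
        } catch (NumberFormatException e) {
            return "coordinate is not a 32-bit integer: " + e.getMessage();
        }
        return null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "X 139009144 2520503224 F 2", // as originally posted
            "X 1 139009144 F 2",          // after the fix described above
            "* * * * 2"                   // default ploidy line
        };
        for (String line : lines) {
            String problem = check(line);
            System.out.println(line + " -> " + (problem == null ? "ok" : problem));
        }
    }
}
```

    Running a map through a check like this before launching the pipeline should flag the originally posted X and Y lines while accepting the corrected ones.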
