Segment Mean must be finite

esalinas (Broad) Member, Broadie ✭✭✭

I ran the GATK4 AllelicCNV program and got an error message saying "Segment Mean must be finite." (see below).

I looked at the input files I gave the program, and no values were listed as "Inf", "Infinity", "-Inf", or anything like that.
All the values I saw (sorting numerically on the column with "sort -g") ranged from about -8 to about 8, with most values near 0. Since the values stay within +/- 8, I'm confused about the source of the error. Is this a bug or a known issue?

-eddie

14:59:47.184 INFO  AllelicCNV - Shutting down engine
[June 20, 2017 2:59:47 PM UTC] org.broadinstitute.hellbender.tools.exome.AllelicCNV done. Elapsed time: 0.09 minutes.
Runtime.totalMemory()=910688256
java.lang.IllegalArgumentException: Segment Mean must be finite.
        at org.broadinstitute.hellbender.utils.param.ParamUtils.isFinite(ParamUtils.java:180)
        at org.broadinstitute.hellbender.tools.exome.ModeledSegment.<init>(ModeledSegment.java:22)
        at org.broadinstitute.hellbender.tools.exome.SegmentUtils.toModeledSegment(SegmentUtils.java:548)
        at org.broadinstitute.hellbender.tools.exome.SegmentUtils.lambda$readModeledSegmentsFromSegmentFile$721(SegmentUtils.java:185)
        at org.broadinstitute.hellbender.utils.tsv.TableUtils$1.createRecord(TableUtils.java:112)
        at org.broadinstitute.hellbender.utils.tsv.TableReader.fetchNextRecord(TableReader.java:355)
        at org.broadinstitute.hellbender.utils.tsv.TableReader.access$200(TableReader.java:94)
        at org.broadinstitute.hellbender.utils.tsv.TableReader$1.hasNext(TableReader.java:458)
        at java.util.Iterator.forEachRemaining(Iterator.java:115)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
        at org.broadinstitute.hellbender.tools.exome.SegmentUtils.readSegmentFile(SegmentUtils.java:421)
        at org.broadinstitute.hellbender.tools.exome.SegmentUtils.readModeledSegmentsFromSegmentFile(SegmentUtils.java:184)
        at org.broadinstitute.hellbender.tools.exome.AllelicCNV.runPipeline(AllelicCNV.java:290)
        at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:38)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:189)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:122)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:143)
        at org.broadinstitute.hellbender.Main.main(Main.java:221)

Answers

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie admin

    It might be a bug -- what version are you using?

  • esalinas (Broad) Member, Broadie ✭✭✭

    @Geraldine_VdAuwera I believe this is the JAR gatk-package-4.alpha.2-1136-gc18e780-SNAPSHOT-local.jar, so the commit looks to be "gc18e780".

  • slee Member, Broadie, Dev ✭✭

    @esalinas The segment means in the .seg file should not be in log_2 space (this file is the output of the GATK CNV tool, as indicated in the javadoc), but the tangent-normalized coverages in the tn.tsv should be in log_2 space.

    @shlee Perhaps it's worth making this more explicit in the javadoc.

    Future versions of the CNV pipeline will only deal with raw (integer) read-count coverage and will only output non-log_2 copy-ratio estimates, which will hopefully prevent such sources of confusion.

    GitHub issue #3156, opened by shlee (state: open)
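
One way to see how log_2-space values could trigger this error: if the reader takes log_2 of each segment mean (an assumption here, suggested by the stack trace but not confirmed in this thread), then any zero or negative input maps to a non-finite value. Below is a minimal Python sketch of that check. The function names and the sample values are made up for illustration and are not part of GATK.

```python
import math


def log2_or_non_finite(value: float) -> float:
    """Return log2(value); -inf for zero and NaN for negative input."""
    if value > 0:
        return math.log2(value)
    if value == 0:
        return float("-inf")
    return float("nan")


def non_finite_rows(segment_means):
    """Return the indices of segment means whose log2 would not be finite."""
    return [
        i
        for i, mean in enumerate(segment_means)
        if not math.isfinite(log2_or_non_finite(mean))
    ]


# Hypothetical segment means that are already in log2 space.
means = [0.12, -0.8, 1.5, 0.0]
print(non_finite_rows(means))
```

Running a check like this over the segment-mean column before calling the tool would flag the rows that cannot be log-transformed, even though none of them is literally "Inf".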
  • esalinas (Broad) Member, Broadie ✭✭✭
    edited June 2017

    @slee thanks for this info about which files are expected to be in log-2 space. I would guess that any data file with a "Mean" or "Average" column containing fractional values, particularly negative ones, is in log-2 space, and that a file whose values are non-negative integers or have only zeroes after the decimal point is not. I can write a script that transforms a file by taking log-2 or computing 2^x.
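
A transformation script of the kind described above might look like the following Python sketch. It assumes a tab-separated file with a header row and no comment lines, and the column name `Segment_Mean` in the usage example is an assumption, so check it against the actual file header. The function `convert_column` is hypothetical and not part of GATK.

```python
import csv
import io
import math


def convert_column(tsv_text: str, column: str, to_log2: bool) -> str:
    """Convert one numeric column of a TSV between linear and log2 space.

    to_log2=True  applies log2(x) and rejects non-positive values.
    to_log2=False applies 2**x.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        value = float(row[column])
        if to_log2:
            if value <= 0:
                raise ValueError(f"cannot take log2 of non-positive value {value}")
            row[column] = repr(math.log2(value))
        else:
            row[column] = repr(2.0 ** value)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-row file whose means are in log2 space.
log2_text = "Chromosome\tSegment_Mean\n1\t-1.0\n2\t0.0\n"
print(convert_column(log2_text, "Segment_Mean", to_log2=False))
```

Raising an error on non-positive input in the log2 direction makes a mismatch in the expected space visible early, instead of silently writing NaN or -inf into the output.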

  • shlee (Cambridge) Member, Broadie, Moderator admin

    @slee, I've updated the docs.
