Error when running CNVDiscovery in batches: “Read count cache file is truncated”

Dear Genome STRiP users,

I am running the CNVDiscovery pipeline in batches, and it always fails in batch 4 and batch 23 with the following error:

INFO  02:38:02,459 RefineCNVBoundaries - Initialized data set: 1 file, 769 read groups, 98 samples. 
INFO  02:38:02,927 ReadCountCache - Initializing read count cache with 1 file. 
mInputFile=file:///proj/yunligrp/users/minzhi/gs_test_svpreprocess_fulllist_batch_success/4/md_tempdir/rccache.bin mCurrentSequenceName=chr16; mCurrentPosition=500001
Exception in thread "main" java.lang.RuntimeException: Read count cache file file:///proj/yunligrp/users/minzhi/gs_test_svpreprocess_fulllist_batch_success/4/md_tempdir/rccache.bin is truncated
    at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:65)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
    at org.broadinstitute.sv.commandline.CommandLineProgram.runAndReturnResult(CommandLineProgram.java:29)
    at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:25)
    at org.broadinstitute.sv.genotyping.RefineCNVBoundaries.main(RefineCNVBoundaries.java:133)
Caused by: java.lang.RuntimeException: Read count cache file file:///proj/yunligrp/users/minzhi/gs_test_svpreprocess_fulllist_batch_success/4/md_tempdir/rccache.bin is truncated
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader$ReadCountDataIterator.decodeRow(ReadCountFileReader.java:516)
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader$ReadCountDataIterator.getReadCacheItems(ReadCountFileReader.java:470)
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader$ReadCountDataIterator.aggregateSampleReadCounts(ReadCountFileReader.java:476)
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader.getReadCounts(ReadCountFileReader.java:266)
    at org.broadinstitute.sv.common.ReadCountCache.getReadCounts(ReadCountCache.java:100)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.computeRefReadCounts(GenotypingDepthModule.java:295)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.computeRefReadCounts(GenotypingDepthModule.java:245)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.getReadCounts(GenotypingDepthModule.java:230)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.getCnpReadCounts(GenotypingDepthModule.java:217)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.genotypeCnp(GenotypingDepthModule.java:141)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.genotypeCnp(BoundaryRefinementAlgorithm.java:287)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.refineOneBoundary(BoundaryRefinementAlgorithm.java:633)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.refineBoundaryStep(BoundaryRefinementAlgorithm.java:553)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.refineBoundaries(BoundaryRefinementAlgorithm.java:536)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.processVariant(BoundaryRefinementAlgorithm.java:232)
    at org.broadinstitute.sv.genotyping.RefineCNVBoundaries.run(RefineCNVBoundaries.java:204)
    at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:54)
    ... 5 more 
INFO  02:38:16,126 QGraph - Writing incremental jobs reports... 

I divided all 3418 samples into 33 batches: batches 0-31 contain 100 samples each, and batch 32 contains the remaining 218 samples.
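
For reference, the batch layout can be reproduced with a short Python sketch like the one below. The function name and the placeholder sample IDs are made up for illustration only; this is not part of Genome STRiP, and the real sample list would come from my own BAM list.

```python
def split_into_batches(samples, n_batches=33, batch_size=100):
    """Split samples into n_batches batches.

    The first n_batches - 1 batches hold batch_size samples each;
    the last batch takes whatever is left over.
    (Hypothetical helper, not a Genome STRiP function.)
    """
    head = [
        samples[i * batch_size:(i + 1) * batch_size]
        for i in range(n_batches - 1)
    ]
    tail = samples[(n_batches - 1) * batch_size:]
    return head + [tail]


# Placeholder IDs standing in for the real 3418 sample names.
samples = [f"sample_{i:04d}" for i in range(3418)]
batches = split_into_batches(samples)
sizes = [len(batch) for batch in batches]
```

With these inputs, batches 0-31 each contain 100 samples and batch 32 contains the remaining 218, which matches the layout of the failing runs (batch 4 and batch 23 are both ordinary 100-sample batches).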

In addition, I successfully ran the CNVDiscovery pipeline on all 3418 samples together in a single run. Does that mean there are no errors in my BAM files?

May I have your suggestions? Thank you in advance.

Best regards,
Wusheng
