Error when running CNVDiscovery in a batch-like way: “Read count cache file is truncated”

Dear Genome STRiP users,

I am running the CNVDiscovery pipeline in batches, and it always fails in batch 4 and batch 23 with the following error:

INFO  02:38:02,459 RefineCNVBoundaries - Initialized data set: 1 file, 769 read groups, 98 samples. 
INFO  02:38:02,927 ReadCountCache - Initializing read count cache with 1 file. 
mInputFile=file:///proj/yunligrp/users/minzhi/gs_test_svpreprocess_fulllist_batch_success/4/md_tempdir/rccache.bin mCurrentSequenceName=chr16; mCurrentPosition=500001
Exception in thread "main" java.lang.RuntimeException: Read count cache file file:///proj/yunligrp/users/minzhi/gs_test_svpreprocess_fulllist_batch_success/4/md_tempdir/rccache.bin is truncated
    at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:65)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
    at org.broadinstitute.sv.commandline.CommandLineProgram.runAndReturnResult(CommandLineProgram.java:29)
    at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:25)
    at org.broadinstitute.sv.genotyping.RefineCNVBoundaries.main(RefineCNVBoundaries.java:133)
Caused by: java.lang.RuntimeException: Read count cache file file:///proj/yunligrp/users/minzhi/gs_test_svpreprocess_fulllist_batch_success/4/md_tempdir/rccache.bin is truncated
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader$ReadCountDataIterator.decodeRow(ReadCountFileReader.java:516)
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader$ReadCountDataIterator.getReadCacheItems(ReadCountFileReader.java:470)
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader$ReadCountDataIterator.aggregateSampleReadCounts(ReadCountFileReader.java:476)
    at org.broadinstitute.sv.metadata.depth.ReadCountFileReader.getReadCounts(ReadCountFileReader.java:266)
    at org.broadinstitute.sv.common.ReadCountCache.getReadCounts(ReadCountCache.java:100)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.computeRefReadCounts(GenotypingDepthModule.java:295)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.computeRefReadCounts(GenotypingDepthModule.java:245)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.getReadCounts(GenotypingDepthModule.java:230)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.getCnpReadCounts(GenotypingDepthModule.java:217)
    at org.broadinstitute.sv.genotyping.GenotypingDepthModule.genotypeCnp(GenotypingDepthModule.java:141)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.genotypeCnp(BoundaryRefinementAlgorithm.java:287)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.refineOneBoundary(BoundaryRefinementAlgorithm.java:633)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.refineBoundaryStep(BoundaryRefinementAlgorithm.java:553)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.refineBoundaries(BoundaryRefinementAlgorithm.java:536)
    at org.broadinstitute.sv.genotyping.BoundaryRefinementAlgorithm.processVariant(BoundaryRefinementAlgorithm.java:232)
    at org.broadinstitute.sv.genotyping.RefineCNVBoundaries.run(RefineCNVBoundaries.java:204)
    at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:54)
    ... 5 more 
INFO  02:38:16,126 QGraph - Writing incremental jobs reports... 

I divided all 3418 samples into 33 batches: batches #0-31 contain 100 samples each, and batch #32 contains the remaining 218.
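
The batch layout described above can be sketched as follows. This is only a minimal illustration of the arithmetic (32 x 100 + 218 = 3418), not the script actually used; the function name `split_into_batches` and the placeholder sample IDs are hypothetical.

```python
def split_into_batches(samples, batch_size, n_batches):
    """Split samples into n_batches: the first n_batches - 1 batches get
    batch_size samples each, and the last batch takes everything left over."""
    head = (n_batches - 1) * batch_size
    if n_batches < 1 or head >= len(samples):
        raise ValueError("not enough samples for the requested layout")
    # Fixed-size batches #0 .. #(n_batches - 2)
    batches = [samples[i:i + batch_size] for i in range(0, head, batch_size)]
    # Final batch absorbs the remainder
    batches.append(samples[head:])
    return batches


# Placeholder sample IDs (hypothetical); the real IDs come from the BAM list.
samples = [f"sample_{i:04d}" for i in range(3418)]
batches = split_into_batches(samples, batch_size=100, n_batches=33)
sizes = [len(b) for b in batches]
print(len(batches), sizes[0], sizes[-1], sum(sizes))
```

Checking that the batch sizes sum to the total sample count is a quick way to confirm that no sample was dropped or duplicated when the batches were built.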

In addition, I successfully ran the CNVDiscovery pipeline on all 3418 samples together in a single run. Does that mean there are no errors in my BAM files?

May I have your suggestions? Thank you in advance.

Best regards,
Wusheng
