Mutect2 failing during PON creation

Hi,

I'm running Mutect2 (GATK v4.1.4) on Terra using a modified version of the Somatic-SNVs-Indels-GATK4 workspace, trying to create a panel of normals, and it's failing with an error that I can't troubleshoot.

The command is (with mybucket and mysample standing in for the actual strings):

java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx3000m -jar /root/gatk.jar GetSampleName -R gs://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta -I gs://mybucket/mysample.bam -O tumor_name.txt -encode

Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/tmp.5cdcab87

The error I get is:
18:36:12.658 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 237.27721890900003
18:36:12.658 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 121.35 sec
18:36:12.660 INFO Mutect2 - Shutting down engine
[November 14, 2019 6:36:12 PM UTC] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 25.57 minutes.
Runtime.totalMemory()=432865280
htsjdk.samtools.util.RuntimeIOException: Expected to read 4 bytes, but expired stream after 0.
    at htsjdk.samtools.IndexStreamBuffer.readFully(IndexStreamBuffer.java:28)
    at htsjdk.samtools.IndexStreamBuffer.readInteger(IndexStreamBuffer.java:56)
    at htsjdk.samtools.AbstractBAMFileIndex.readInteger(AbstractBAMFileIndex.java:443)
    at htsjdk.samtools.AbstractBAMFileIndex.query(AbstractBAMFileIndex.java:272)
    at htsjdk.samtools.CachingBAMFileIndex.getQueryResults(CachingBAMFileIndex.java:159)
    at htsjdk.samtools.CachingBAMFileIndex.getSpanOverlapping(CachingBAMFileIndex.java:70)
    at htsjdk.samtools.BAMFileReader.getFileSpan(BAMFileReader.java:935)
    at htsjdk.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:952)
    at htsjdk.samtools.BAMFileReader.query(BAMFileReader.java:612)
    at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:533)
    at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.queryOverlapping(SamReader.java:405)
    at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.loadNextIterator(SamReaderQueryingIterator.java:125)
    at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.<init>(SamReaderQueryingIterator.java:66)
    at org.broadinstitute.hellbender.engine.ReadsDataSource.prepareIteratorsForTraversal(ReadsDataSource.java:404)
    at org.broadinstitute.hellbender.engine.ReadsDataSource.iterator(ReadsDataSource.java:330)
    at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.iterator(MultiIntervalLocalReadShard.java:134)
    at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.<init>(AssemblyRegionIterator.java:109)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:296)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:281)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
    at org.broadinstitute.hellbender.Main.main(Main.java:292)

Another sample and other shards run through this just fine. Thanks for any help -- I'm not sure what I need to post to make the issue clear, so thanks in advance for your patience. :)

Answers

  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator admin

    Hi @lck

    How was the best practices workflow modified?

  • lck Member
    edited November 2019

    Sorry, should have been clearer -- I just changed the reference data to hg38 (since the workflow has b37 by default) and added my own relevant interval list.
    @bhanuGandham

  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator admin

    Hi @lck

    I looked this up and it seems like this might be a connection-timeout issue. I have informed the Terra team about this and someone from that team will get back to you shortly.

  • SChaluvadi Member, Broadie, Moderator admin
    edited December 2019

    Hello @lck - I am looking into this and will get back to you shortly. In the meantime, would you please share your stdout and stderr logs? Additionally, can you check that the sample file is not corrupt? We have seen SAM files accidentally named with the .bam extension. What does the header of the BAM file look like?
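One rough way to run the SAM-named-as-BAM check is to look at the first few bytes of the file. The sketch below is only a heuristic and assumes the file has first been copied out of the bucket to a local path. The function name `sniff_alignment_file` and its return labels are made up for illustration and are not part of GATK or htsjdk. It relies on two facts: a BAM file is gzip (BGZF) compressed and its decompressed content starts with the magic bytes `BAM\1`, whereas a SAM file with a header is plain text whose first line starts with `@`.

```python
import gzip
import zlib


def sniff_alignment_file(path):
    """Guess the real format of an alignment file from its leading bytes.

    Returns one of: "bam", "gzip-not-bam", "sam", "unknown".
    Heuristic only: a SAM file without a header will come back as "unknown".
    """
    with open(path, "rb") as fh:
        head = fh.read(2)

    if head == b"\x1f\x8b":
        # gzip/BGZF container: decompress the start and look for the BAM magic.
        try:
            with gzip.open(path, "rb") as gz:
                magic = gz.read(4)
        except (OSError, EOFError, zlib.error):
            # Truncated or damaged compressed stream.
            return "unknown"
        return "bam" if magic == b"BAM\x01" else "gzip-not-bam"

    if head[:1] == b"@":
        # Plain-text SAM header lines (@HD, @SQ, @RG, ...) start with '@'.
        return "sam"

    return "unknown"
```

For example, `sniff_alignment_file("mysample.bam")` returning `"sam"` would suggest a text SAM file carrying a .bam extension. This only inspects the start of the file, so it says nothing about truncation further in or about the state of the accompanying index.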
