
ERROR MESSAGE: Cannot enable index memory mapping for a SAM text reader

Hello Bob,

We've been successfully running GenomeSTRiP on more than 50 genomes, but now I am getting an error when running it on a new project with 13 genomes. They were preprocessed as three separate batches (two batches with 5 genomes and one batch with 3 genomes).
When I start discovery on the three metadata folders (using the -md option three times), I get this error:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.UnsupportedOperationException: Cannot enable index memory mapping for a SAM text reader
at net.sf.samtools.SAMTextReader.enableIndexMemoryMapping(SAMTextReader.java:107)
at net.sf.samtools.SAMFileReader.enableIndexMemoryMapping(SAMFileReader.java:230)
at org.broadinstitute.sv.dataset.DataSet.openSAMFile(DataSet.java:98)
at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runTraversal(DeletionDiscoveryAlgorithm.java:139)
at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:108)
at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:97)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:123)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:77)
at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-0-g28e02c2):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Cannot enable index memory mapping for a SAM text reader
ERROR ------------------------------------------------------------------------------------------

Do you have an idea what might be causing the error? I am running svtoolkit_1.04.1162.

And another question: I know it is recommended to pool at least 20 to 30 genomes for GenomeSTRiP. Is 13 genomes too few to be recommended, or is there something I need to watch out for when interpreting the results? All 13 genomes are high coverage, 30-40x.

Thanks a lot!
And best regards,

Anne-Katrin

Answers

  • bhandsaker (Member, Broadie, Moderator)

    The error message suggests that one of your input files is not a valid BAM file. The code is guessing that it is a SAM (text) file, but that may not be true. Maybe it is corrupted? I would try "file" and "samtools view" to check for bad input files.
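As a rough complement to the "file" and "samtools view" checks, the sketch below tests whether a file at least looks like a BAM file: it must decompress as gzip/BGZF, and the decompressed data must begin with the BAM magic bytes "BAM\1". The function name `looks_like_bam` is hypothetical, and the check only inspects the start of the file, so it will not detect truncation or corruption further in.

```python
import gzip

# The first four bytes of a decompressed BAM file, per the SAM/BAM specification.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if `path` decompresses as gzip/BGZF and starts with the BAM magic.

    Hypothetical helper for illustration; it checks only the file header.
    """
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError):
        # Not gzip-compressed, unreadable, or truncated in the first block.
        return False
```

Any file for which this returns False (for example a plain-text file passed where a BAM file is expected) is a candidate for the reader falling back to SAM text mode.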

    With respect to number of samples, I don't know precisely how much better the algorithms do with more samples.
    If you can use some of your other 50 samples as a background population (i.e. they are aligned to the same reference and their read lengths are at least as long), then you could try calling the 13 samples, then re-call (say) chr20 with the 13 + 50 and see if the results are better. I think it is not too bad to mix highcov/lowcov samples together (even when calling the highcov samples).

    If you do this, I would love to know how much the extra samples helped or didn't help.

  • akemde (NYC, Member)

    Thanks! We will run the test you suggested and let you know how that affected results.

    I verified that the input BAM files are intact, but I am still getting the SAM text reader error. Do you have any idea what else to look at?

  • akemde (NYC, Member)

    This was it! The file listing the BAM files was called .bamlist; renaming it to .list did the trick.
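One plausible reading of this fix, which the thread itself does not confirm, is that the tool decides from the file extension whether an input is a list of BAM files or an alignment file to open directly. Under that assumption, a small pre-flight check like the sketch below would flag inputs whose extension is neither. The function names and the set of recognised extensions are illustrative assumptions, not part of GenomeSTRiP or GATK.

```python
import os

# Assumed convention: ".list" marks a text file of BAM paths,
# while ".bam" and ".sam" mark alignment files.
LIST_EXTENSIONS = {".list"}
ALIGNMENT_EXTENSIONS = {".bam", ".sam"}


def classify_input(path):
    """Classify an input path by extension alone: 'list', 'alignment' or 'unknown'."""
    ext = os.path.splitext(path)[1].lower()
    if ext in LIST_EXTENSIONS:
        return "list"
    if ext in ALIGNMENT_EXTENSIONS:
        return "alignment"
    return "unknown"


def suspicious_inputs(paths):
    """Return the paths whose extension matches neither assumed convention."""
    return [p for p in paths if classify_input(p) == "unknown"]
```

For example, `suspicious_inputs(["batch1.bamlist", "batch2.list"])` would report only `batch1.bamlist`, the kind of name that caused the error here.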

    Thanks a lot!
