Exception when processing alignment for BAM index

I ran SplitNCigarReads with the following command.

java -jar GenomeAnalysisTK.jar -T SplitNCigarReads -R test.fasta -I sorted2.bam -o sorted2_split.bam -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS

The run ended with the log and error messages shown below.
I created the index (.bai file) with samtools index.
The same procedure previously worked with a different reference sequence.
How can I solve this problem?

INFO 15:08:19,279 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:08:19,284 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 15:08:19,284 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 15:08:19,284 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 15:08:19,288 HelpFormatter - Program Args: -T SplitNCigarReads -R test.fasta -I sorted2.bam -o sorted2_split.bam -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS
INFO 15:08:19,294 HelpFormatter - Executing as [email protected] on Linux 2.6.32-358.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14.
INFO 15:08:19,295 HelpFormatter - Date/Time: 2017/01/23 15:08:19
INFO 15:08:19,295 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:08:19,295 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:08:19,972 GenomeAnalysisEngine - Strictness is SILENT
INFO 15:08:20,056 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 15:08:20,065 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 15:08:20,098 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 15:08:20,211 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 15:08:20,217 GenomeAnalysisEngine - Done preparing for traversal
INFO 15:08:20,217 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 15:08:20,218 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 15:08:20,218 ProgressMeter - Location | reads | elapsed | reads | completed | runtime | runtime
INFO 15:08:20,252 ReadShardBalancer$1 - Loading BAM index data
INFO 15:08:20,254 ReadShardBalancer$1 - Done loading BAM index data
INFO 15:08:51,252 ProgressMeter - chr1:88350198 300086.0 31.0 s 103.0 s 0.6% 85.1 m 84.6 m
INFO 17:15:37,400 ProgressMeter - chr14:480923759 2.6647719E7 2.1 h 4.8 m 100.0% 2.1 h 0.0 s
INFO 17:15:57,978 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A BAM/CRAM ERROR has occurred (version 3.5-0-g36282e4):
ERROR
ERROR This means that there is something wrong with the BAM/CRAM file(s) you provided.
ERROR The error message below tells you what is the problem.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum until you have followed these instructions:
ERROR - Make sure that your BAM file is well-formed by running Picard's validator on it
ERROR (see http://picard.sourceforge.net/command-line-overview.shtml#ValidateSamFile for details)
ERROR - Ensure that your BAM index is not corrupted: delete the current one and regenerate it with 'samtools index'
ERROR - Ensure that your CRAM index is not corrupted: delete the current one and regenerate it with
ERROR 'java -jar cramtools-3.0.jar index --bam-style-index --input-file --reference-fasta-file '
ERROR (see https://github.com/enasequence/cramtools/tree/v3.0 for details)
ERROR
ERROR MESSAGE: Exception when processing alignment for BAM index M02283:62:000000000-AC0LV:1:2104:17035:8414 1/2 219b aligned read.
ERROR ------------------------------------------------------------------------------------------

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    It looks like you have at least one read that is malformed in some way. Try validating your bam file with Picard ValidateSamFile. There's a doc in the "Common Problems" documentation that explains how to do it and proposes some solutions.
  • Mizuno (Japan; Member)

    Thank you for your comment.
    I ran Picard ValidateSamFile and got the following output.
    What should I do?

    ERROR: Read name w2, The platform (PL) attribute (platform) + was not one of the valid values for read group
    WARNING: Record 1, Read name M02283:57:000000000-AAY0N:1:1108:27989:19613, NM tag (nucleotide differences) is missing
    WARNING: Record 2, Read name M02283:29:000000000-A6W63:1:2118:20560:24621, NM tag (nucleotide differences) is missing
    WARNING: Record 3, Read name M02283:57:000000000-AAY0N:1:1102:18438:19899, NM tag (nucleotide differences) is missing
    WARNING: Record 4, Read name M02283:57:000000000-AAY0N:1:1116:9553:7908, NM tag (nucleotide differences) is missing
    WARNING: Record 5, Read name M02283:57:000000000-AAY0N:1:1116:9553:7908, NM tag (nucleotide differences) is missing
    WARNING: Record 6, Read name M02283:57:000000000-AAY0N:1:1108:13829:21753, NM tag (nucleotide differences) is missing

  • Mizuno (Japan; Member)

    I then ran Picard ValidateSamFile with IGNORE_WARNINGS=true and MODE=VERBOSE.
    It reported the following messages.

    ERROR: Read name w2, The platform (PL) attribute (platform) + was not one of the valid values for read group
    ERROR: Unexpected number of metadata chunks 3
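
When the warnings outnumber the errors, as with the NM tag messages above, it can help to separate the two. The following is a minimal sketch, assuming the validator output has been saved as text and that each message starts with "ERROR:" or "WARNING:" as in the excerpts in this thread. The function name split_report is hypothetical and not part of Picard.

```python
def split_report(lines):
    """Separate validator messages into error lines and a warning count.

    Assumes each message begins with 'ERROR:' or 'WARNING:', as in the
    excerpts quoted in this thread. Other lines are ignored.
    """
    errors = []
    warning_count = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith("ERROR:"):
            errors.append(line)
        elif line.startswith("WARNING:"):
            warning_count += 1
    return errors, warning_count


if __name__ == "__main__":
    # Messages copied from the posts above.
    report = [
        "ERROR: Read name w2, The platform (PL) attribute (platform) + was not one of the valid values for read group",
        "WARNING: Record 1, Read name M02283:57:000000000-AAY0N:1:1108:27989:19613, NM tag (nucleotide differences) is missing",
        "ERROR: Unexpected number of metadata chunks 3",
    ]
    errors, warning_count = split_report(report)
    for message in errors:
        print(message)
    print("warnings:", warning_count)
```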

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    That sounds bad -- your bam seems to be malformed or damaged. What program did you use to generate it?
  • Mizuno (Japan; Member)

    I mapped the reads against the reference using STAR.
    Then I ran Picard AddOrReplaceReadGroups.
    The same procedure worked without problems with the older version of the reference.
    I wonder if the large chromosome size (more than 500 Mbp) is causing the problem.
    Previously I cut the chromosome sequences so that each was shorter than 500 Mbp.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    Oh yes -- most of the tools are unable to correctly process contigs longer than 512 Mb. It's a known limitation.
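
One way to see in advance whether a reference is affected is to check the contig lengths in its FASTA index. This is a minimal sketch, assuming the 512 Mb figure means 2^29 = 536,870,912 bases and that the reference has a tab-separated .fai index as written by samtools faidx (name in column 1, length in column 2). The function name and the example contigs are hypothetical.

```python
import io

# Assumption: "512 Mb" is taken to mean 2**29 bases.
MAX_CONTIG_LEN = 2 ** 29


def oversized_contigs(fai_lines, limit=MAX_CONTIG_LEN):
    """Return (name, length) pairs for contigs longer than `limit`.

    `fai_lines` is any iterable of lines from a .fai file
    (tab-separated: name, length, offset, line bases, line width).
    """
    result = []
    for raw in fai_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        if length > limit:
            result.append((name, length))
    return result


if __name__ == "__main__":
    # Hypothetical index contents, for illustration only.
    example = io.StringIO(
        "chrA\t480000000\t6\t60\t61\n"
        "chrB\t600000000\t488000012\t60\t61\n"
    )
    for name, length in oversized_contigs(example):
        print(f"{name}\t{length}")
```

With a real reference, the same function can be given an open file handle, for example `open("test.fasta.fai")`, provided that index file exists.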
  • Mizuno (Japan; Member)

    Is there any way to solve the problem?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    For now the workaround is to segment the chromosome, as you've apparently done previously. But we have a ticket to see if that may be improved. Would you be able to share your reference and a snippet of the bam file around the problem region? It would help us investigate possible solutions.
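
The segmentation workaround can be sketched as follows. This is only an illustration, not the procedure used in this thread: the "_partN" naming, the "offset=" header field, and the fixed-size cut are all assumptions. It reads each sequence fully into memory, and a fixed-size cut can fall inside a gene, so for real data the split points would probably need to be chosen by hand.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for raw in handle:
        line = raw.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the name.
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_sequence(name, seq, max_len):
    """Yield (segment_name, offset, segment) pieces no longer than max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    for i, start in enumerate(range(0, len(seq), max_len), start=1):
        yield f"{name}_part{i}", start, seq[start:start + max_len]


def write_segmented(in_handle, out_handle, max_len, width=60):
    """Write every sequence as segments of at most max_len bases.

    The 0-based offset of each segment in the original sequence is
    recorded in the header so coordinates can be converted back later.
    """
    for name, seq in read_fasta(in_handle):
        for seg_name, offset, seg in split_sequence(name, seq, max_len):
            out_handle.write(f">{seg_name} offset={offset}\n")
            for i in range(0, len(seg), width):
                out_handle.write(seg[i:i + width] + "\n")


if __name__ == "__main__":
    # Tiny hypothetical input; a real run would use a limit below 2**29.
    source = io.StringIO(">chr1 example\nACGTAC\nGT\n>chr2\nAC\n")
    target = io.StringIO()
    write_segmented(source, target, max_len=5)
    print(target.getvalue(), end="")
```

Reads would then need to be mapped against the segmented reference, and any positions reported on a segment shifted by its offset to recover coordinates on the original chromosome.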