# SAM/BAM file has inconsistent mapping information

Member Posts: 103
edited January 2013

Hi,

I ran into an error at the IndelRealigner step with GATK v2.0, complaining that the SAM/BAM file has inconsistent mapping information.

Here is the command I used (full paths removed for clarity):

java -Xmx4g -jar /Path/GenomeAnalysisTK-2.1-8-g5efb575/bin/GenomeAnalysisTK.jar -T IndelRealigner -I /Path/myBam.bam -R /path/hg19.fa -targetIntervals /path/myBam.output.intervals -o /Path/my_realignedBam.bam -known /Path/bundle-1.5/hg19/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -known /Path/bundle-1.5/hg19/1000G_phase1.indels.hg19.vcf

Here is the error message I encountered:
...

##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/Path/myBam.bam} is malformed: read NCI-GA1:30:70BETAAXX:2:114:10000:10163 145 chr1 * 37 108M chr14 59648529 * GCAAGACCAACAAGAAGATCGCCATTGCTAACTGTGGACAACTCTAATAAATTTGGCTTGTGTTTTATCTTAGCCACCACACTGTTCTTTCTGTAGCTCAAGAGAGTA @?BEC@BCB@DB@;=8BAB<8BDDDEFIIHEIHI>I<IIDHHIHDIIII@GIDIIIIICIIHIHIHIIIIIIBIIHIIDIHIIIIIIFIDI has inconsistent mapping information.

##### ERROR ------------------------------------------------------------------------------------------

...

Has anybody encountered a similar issue? Advice would be greatly appreciated!

Mike

• Member, Dev Posts: 544 ✭✭✭✭

This looks like a malformed BAM: both the POS and TLEN fields are *, which is illegal according to the spec. The error is most likely "Inconsistent mapping information" because the parser treats the read as unaligned due to the missing POS, while the FLAGs specify that it is aligned.

• Member Posts: 103

Thanks for the input. However, the error message came from GATK, which somehow changed the read content a bit. If I pull the reads directly from the BAM file, the POS and TLEN fields look normal. Below are the actual reads I pulled from the BAM file, which look fine to me except for POS being 0; not sure if that is the issue. (The * characters shown for the read in the message above were introduced by the GATK error message; not sure why.)

NCI-GA1:30:70BETAAXX:2:114:10000:10163 145 chr1 0 37 108M chr14 59648529 0 GCAAGACCAACAAGAAGATCGCCATTGCTAACTGTGGACAACTCTAATAAATTTGGCTTGTGTTTTATCTTAGCCACCACACTGTTCTTTCTGTAGCTCAAGAGAGTA @?BEC@BCB@DB@;=8BAB<8BDDDEFIIHEIHI>I<IIDHHIHDIIII@GIDIIIIICIIHIHIHIIIIIIBIIHIIDIHIIIIIIFIDI X0:i:1 X1:i:0 MD:Z:0 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XN:i:107 XO:i:0 XT:A:N
NCI-GA1:30:70BETAAXX:2:114:10000:10163 97 chr14 59648529 37 108M chr1 0 0 TGGATGGCAAGCATGTGGTTTTTTGGCAAGGTAAAGACAGAAGGAATATCTTGGAAGGCACAGAGTGCTTTGGGTCCAGAAATGGCAAGACCAACAAGAAGATCGCCA HHHHHHHGHHEBHHHDGGBGGGEDGHHHHFHEHHHHHGHHGHHH>HHFHGHDHHHHHGDHHHH<HBHHFHDFFFBGHGHBEEB@EFHGEBDB3BB2@@>;@>;BB@@B@A X0:i:1 X1:i:0 MD:Z:108 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U

Plus, I have 5 BAM files in total; this is the only one that GATK v2.0 complained about, and the other 4 seem fine. (BTW, the exome-seq data came from an Illumina GA IIx and was mapped with bwa.)

Thanks in advance for any other advice!

Mike

• Member Posts: 103

Sorry, my bad, the two reads I pasted above should be separated (they ran together on screen):
NCI-GA1:30:70BETAAXX:2:114:10000:10163 145 chr1 0 37 108M chr14 59648529 0 GCAAGACCAACAAGAAGATCGCCATTGCTAACTGTGGACAACTCTAATAAATTTGGCTTGTGTTTTATCTTAGCCACCACACTGTTCTTTCTGTAGCTCAAGAGAGTA @?BEC@BCB@DB@;=8BAB<8BDDDEFIIHEIHI>I<IIDHHIHDIIII@GIDIIIIICIIHIHIHIIIIIIBIIHIIDIHIIIIIIFIDI X0:i:1 X1:i:0 MD:Z:0 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XN:i:107 XO:i:0 XT:A:N

NCI-GA1:30:70BETAAXX:2:114:10000:10163 97 chr14 59648529 37 108M chr1 0 0 TGGATGGCAAGCATGTGGTTTTTTGGCAAGGTAAAGACAGAAGGAATATCTTGGAAGGCACAGAGTGCTTTGGGTCCAGAAATGGCAAGACCAACAAGAAGATCGCCA HHHHHHHGHHEBHHHDGGBGGGEDGHHHHFHEHHHHHGHHGHHH>HHFHGHDHHHHHGDHHHH<HBHHFHDFFFBGHGHBEEB@EFHGEBDB3BB2@@>;@>;BB@@B@A X0:i:1 X1:i:0 MD:Z:108 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U

Just realized that a POS of 0 means unmapped. According to http://picard.sourceforge.net/explain-flags.html, the flags for the paired reads above decode as:

flag 97
Summary:
mate reverse strand
first in pair

flag 145
Summary:
second in pair

The flags do not say whether the reads are mapped or unmapped. Could that be the issue?

Thanks

Mike
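
For reference, the two FLAG values quoted in this thread can also be decoded by hand. Below is a minimal Python sketch, assuming the bit meanings given in the SAM specification; the `explain_flag` helper and its label strings are made up for illustration and are not part of Picard or GATK.

```python
# SAM FLAG bits as defined in the SAM specification.
# The label strings are informal descriptions chosen for this sketch.
FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def explain_flag(flag):
    """Return the descriptions of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]


print(explain_flag(97))   # ['read paired', 'mate reverse strand', 'first in pair']
print(explain_flag(145))  # ['read paired', 'read reverse strand', 'second in pair']
```

Neither value sets bit 0x4 (read unmapped) or bit 0x8 (mate unmapped), so both records claim that the read and its mate are mapped.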

• Member, Dev Posts: 544 ✭✭✭✭

Huh. I'm surprised that the content of the read was modified in the error message. The two reads you posted look legal to me, but they do both have the inconsistency that GATK complained about. Positions in SAM files are 1-based, so a value of 0 means "unknown" - which means the first read you posted aligned somewhere on chr1, but we don't know where. It's reasonable for GATK to consider this unmapped, which leads to the same scenario I outlined before.

The second read has the same problem, but this time in the position of the read's mate. Again, the flags say the mate is mapped but there's no position provided. GATK may not choke on this read, though, because it might not look at the mate position information.

I've never seen bwa output alignments like this. My best suggestion would be to try aligning them again (maybe this is a filesystem/threading hiccup?). I'll note that BLAT aligns that first read to chr14:59648613, so even the chr1 entry is probably wrong.
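
The inconsistency described in this reply (flags that say "mapped" combined with a position of 0) can be checked mechanically. The sketch below is only an illustration of that idea in Python; it is not the validation logic GATK or Picard actually run, and the `mapping_inconsistencies` function is a hypothetical helper. Real SAM records are tab-delimited; splitting on any whitespace is an assumption made here so that the space-separated lines pasted in this thread also parse.

```python
def mapping_inconsistencies(sam_line):
    """List mismatches between the mapped/unmapped flag bits and the positions."""
    # Only the first 8 columns are needed; the rest of the line is left unsplit.
    fields = sam_line.split(None, 8)
    flag = int(fields[1])
    rname, pos = fields[2], int(fields[3])
    rnext, pnext = fields[6], int(fields[7])

    problems = []
    # Bit 0x4 clear means the read claims to be mapped, so it needs a 1-based POS.
    if not flag & 0x4 and (rname == "*" or pos == 0):
        problems.append("read flagged as mapped but has no position")
    # Bit 0x1 set and bit 0x8 clear means the mate claims to be mapped as well.
    if flag & 0x1 and not flag & 0x8 and (rnext == "*" or pnext == 0):
        problems.append("mate flagged as mapped but has no position")
    return problems


# First nine columns of the two records posted above (SEQ, QUAL and tags omitted).
read_145 = "NCI-GA1:30:70BETAAXX:2:114:10000:10163 145 chr1 0 37 108M chr14 59648529 0"
read_97 = "NCI-GA1:30:70BETAAXX:2:114:10000:10163 97 chr14 59648529 37 108M chr1 0 0"

print(mapping_inconsistencies(read_145))  # ['read flagged as mapped but has no position']
print(mapping_inconsistencies(read_97))   # ['mate flagged as mapped but has no position']
```

Running a check like this over the whole file would show whether the problem is limited to this one pair or affects many reads, which may help decide whether realigning the sample is worthwhile.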

• Member Posts: 103
edited October 2012

Dear pdexheimer:

Thanks so much for the insight, which sounds very reasonable to me. I will realign this sample and see.

Thanks again and best
Mike