Hi,
I ran into an error at the IndelRealigner step with GATK v2.0, which complains that a SAM/BAM record has inconsistent mapping information.
Here is the command I used (full paths removed for clarity):
java -Xmx4g -jar /Path/GenomeAnalysisTK-2.1-8-g5efb575/bin/GenomeAnalysisTK.jar -T IndelRealigner -I /Path/myBam.bam -R /path/hg19.fa -targetIntervals /path/myBam.output.intervals -o /Path/my_realignedBam.bam -known /Path/bundle-1.5/hg19/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -known /Path/bundle-1.5/hg19/1000G_phase1.indels.hg19.vcf
Here is the error message I encountered: ...
:10000:10163 145 chr1 * 37 108M chr14 59648529 * GCAAGACCAACAAGAAGATCGCCATTGCTAACTGTGGACAACTCTAATAAATTTGGCTTGTGTTTTATCTTAGCCACCACACTGTTCTTTCTGTAGCTCAAGAGAGTA @?BEC@BCB@DB@;=8BAB<8BDDDEFIIHEIHI>I<IIDHHIHDIIII@GIDIIIIICIIHIHIHIIIIIIBIIHIIDIHIIIIIIFIDI has inconsistent mapping information.
...
Has anybody encountered a similar issue? Any advice would be greatly appreciated!
Mike
Huh. I'm surprised that the content of the read was modified in the error message. The two reads you posted look legal to me, but they do both have the inconsistency that GATK complained about. Positions in SAM files are 1-based, so a value of 0 means "unknown" - which means the first read you posted aligned somewhere on chr1, but we don't know where. It's reasonable for GATK to consider this unmapped, which leads to the same scenario I outlined before.
The second read has the same problem, but this time in the position of the read's mate. Again, the flags say the mate is mapped but there's no position provided. GATK may not choke on this read, though, because it might not look at the mate position information.
I've never seen bwa output alignments like this. My best suggestion would be to try aligning them again (maybe this is a filesystem/threading hiccup?). I'll note that BLAT aligns that first read to chr14:59648613, so even the chr1 entry is probably wrong.
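The rule described in this answer (the flags say "mapped" but the position is 0) can be sketched in a few lines of Python. This is only an illustration, not GATK's or Picard's actual validation code: the function name `mapping_inconsistencies` is hypothetical, and the bit values 0x1, 0x4 and 0x8 are the paired, read-unmapped and mate-unmapped flags from the SAM specification. The two example records are abbreviated versions of the reads in this thread, with SEQ and QUAL replaced by `*` for brevity.

```python
# Illustrative sketch only: not GATK's or Picard's implementation.
PAIRED = 0x1         # read is part of a pair
READ_UNMAPPED = 0x4  # the read itself is unmapped
MATE_UNMAPPED = 0x8  # the read's mate is unmapped


def mapping_inconsistencies(sam_line):
    """Return a list of problems found in one tab-separated SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    pos = int(fields[3])    # POS is 1-based; 0 means "no position"
    pnext = int(fields[7])  # PNEXT (mate position) uses the same convention
    problems = []
    if not (flag & READ_UNMAPPED) and pos == 0:
        problems.append("read flagged as mapped but POS is 0")
    if (flag & PAIRED) and not (flag & MATE_UNMAPPED) and pnext == 0:
        problems.append("mate flagged as mapped but PNEXT is 0")
    return problems


# Abbreviated copies of the two reads from this thread (SEQ/QUAL set to "*").
name = "NCI-GA1:30:70BETAAXX:2:114:10000:10163"
read1 = "\t".join([name, "145", "chr1", "0", "37", "108M",
                   "chr14", "59648529", "0", "*", "*"])
read2 = "\t".join([name, "97", "chr14", "59648529", "37", "108M",
                   "chr1", "0", "0", "*", "*"])

for record in (read1, read2):
    print(record.split("\t")[1], mapping_inconsistencies(record))
```

Under these assumptions the first record is reported for its own position and the second for its mate's position, which matches the description above.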
Answers
This looks like a malformed BAM: both the POS and TLEN fields are *, which is illegal according to the spec. The error is most likely "Inconsistent mapping information" because the parser treats the read as unaligned due to the lack of a POS, while the FLAGs specify that it is aligned.
Thanks for the input. However, the error message came from GATK, which somehow changed the read content a bit. When I pull the reads directly from the BAM file, the POS and TLEN fields look normal. Below are the actual reads from the BAM file; they look fine to me except that POS is 0, and I'm not sure whether that is the issue. (The * characters in the message above were introduced by the GATK error message; I'm not sure why.)
NCI-GA1:30:70BETAAXX:2:114:10000:10163 145 chr1 0 37 108M chr14 59648529 0 GCAAGACCAACAAGAAGATCGCCATTGCTAACTGTGGACAACTCTAATAAATTTGGCTTGTGTTTTATCTTAGCCACCACACTGTTCTTTCTGTAGCTCAAGAGAGTA @?BEC@BCB@DB@;=8BAB<8BDDDEFIIHEIHI>I<IIDHHIHDIIII@GIDIIIIICIIHIHIHIIIIIIBIIHIIDIHIIIIIIFIDI X0:i:1 X1:i:0 MD:Z:0 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XN:i:107 XO:i:0 XT:A:N NCI-GA1:30:70BETAAXX:2:114:10000:10163 97 chr14 59648529 37 108M chr1 0 0 TGGATGGCAAGCATGTGGTTTTTTGGCAAGGTAAAGACAGAAGGAATATCTTGGAAGGCACAGAGTGCTTTGGGTCCAGAAATGGCAAGACCAACAAGAAGATCGCCA HHHHHHHGHHEBHHHDGGBGGGEDGHHHHFHEHHHHHGHHGHHH>HHFHGHDHHHHHGDHHHH<HBHHFHDFFFBGHGHBEEB@EFHGEBDB3BB2@@>@>BB@@B@A X0:i:1 X1:i:0 MD:Z:108 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U
Also, I have 5 BAM files in total. This is the only one that GATK v2.0 complained about; the other 4 seem fine. (BTW, the exome-seq data came from an Illumina GA IIx and was mapped with bwa.)
Thanks in advance for any other advice!
Mike
Sorry, my bad: the two reads I pasted above should have been separated (they ran together on screen).
NCI-GA1:30:70BETAAXX:2:114:10000:10163 145 chr1 0 37 108M chr14 59648529 0 GCAAGACCAACAAGAAGATCGCCATTGCTAACTGTGGACAACTCTAATAAATTTGGCTTGTGTTTTATCTTAGCCACCACACTGTTCTTTCTGTAGCTCAAGAGAGTA @?BEC@BCB@DB@;=8BAB<8BDDDEFIIHEIHI>I<IIDHHIHDIIII@GIDIIIIICIIHIHIHIIIIIIBIIHIIDIHIIIIIIFIDI X0:i:1 X1:i:0 MD:Z:0 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XN:i:107 XO:i:0 XT:A:N
NCI-GA1:30:70BETAAXX:2:114:10000:10163 97 chr14 59648529 37 108M chr1 0 0 TGGATGGCAAGCATGTGGTTTTTTGGCAAGGTAAAGACAGAAGGAATATCTTGGAAGGCACAGAGTGCTTTGGGTCCAGAAATGGCAAGACCAACAAGAAGATCGCCA HHHHHHHGHHEBHHHDGGBGGGEDGHHHHFHEHHHHHGHHGHHH>HHFHGHDHHHHHGDHHHH<HBHHFHDFFFBGHGHBEEB@EFHGEBDB3BB2@@>@>BB@@B@A X0:i:1 X1:i:0 MD:Z:108 RG:Z:70BETAAXX_Sample_F4 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U
I just realized that a POS of 0 means unmapped. Based on http://picard.sourceforge.net/explain-flags.html, the flags for the paired reads above are:
flag 97: read paired, mate reverse strand, first in pair
flag 145: read paired, read reverse strand, second in pair
The flags do not say mapped or unmapped. Could that be the issue?
Thanks
Mike
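For the two flags quoted above, a small decoder makes the point explicit. This is a minimal sketch, assuming the bit meanings from the SAM specification; the function name `decode_flag` is hypothetical and the labels are paraphrased. Note that the flag encodes "unmapped" rather than "mapped": if the "read unmapped" (0x4) or "mate unmapped" (0x8) bit is absent, the read or its mate is being declared mapped.

```python
# Bit meanings follow the SAM specification; label wording is paraphrased.
SAM_FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "read is PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


for flag in (97, 145):
    print(flag, decode_flag(flag))
# 97 ['read paired', 'mate reverse strand', 'first in pair']
# 145 ['read paired', 'read reverse strand', 'second in pair']
```

Neither 97 nor 145 sets 0x4 or 0x8, so both records claim that the read and its mate are mapped, which conflicts with a position of 0.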
Dear pdexheimer,
Thanks so much for the insight, which sounds very reasonable to me. I will realign this sample and see.
Thanks again and best,
Mike