Hi there,
I get an error when I try to run GATK with the following command:
java -jar GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar -T RealignerTargetCreator -R reference.fa -I merged_bam_files_indexed_markduplicate.bam -o reads.intervals
This is the error:
SAM/BAM file SAMFileReader{/merged_bam_files_indexed_markduplicate.bam} is malformed: Read HWI-ST303_0093:5:5:13416:34802#0 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK. Please use http://gatkforums.broadinstitute.org/discussion/59/companion-utilities-replacereadgroups to fix this problem
This suggests a header issue, but my BAM file does have a header with read groups:
samtools view -h merged_bam_files_indexed_markduplicate.bam | grep ^@RG
@RG ID:test1 PL:Illumina PU:HWI-ST303 LB:test PI:75 SM:test CN:japan
@RG ID:test2 PL:Illumina PU:HWI-ST303 LB:test PI:75 SM:test CN:japan
When I grep for the read named in the error, I get:
HWI-ST303_0093:5:5:13416:34802#0 99 1 1090 29 23S60M17S = 1150 160 TGTTTGGGTTGAAGATTGATACTGGAAGAAGATTAGAATTGTAGAAAGGGGAAAACGATGTTAGAAAGTTAATACGGCTTACTCCAGATCCTTGGATCTC GGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGFGGGGGGGGGDGFGFGGGGGFEDFGEGGGDGEG?FGGDDGFFDGGEDDFFFFEDG?E MD:Z:60 PG:Z:MarkDuplicates RG:Z:test1 XG:i:0 AM:i:29 NM:i:0 SM:i:29 XM:i:0 XO:i:0 XT:A:M
Following the suggested Picard solution:
java -XX:MaxDirectMemorySize=4G -jar picard-tools-1.85/AddOrReplaceReadGroups.jar I= test.bam O= test.header.bam SORT_ORDER=coordinate RGID=test RGLB=test RGPL=Illumina RGSM=test/ RGPU=HWI-ST303 RGCN=japan CREATE_INDEX=True
I get this error after about 2 minutes:
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 12247781, Read name HWI-ST303_0093:5:26:10129:50409#0, MAPQ should be 0 for unmapped read.
Any recommendations on how to solve this issue?
My plan is to do the following to resolve the issue:
picard/MarkDuplicates.jar I=test.bam O=test_markduplicate.bam M=test.matrix AS=true VALIDATION_STRINGENCY=LENIENT
samtools index test_markduplicate.bam
While MarkDuplicates runs I see many messages like the one below, but the command keeps going:
Ignoring SAM validation error: ERROR: Record (number), Read name HWI-ST303_0093:5:5:13416:34802#0, RG ID on SAMRecord not found in header: test1
Then I will try the GATK RealignerTargetCreator again.
I already tried the following:
java -jar GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar -T RealignerTargetCreator -R reference.fa -I merged_bam_files_indexed_markduplicate.bam -o reads.intervals --validation_strictness LENIENT
But I still got the same error.
N.B.: the same command ran without issue with GATK version 1.2.
My pipeline in short (mapping the paired-end reads, then sorting, merging, and marking duplicates):
bwa aln -q 20 ref.fa read > files.sai
bwa sampe ref.fa file1.sai file2.sai read1 read2 > test1.sam
samtools view -bS test1.sam | samtools sort - test
samtools index test1.bam
samtools merge -rh RG.txt test test1.bam test2.bam
Contents of RG.txt:
@RG ID:test1 PL:Illumina PU:HWI-ST303 LB:test PI:75 SM:test CN:japan
@RG ID:test2 PL:Illumina PU:HWI-ST303 LB:test PI:75 SM:test CN:japan
samtools index test.bam
picard/MarkDuplicates.jar I=test.bam O=test_markduplicate.bam M=test.matrix AS=true VALIDATION_STRINGENCY=SILENT
samtools index test_markduplicate.bam
Answers
You need to fix your SAM file before you can proceed with any GATK analysis. I would recommend not using lenient validation with the Picard tools; use strict validation and fix any problems that come up. Otherwise you're just going to get more problems later on.
If you can't figure out how to fix your SAM file, I would recommend going back to the original data and reprocessing it. Validate your files at every step.
Geraldine Van der Auwera, PhD
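One way to act on "validate your files at every step" for the read-group error in the question is to compare the RG tags on the reads with the @RG IDs declared in the header. The sketch below is a minimal, unofficial illustration in Python and is not part of GATK, Picard, or samtools; the function name `read_group_report` is hypothetical. It assumes SAM text as input (for example the output of `samtools view -h`). It splits header lines on tabs because the SAM format defines header fields as tab-delimited, so an @RG line whose fields are separated by spaces will not register its ID here.

```python
from collections import Counter


def read_group_report(sam_lines):
    """Compare RG tags on reads against @RG IDs declared in the header.

    Returns (header_ids, undefined, missing):
      header_ids: set of IDs found on tab-delimited @RG header lines
      undefined:  dict of read group -> read count, for groups used by
                  reads but not declared in the header
      missing:    number of reads that carry no RG:Z: tag at all
    """
    header_ids = set()
    used = Counter()
    missing = 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: only tab-delimited @RG records are recognised.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        header_ids.add(field[3:])
            continue
        # Alignment line: optional tags start at the 12th column.
        rg = next((t[5:] for t in fields[11:] if t.startswith("RG:Z:")), None)
        if rg is None:
            missing += 1
        else:
            used[rg] += 1
    undefined = {rg: n for rg, n in used.items() if rg not in header_ids}
    return header_ids, undefined, missing
```

Passing it the lines produced by `samtools view -h merged_bam_files_indexed_markduplicate.bam` would list any read group that reads refer to but the header does not declare, along with the number of reads that have no RG tag.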
See http://sourceforge.net/apps/mediawiki/picard/index.php?title=Main_Page#Q:Why_am_I_getting_errors_from_Picard_like.22MAPQ_should_be_0_for_unmapped_read.22_or_.22CIGAR_should_have_zero_elements_for_unmapped_read.3F.22
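To see how many records trigger the "MAPQ should be 0 for unmapped read" message quoted in the question, a small check over the SAM text can help. This is only a sketch under stated assumptions, not Picard's own validation code: the function name `unmapped_with_nonzero_mapq` is hypothetical, the input is assumed to be SAM text lines, and "unmapped" is taken to mean that bit 0x4 is set in the FLAG column.

```python
def unmapped_with_nonzero_mapq(sam_lines):
    """Yield (read_name, flag, mapq) for unmapped reads whose MAPQ is not 0."""
    for line in sam_lines:
        # Skip header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        mapq = int(fields[4])   # column 5: mapping quality
        # Bit 0x4 in FLAG marks the read itself as unmapped.
        if flag & 0x4 and mapq != 0:
            yield fields[0], flag, mapq
```

Counting the results over the output of `samtools view test.bam` gives a rough idea of how widespread the problem is before deciding how to fix or reprocess the file.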