ERROR MESSAGE: SAM/BAM file SAMFileReader is malformed

eem0306 (Korea, Republic of; Member)
edited January 2015 in Ask the GATK team

I tried to run MuTect with the following command line:

/usr/lib/jvm/java-1.6.0/bin/java -Xmx2g -jar /home/exman/muTect-1.1.4.jar \
--analysis_type MuTect \
--reference_sequence /data/eem0306/ref/1.fa \
--cosmic /data/eem0306/ref/b37_cosmic_v54_120711.vcf \
--dbsnp /data/eem0306/ref/dbsnp_138.hg19.nochr.vcf \
--input_file:normal /data/eem0306/somatic.caller/sample/N.rmdup.realigned.BQSR.h.bam \
--input_file:tumor /data/eem0306/somatic.caller/using.mutect/myAnalysis20/N20.result.sorted.h.qsort.bwasw.h.filter.sorted.bam \
--out /data/eem0306/somatic.caller/using.mutect/2nd.myAnalysis20/n20.results \
--coverage_file n20.coverage.wig.txt

but each time I ran it I got the same error message:

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.2-25-g2a68eab):
ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
ERROR Please do not post this error to the GATK forum
ERROR
ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: SAM/BAM file SAMFileReader{/data/eem0306/somatic.caller/using.mutect/myAnalysis20/N20.result.sorted.h.qsort.bwasw.h.filter.sorted.bam} is malformed: Read D0ENMACXX111207:7:1202:2132:140703 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK. Please use http://www.broadinstitute.org/gsa/wiki/index.php/ReplaceReadGroups to fix this problem
ERROR ------------------------------------------------------------------------------------------

The header of my tumor BAM file is:

@HD VN:1.4 GO:none SO:coordinate
@SQ SN:1 LN:249250621 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta AS:GRCh37 M5:1b22b98cdeb4a9304cb5d48026a85128 SP:Homo Sapiens
@RG ID:C09DF.1 PL:illumina PU:C09DFACXX111207.1.TTGAGCCT LB:Solexa-76163 DT:2011-12-07T14:00:00+0900 SM:HCC1143 BL CN:BI
@RG ID:C09DF.2 PL:illumina PU:C09DFACXX111207.2.TTGAGCCT LB:Solexa-76163 DT:2011-12-07T14:00:00+0900 SM:HCC1143 BL CN:BI
@RG ID:D0EN0.4 PL:illumina PU:D0EN0ACXX111207.4.TTGAGCCT LB:Solexa-76163 DT:2011-12-07T14:00:00+0900 SM:HCC1143 BL CN:BI
@RG ID:D0EN0.7 PL:illumina PU:D0EN0ACXX111207.7.TTGAGCCT LB:Solexa-76163 DT:2011-12-07T14:00:00+0900 SM:HCC1143 BL CN:BI
@RG ID:D0EN0.8 PL:illumina PU:D0EN0ACXX111207.8.TTGAGCCT LB:Solexa-76163 DT:2011-12-07T14:00:00+0900 SM:HCC1143 BL CN:BI
@RG ID:D0ENM.1 PL:illumina PU:D0ENMACXX111207.1.CCAGTTAG LB:Sage-75643 DT:2011-12-07T14:00:00+0900 SM:HCC1143 CN:BI
@RG ID:D0ENM.2 PL:illumina PU:D0ENMACXX111207.2.CCAGTTAG LB:Sage-75643 DT:2011-12-07T14:00:00+0900 SM:HCC1143 CN:BI
@RG ID:D0ENM.3 PL:illumina PU:D0ENMACXX111207.3.CCAGTTAG LB:Sage-75643 DT:2011-12-07T14:00:00+0900 SM:HCC1143 CN:BI
@RG ID:D0ENM.5 PL:illumina PU:D0ENMACXX111207.5.CCAGTTAG LB:Sage-75643 DT:2011-12-07T14:00:00+0900 SM:HCC1143 CN:BI
@RG ID:D0ENM.6 PL:illumina PU:D0ENMACXX111207.6.CCAGTTAG LB:Sage-75643 DT:2011-12-07T14:00:00+0900 SM:HCC1143 CN:BI
@RG ID:D0ENM.7 PL:illumina PU:D0ENMACXX111207.7.CCAGTTAG LB:Sage-75643 DT:2011-12-07T14:00:00+0900 SM:HCC1143 CN:BI
@PG ID:GATK IndelRealigner CL:knownAlleles=[(RodBinding name=knownAlleles source=/bio/lib/ref/1000G_phase1.indels.b37.vcf), (RodBinding name=knownAlleles2 source=/bio/lib/ref/Mills_and_1000G_gold_standard.indels.b37.vcf)] targetIntervals=n20t80.rmdup_intervals.list LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
@PG ID:GATK TableRecalibration VN:1.3-14-g59da26a CL:default_read_group=null default_platform=null force_read_group=null force_platform=null window_size_nqs=5 homopolymer_nback=7 exception_if_no_tile=false solid_recal_mode=SET_Q_ZERO solid_nocall_strategy=THROW_EXCEPTION recal_file=/seq/picard/D0ENMACXX/C1-210_2011-12-07_2011-12-18/1/Sage-75643/D0ENMACXX.1.recal_data.csv preserve_qscores_less_than=5 smoothing=1 max_quality_score=50 doNotWriteOriginalQuals=false no_pg_tag=false fail_with_no_eof_marker=false skipUQUpdate=false Covariates=[ReadGroupCovariate, QualityScoreCovariate, CycleCovariate, DinucCovariate]
@PG ID:GATK TableRecalibration.1 VN:1.3-14-g59da26a CL:default_read_group=null default_platform=null force_read_group=null force_platform=null window_size_nqs=5 homopolymer_nback=7 exception_if_no_tile=false solid_recal_mode=SET_Q_ZERO solid_nocall_strategy=THROW_EXCEPTION recal_file=/seq/picard/D0ENMACXX/C1-210_2011-12-07_2011-12-18/3/Sage-75643/D0ENMACXX.3.recal_data.csv preserve_qscores_less_than=5 smoothing=1 max_quality_score=50 doNotWriteOriginalQuals=false no_pg_tag=false fail_with_no_eof_marker=false skipUQUpdate=false Covariates=[ReadGroupCovariate, QualityScoreCovariate, CycleCovariate, DinucCovariate]
@PG ID:GATK TableRecalibration.2 VN:1.3-14-g59da26a CL:default_read_group=null default_platform=null force_read_group=null force_platform=null window_size_nqs=5 homopolymer_nback=7 exception_if_no_tile=false solid_recal_mode=SET_Q_ZERO solid_nocall_strategy=THROW_EXCEPTION recal_file=/seq/picard/D0ENMACXX/C1-210_2011-12-07_2011-12-18/7/Sage-75643/D0ENMACXX.7.recal_data.csv preserve_qscores_less_than=5 smoothing=1 max_quality_score=50 doNotWriteOriginalQuals=false no_pg_tag=false fail_with_no_eof_marker=false skipUQUpdate=false Covariates=[ReadGroupCovariate, QualityScoreCovariate, CycleCovariate, DinucCovariate]
@PG ID:GATK TableRecalibration.3 VN:1.3-14-g59da26a CL:default_read_group=null default_platform=null force_read_group=null force_platform=null window_size_nqs=5 homopolymer_nback=7 exception_if_no_tile=false solid_recal_mode=SET_Q_ZERO solid_nocall_strategy=THROW_EXCEPTION recal_file=/seq/picard/D0ENMACXX/C1-210_2011-12-07_2011-12-18/5/Sage-75643/D0ENMACXX.5.recal_data.csv preserve_qscores_less_than=5 smoothing=1 max_quality_score=50 doNotWriteOriginalQuals=false no_pg_tag=false fail_with_no_eof_marker=false skipUQUpdate=false Covariates=[ReadGroupCovariate, QualityScoreCovariate, CycleCovariate, DinucCovariate]
@PG ID:GATK TableRecalibration.4 VN:1.3-14-g59da26a CL:default_read_group=null default_platform=null force_read_group=null force_platform=null window_size_nqs=5 homopolymer_nback=7 exception_if_no_tile=false solid_recal_mode=SET_Q_ZERO solid_nocall_strategy=THROW_EXCEPTION recal_file=/seq/picard/D0ENMACXX/C1-210_2011-12-07_2011-12-18/6/Sage-75643/D0ENMACXX.6.recal_data.csv preserve_qscores_less_than=5 smoothing=1 max_quality_score=50 doNotWriteOriginalQuals=false no_pg_tag=false fail_with_no_eof_marker=false skipUQUpdate=false Covariates=[ReadGroupCovariate, QualityScoreCovariate, CycleCovariate, DinucCovariate]
@PG ID:GATK TableRecalibration.5 VN:1.3-14-g59da26a CL:default_read_group=null default_platform=null force_read_group=null force_platform=null window_size_nqs=5 homopolymer_nback=7 exception_if_no_tile=false solid_recal_mode=SET_Q_ZERO solid_nocall_strategy=THROW_EXCEPTION recal_file=/seq/picard/D0ENMACXX/C1-210_2011-12-07_2011-12-18/2/Sage-75643/D0ENMACXX.2.recal_data.csv preserve_qscores_less_than=5 smoothing=1 max_quality_score=50 doNotWriteOriginalQuals=false no_pg_tag=false fail_with_no_eof_marker=false skipUQUpdate=false Covariates=[ReadGroupCovariate, QualityScoreCovariate, CycleCovariate, DinucCovariate]
@PG ID:bwa PN:bwa VN:0.5.9-r16 CL:bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.1.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.1.Sage-75643.2.fastq.gz; bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.1.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.1.Sage-75643.1.fastq.gz; bwa sampe -P -f D0ENMACXX.1.Sage-75643.Homo_sapiens_assembly19.aligned_bwa.sam Homo_sapiens_assembly19.fasta D0ENMACXX.1.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.1.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.1.Sage-75643.1.fastq.gz D0ENMACXX.1.Sage-75643.2.fastq.gz
@PG ID:bwa.1 PN:bwa VN:0.5.9-r16 CL:bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.3.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.3.Sage-75643.2.fastq.gz; bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.3.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.3.Sage-75643.1.fastq.gz; bwa sampe -P -f D0ENMACXX.3.Sage-75643.Homo_sapiens_assembly19.aligned_bwa.sam Homo_sapiens_assembly19.fasta D0ENMACXX.3.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.3.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.3.Sage-75643.1.fastq.gz D0ENMACXX.3.Sage-75643.2.fastq.gz
@PG ID:bwa.2 PN:bwa VN:0.5.9-r16 CL:bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.7.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.7.Sage-75643.2.fastq.gz; bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.7.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.7.Sage-75643.1.fastq.gz; bwa sampe -P -f D0ENMACXX.7.Sage-75643.Homo_sapiens_assembly19.aligned_bwa.sam Homo_sapiens_assembly19.fasta D0ENMACXX.7.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.7.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.7.Sage-75643.1.fastq.gz D0ENMACXX.7.Sage-75643.2.fastq.gz
@PG ID:bwa.3 PN:bwa VN:0.5.9-r16 CL:bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.5.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.5.Sage-75643.2.fastq.gz; bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.5.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.5.Sage-75643.1.fastq.gz; bwa sampe -P -f D0ENMACXX.5.Sage-75643.Homo_sapiens_assembly19.aligned_bwa.sam Homo_sapiens_assembly19.fasta D0ENMACXX.5.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.5.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.5.Sage-75643.1.fastq.gz D0ENMACXX.5.Sage-75643.2.fastq.gz
@PG ID:bwa.4 PN:bwa VN:0.5.9-r16 CL:bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.6.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.6.Sage-75643.2.fastq.gz; bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.6.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.6.Sage-75643.1.fastq.gz; bwa sampe -P -f D0ENMACXX.6.Sage-75643.Homo_sapiens_assembly19.aligned_bwa.sam Homo_sapiens_assembly19.fasta D0ENMACXX.6.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.6.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.6.Sage-75643.1.fastq.gz D0ENMACXX.6.Sage-75643.2.fastq.gz
@PG ID:bwa.5 PN:bwa VN:0.5.9-r16 CL:bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.2.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.2.Sage-75643.2.fastq.gz; bwa aln Homo_sapiens_assembly19.fasta -q 5 -l 32 -k 2 -t 4 -o 1 -f D0ENMACXX.2.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.2.Sage-75643.1.fastq.gz; bwa sampe -P -f D0ENMACXX.2.Sage-75643.Homo_sapiens_assembly19.aligned_bwa.sam Homo_sapiens_assembly19.fasta D0ENMACXX.2.Sage-75643.Homo_sapiens_assembly19.1.sai D0ENMACXX.2.Sage-75643.Homo_sapiens_assembly19.2.sai D0ENMACXX.2.Sage-75643.1.fastq.gz D0ENMACXX.2.Sage-75643.2.fastq.gz
@PG ID:GATK PrintReads VN:3.3-0-g37228af CL:readGroup=null platform=null number=-1 sample_file=[] sample_name=[] simplify=false no_pg_tag=false
@CO aggregation_version=1

and some of the reads look like this:
D0ENMACXX111207:7:1202:2132:140703 163 1 10002 20 90M11S = 10181 274 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCAACCCTAACCCTAACCCATACTCTACCCAGTACCCTAACCCTAACCCTTACCCTAACCC =;?ACC?AEE.DDA7=7?CDDE
@70>;F################################################ AS:i:58 XS:i:54 XF:i:0 XE:i:1 NM:i:8 XT:i:1
D0ENMACXX111207:6:2202:6438:59394 163 1 10003 20 11M1I84M5S = 10183 280 ACCCTAACCCTAAACCCTNACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCGAACCCTAACCCTTACCC =>[email protected]
GBCD#[email protected]@@[email protected]=AAABEFCAABB*C=8=;F######################## AS:i:83 XS:i:79 XF:i:0 XE:i:1 NM:i:2 XT:i:1
@AFDCFEEGFDF
FFGECFEFGFCFFFGCDFFFGBAEFFGFCFFCF=DC<FE?DCFFBDEFF2?:F################################## AS:i:90 XS:i:80 XF:i:3 XE:i:2 NM:i:2
D0ENMACXX111207:3:2302:13424:148033 163 1 10011 22 101M = 10353 427 CCTAGCCCTAGCCCTAGCCCTAGCCCTAGCCCTAGCCCTAGCCCTAGCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAC <@[email protected]?GFFFE
[email protected]@CEGE<;GGCGED;@[email protected]?E;AC2;D2;BFDD############### AS:i:69 XS:i:63 XF:i:3 XE:i:2 NM:i:8
@=BCEEFCECCF;
F>@[email protected]>;[email protected][email protected]>DBF;[email protected]?/@8GCE<;<+:[email protected]@?>A?B?E####### AS:i:86 XS:i:76 XF:i:3 XE:i:1 NM:i:3

I should mention that I replaced the header of my BAM file with the header of the original BAM file.
Is the error caused by the read group (RG) information in the header not matching the RG information of the reads?
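
One way to test this suspicion is to compare the RG tag of every read against the @RG IDs declared in the header. Note that the reads shown above list only AS, XS, XF, XE, NM and XT tags, with no RG:Z: tag at all. The following is a minimal sketch, not part of GATK or MuTect: the function name `read_group_problems` is hypothetical, and it assumes tab-separated SAM text such as that produced by `samtools view -h`.

```python
def read_group_problems(sam_lines):
    """Return (read_name, reason) pairs for reads whose RG tag is
    missing or refers to an ID not declared in an @RG header line.

    Assumes tab-separated SAM text with the header before the reads.
    """
    header_ids = set()
    problems = []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: collect the ID of every @RG record.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        header_ids.add(field[3:])
            continue
        # Alignment line: optional tags start at the 12th column.
        rg = next((f[5:] for f in fields[11:] if f.startswith("RG:Z:")), None)
        if rg is None:
            problems.append((fields[0], "no RG tag"))
        elif rg not in header_ids:
            problems.append((fields[0], "RG %s not in header" % rg))
    return problems
```

Passing the lines of `samtools view -h` output for the tumor BAM to this function would list the affected reads. If every read comes back with "no RG tag", swapping the header alone could not have fixed the file, because the header only declares read groups and does not attach them to reads.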

How can I solve this problem?
