bwa mem and GATK error

meharmehar Member
edited August 2014 in Ask the GATK team

Hi,

I have used bwa mem to align with the below command:

bwa mem -R '@RG\tID:X\tLB:Y\tSM:Z\tPL:ILLUMINA' ref.fa seq1.fastq seq2.fastq | samtools view -bS - > alignment.bam

Then I used the latest version of GATK to create intervals for realignment around indels with RealignerTargetCreator, which gives the error shown below:

Command:

/apps/technic/jdk1.7.0_45/bin/java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R ref.fa -I alignment.bam -o realign.intervals
Picked up _JAVA_OPTIONS: -Xmx10G
INFO 22:38:37,624 HelpFormatter - --------------------------------------------------------------------------------
INFO 22:38:37,627 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.2-2-gec30cee, Compiled 2014/07/17 15:22:03
INFO 22:38:37,627 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 22:38:37,628 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 22:38:37,634 HelpFormatter - Program Args: -T RealignerTargetCreator -R ref.fa -I alignment.bam -o realign.intervals
INFO 22:38:37,783 HelpFormatter - Executing as [email protected] on Linux 2.6.18-371.3.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_45-b18.
INFO 22:38:37,784 HelpFormatter - Date/Time: 2014/08/13 22:38:37
INFO 22:38:37,784 HelpFormatter - --------------------------------------------------------------------------------
INFO 22:38:37,784 HelpFormatter - --------------------------------------------------------------------------------
INFO 22:38:38,466 GenomeAnalysisEngine - Strictness is SILENT
INFO 22:38:38,604 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 22:38:38,617 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 22:38:40,292 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.2-2-gec30cee):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: SAM/BAM file alignment.bam is malformed: Error parsing SAM header. Problem parsing @PG key:value pair ID:X clashes with ID:bwa. Line:
ERROR @PG ID:bwa PN:bwa VN:0.7.10-r789 CL:bwa mem -R @RG ID:X LB:Y SM:Z PL:ILLUMINA ref.fa R1_001_filtered.fastq R2_001_filtered.fastq; File alignment.bam; Line number 43

I recently updated to the latest version and encountered this error, which did not occur with the previous version.
Could someone give suggestions or ideas to fix this?

Thanks in advance!!

Answers

  • jamjam Member

    That looks annoying, and it doesn't appear to be covered by the SAM spec as I interpret it. Maybe bwa should put the command line on a comment line, or GATK could stop checking for an ID once it has found one, though the spec doesn't say it should.

    Since you asked for ideas: you need to remove or mangle the second ID key:value pair on the @PG line, the part that comes from your bwa mem command line. I'd just change it to cl_ID:X or something. So...

    (Apologies to GATK, I'm more used to samtools for this, so...)

    samtools view -H alignment.bam > alignment.header

    [ edit the alignment.header file: on the @PG line, change the second ID tag from "@RG ID:X" to "@RG cl_ID:X", then save it ]

    samtools reheader alignment.header alignment.bam > alignment.fixed.bam

    Then try again with alignment.fixed.bam.
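The manual header edit described above could also be scripted. The following is only a sketch, assuming the stray fields on the @PG line are separated by real tab characters (which is what the clash in the error message suggests). The function name `fix_pg_ids` is made up for illustration. It renames every ID: field after the first on each @PG line to cl_ID:, and leaves the other stray fields (LB, SM, PL) untouched, just like the hand edit.

```shell
# Hypothetical helper: rename every ID: tag after the first one on @PG lines
# (ID:X -> cl_ID:X). Reads a SAM header on stdin, writes the result to stdout.
fix_pg_ids() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^@PG/ {
         seen = 0
         for (i = 2; i <= NF; i++) {
           if ($i ~ /^ID:/) {
             if (seen) $i = "cl_" $i   # second or later ID: on this line
             seen = 1
           }
         }
       }
       { print }'
}

# Usage with the file names from this thread (requires samtools):
#   samtools view -H alignment.bam | fix_pg_ids > alignment.header
#   samtools reheader alignment.header alignment.bam > alignment.fixed.bam
```

It is worth looking at alignment.header before running the reheader step, to confirm that only the @PG line changed.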

  • jamjam Member

    I tested some of mine with this, and I think you actually have a problem with your options to bwa mem. The @PG line should read:

    @PG ID:bwa PN:bwa VN:0.7.10-r789 CL:bwa mem -R @RG\tID:X\tLB:Y\tSM:Z\tPL:ILLUMINA ref.fa R1_001_filtered.fastq R2_001_filtered.fastq

    Make sure you ran bwa mem with single quotes around the -R option, as in the help:
    -R '@RG\tID:foo\tSM:bar'
    and not double quotes, where the tab (\t) gets expanded.

    perhaps that's it?
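One way to check whether a given header is affected is to look for @PG lines that carry more than one tab-separated ID: field. This is a rough sketch, not a validator. The function name `find_clashing_pg` is invented here, and it only detects the specific clash reported in the error message.

```shell
# Hypothetical helper: print @PG lines that contain more than one
# tab-separated ID: field. Reads a SAM header on stdin and prints
# nothing when every @PG line has a single ID.
find_clashing_pg() {
  awk -F '\t' '/^@PG/ {
         n = 0
         for (i = 2; i <= NF; i++) if ($i ~ /^ID:/) n++
         if (n > 1) print
       }'
}

# Usage (requires samtools):
#   samtools view -H alignment.bam | find_clashing_pg
```

If this prints the @PG line, the tabs inside the CL: field are real tab characters rather than a literal backslash followed by t.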

  • I ran bwa 0.7.12 with single quotes around the -R option but still get the expanded tabs.
