IndelRealigner input file

I am a first-time user of GATK and have spent some time trying to get my input BAM files into the appropriate format. To run IndelRealigner, I have added read groups to my BAM file, reordered it, and indexed it with the respective Picard tools.

My command-line is the following:

java'pwd'/tmp -jar GenomeAnalysisTK.jar -I ./add_read_groups_reorder_index.bam -R ./genome.fa -T IndelRealigner -targetIntervals ./gatk.intervals -o ./*.bam -known ./Mills-1000G-indels.vcf --consensusDeterminationModel KNOWNS_ONLY -LOD 0.4

I get the following message:

SAM/BAM file /home/gp53/tophat2-merge-ctl-1st-2nd-readgroups-reorder-index.bam is malformed: SAM file doesn't have any read groups defined in the header.

My reads are paired-end and were aligned with TopHat2.
I would appreciate your help with this.

Best Answers


  • edited February 2013

    Hello Geraldine,
    Thanks for your help.

    I have checked my read groups and headers to make sure they look like the ones specified on the GATK website.
    I am now trying to run RealignerTargetCreator and I get the following error:

    ERROR MESSAGE: SAM/BAM file SAMFileReader{/home/gp53/tophat2-eber-2nd-R1-readgroups-reorder.bam} is malformed: Read HWI-ST830:129:D1459ACXX:8:1208:6666:45578 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK

    My header looks like this:

    @HD     VN:1.0  SO:coordinate
    @SQ     SN:chrM LN:16571        UR:file:/home/gp53/bwa/genome.fa        M5:d2ed829b8a1628d16cbeee88e88e39eb
    @SQ     SN:chr1 LN:249250621    UR:file:/home/gp53/bwa/genome.fa        M5:1b22b98cdeb4a9304cb5d48026a85128
    @SQ     SN:chr2 LN:243199373    UR:file:/home/gp53/bwa/genome.fa        M5:a0d9851da00400dec1098a9255ac712e
    @SQ     SN:chr3 LN:198022430    UR:file:/home/gp53/bwa/genome.fa        M5:641e4338fa8d52a5b781bd2a2c08d3c3
    @SQ     SN:chr4 LN:191154276    UR:file:/home/gp53/bwa/genome.fa        M5:23dccd106897542ad87d2765d28a19a1
    @SQ     SN:chr5 LN:180915260    UR:file:/home/gp53/bwa/genome.fa        M5:0740173db9ffd264d728f32784845cd7
    @SQ     SN:chr6 LN:171115067    UR:file:/home/gp53/bwa/genome.fa        M5:1d3a93a248d92a729ee764823acbbc6b
    @SQ     SN:chr7 LN:159138663    UR:file:/home/gp53/bwa/genome.fa        M5:618366e953d6aaad97dbe4777c29375e
    @SQ     SN:chr8 LN:146364022    UR:file:/home/gp53/bwa/genome.fa        M5:96f514a9929e410c6651697bded59aec
    @SQ     SN:chr9 LN:141213431    UR:file:/home/gp53/bwa/genome.fa        M5:3e273117f15e0a400f01055d9f393768
    @SQ     SN:chr10        LN:135534747    UR:file:/home/gp53/bwa/genome.fa        M5:988c28e000e84c26d552359af1ea2e1d
    @SQ     SN:chr11        LN:135006516    UR:file:/home/gp53/bwa/genome.fa        M5:98c59049a2df285c76ffb1c6db8f8b96
    @SQ     SN:chr12        LN:133851895    UR:file:/home/gp53/bwa/genome.fa        M5:51851ac0e1a115847ad36449b0015864
    @SQ     SN:chr13        LN:115169878    UR:file:/home/gp53/bwa/genome.fa        M5:283f8d7892baa81b510a015719ca7b0b
    @SQ     SN:chr14        LN:107349540    UR:file:/home/gp53/bwa/genome.fa        M5:98f3cae32b2a2e9524bc19813927542e
    @SQ     SN:chr15        LN:102531392    UR:file:/home/gp53/bwa/genome.fa        M5:e5645a794a8238215b2cd77acb95a078
    @SQ     SN:chr16        LN:90354753     UR:file:/home/gp53/bwa/genome.fa        M5:fc9b1a7b42b97a864f56b348b06095e6
    @SQ     SN:chr17        LN:81195210     UR:file:/home/gp53/bwa/genome.fa        M5:351f64d4f4f9ddd45b35336ad97aa6de
    @SQ     SN:chr18        LN:78077248     UR:file:/home/gp53/bwa/genome.fa        M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c
    @SQ     SN:chr19        LN:59128983     UR:file:/home/gp53/bwa/genome.fa        M5:1aacd71f30db8e561810913e0b72636d
    @SQ     SN:chr20        LN:63025520     UR:file:/home/gp53/bwa/genome.fa        M5:0dec9660ec1efaaf33281c0d5ea2560f
    @SQ     SN:chr21        LN:48129895     UR:file:/home/gp53/bwa/genome.fa        M5:2979a6085bfe28e3ad6f552f361ed74d
    @SQ     SN:chr22        LN:51304566     UR:file:/home/gp53/bwa/genome.fa        M5:a718acaa6135fdca8357d5bfe94211dd
    @SQ     SN:chrX LN:155270560    UR:file:/home/gp53/bwa/genome.fa        M5:7e0e2e580297b7764e31dbc80c2540dd
    @SQ     SN:chrY LN:59373566     UR:file:/home/gp53/bwa/genome.fa        M5:1e86411d73e6f00a10590f976be01623
    @RG     ID:null PL:illumina     PU:single_lane  LB:unstranded   SM:tophat-eber-2nd-R1
    @PG     ID:TopHat       VN:2.0.5        CL:/usr/local/bin/tophat2 -p 16 -g 1 -z pigz -G /home/gp53/tophat/genes.gtf --no-novel-juncs -o tophat-eber-2nd-R1 /home/administrator/Bowtie2Index/genome /media/Elements/Genaro/input/eber-2nd-R1.fastq

    Also, the GATK guide indicates that I need an indexed file, but GATK-2.3-9 won't accept indexed BAM files.
    I would appreciate your help on this.
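The error above names two conditions: a read with no read group, or a read whose read group is not declared in the header. A minimal sketch of that check on SAM text follows. It is not GATK's own validation code, the function name `read_group_problems` is hypothetical, and it assumes tab-separated SAM lines with the header first.

```python
def read_group_problems(sam_lines):
    """Collect @RG IDs from the header and list reads whose RG tag is
    missing or does not match any header ID.

    Assumes tab-separated SAM text with header lines before alignments.
    """
    header_ids = set()
    problems = []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: only @RG lines declare read group IDs.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        header_ids.add(field[3:])
            continue
        # Alignment line: optional tags start at the 12th column.
        read_group = None
        for field in fields[11:]:
            if field.startswith("RG:Z:"):
                read_group = field[5:]
                break
        if read_group is None:
            problems.append((fields[0], "no RG tag"))
        elif read_group not in header_ids:
            problems.append((fields[0], "RG '%s' not in header" % read_group))
    return header_ids, problems


# Made-up example records, not taken from the poster's file.
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "@RG\tID:1\tPL:illumina\tSM:sample",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:1",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr1\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:null",
]
ids, found = read_group_problems(example)
print(sorted(ids))
for name, reason in found:
    print(name, reason)
```

On a real file, the lines could come from the output of `samtools view -h`, assuming samtools is installed. The sketch only reports mismatches; it does not say which side (header or reads) is the one to fix.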

  • Hi Geraldine,
    I got my issue fixed, I think.
    I now have RealignerTargetCreator running on both the BWA and TopHat2 alignments. The one thing I changed was to leave the ID=string option at its default (1) in AddOrReplaceReadGroups.jar.
    That pretty much eliminated the recurring error: Read HWI-ST830:129:D1459ACXX:8:1208:6666:45578 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK.
    The other problem was that I did not realize GATK looks for the bam.bai index file itself when given a BAM file as input.
    I am not a bioinformatician by training, so this was not obvious to me.
    Thanks again for your help.
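To make the index point concrete: only the BAM path goes on the command line, and the index is expected to sit next to it. Below is a minimal sketch of a pre-flight check. The helper name `find_bam_index` is hypothetical, and the two candidate names (`file.bam.bai` and `file.bai`) are an assumption based on common samtools and Picard naming, not a statement of GATK's exact lookup logic.

```python
import os


def find_bam_index(bam_path):
    """Return the path of an index file found next to bam_path, or None.

    Assumed naming conventions: 'x.bam.bai' and 'x.bai'.
    """
    candidates = [bam_path + ".bai"]
    root, ext = os.path.splitext(bam_path)
    if ext == ".bam":
        candidates.append(root + ".bai")
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


# Example call with a placeholder path; prints None if no index is present.
print(find_bam_index("./example.bam"))
```

If this returns `None`, the BAM most likely still needs indexing (or the index was written to a different directory) before it is passed with `-I`.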