
Gathered BAM could not be indexed

I ran IndelRealigner on the same BAM 32 times, each with a -L option that specifies an interval, then gathered the output BAM files using Picard GatherBamFiles (I found that the BAM gather function is a wrapper around Picard GatherBamFiles). This gave me one large BAM file, but when I tried to index it I got this error:

[E::bgzf_read] bgzf_read_block error -1 after 147 of 254 bytes
samtools index: "gathered.bam" is corrupted or unsorted

I don't know why this happens. Could anyone help me? Thanks a lot.
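Since the error message says the file is either corrupted or unsorted, one way to narrow it down is to look at those two possibilities separately. The following is a minimal sketch, not a confirmed diagnosis of this case: `sort_order_from_header` is a hypothetical helper name, and it only reports the sort order that the SAM header declares on its `@HD` line, which may differ from the actual order of the records. The usage lines assume `samtools` is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a SAM header on stdin and print the sort order
# declared in the SO tag of the @HD line. Prints "missing" when there is
# no @HD line or the line carries no SO tag.
sort_order_from_header() {
    awk -F'\t' '
        /^@HD/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
            }
        }
        END { if (!found) print "missing" }
    '
}

# Usage (gathered.bam is the file name from the error message above):
#   samtools quickcheck gathered.bam && echo "header and EOF block look intact"
#   samtools view -H gathered.bam | sort_order_from_header
```

If `samtools quickcheck` fails, the file is likely truncated or damaged. If it passes and the declared order is `coordinate`, the order in which the pieces were gathered is the next thing to check.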

Best Answers


  • yaohu (beijing, Member)

    @Sheila said:

    Can you please post the exact commands you ran? There is an option --nWayOut that allows you to merge all the BAM files into one in the output of IndelRealigner. You can try using that instead of GatherBamFiles.


    Hi Sheila,

    Thanks for the quick reply. The problem I reported is solved by using the latest Picard, but there is another issue. To save time, I ran the following command 32 times in parallel, each with a small interval; together the intervals cover the entire reference.

    $JAVA -d64 -jar $GATK \
    -T IndelRealigner \
    -R $ref \
    -I $input \
    -L $interval \
    $known_string \
    -targetIntervals $target_interval \
    -o $output

    After the runs finished I gathered the BAMs, hoping the output BAM would be identical to the original single output BAM produced without an interval specified. But when I compared them using bam diff, there were big differences. Is this normal? How can I distribute the computation without changing the result?
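One thing worth ruling out before comparing the two files is the order of the gather step. GatherBamFiles concatenates its inputs in the order they are given, and a shell glob such as `part_*.bam` sorts names lexicographically, so `part_10.bam` would come before `part_2.bam`. The sketch below builds the input arguments in numeric order. It is an illustration under assumptions, not an explanation of the differences reported above: the `<prefix>_<n>.bam` naming, the helper name `gather_inputs`, and the `$PICARD` variable are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print Picard-style "I=<file>" arguments for
# per-interval BAMs named <prefix>_1.bam ... <prefix>_<count>.bam,
# in numeric (not lexicographic) order.
gather_inputs() {
    local prefix=$1 count=$2 i
    local args=()
    for ((i = 1; i <= count; i++)); do
        args+=("I=${prefix}_${i}.bam")
    done
    echo "${args[*]}"
}

# Usage ($PICARD is assumed to point at the Picard jar; the substitution is
# left unquoted on purpose so each I=... becomes its own argument):
#   java -jar "$PICARD" GatherBamFiles $(gather_inputs realigned 32) O=gathered.bam
#   samtools index gathered.bam
```

The file numbering has to follow the order of the intervals along the reference for the concatenated result to stay coordinate-sorted.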
