HaplotypeCaller Error: SAM/BAM/CRAM Invalid GZIP header

SV_Wrangler Member
edited April 2018 in Ask the GATK team

These are my GATK (3.5-0-g36282e4) program arguments:

-T HaplotypeCaller 
-R human_g1k_v37.22.fasta
-nct 16
-I ref.22.500x.bwamem.sorted.bqsr.bam
-I somatic_sim_af20_500x.bwamem.bqsr.bam
-I somatic_sim_het_500x.bwamem.sorted.bqsr.bam
-D All_20170403.vcf
-L 22
-o somatic_sim.hpcaller.22.vcf

This is the error message:

##### ERROR MESSAGE: SAM/BAM/CRAM file somatic_sim_het_500x.bwamem.sorted.bqsr.bam is malformed. .... Error details: Invalid GZIP header

The command worked fine the first time I ran it. However, I goofed and used the same read group ID for both the het and af20 BAM files.

I ran

samtools addreplacerg ...
samtools view -H new_rg.bam >header.txt

Then I manually removed the old read group from header.txt, since I've noticed in the past that GATK will emit null genotypes for that sample:

samtools reheader -P header.txt new_rg.bam >het.bam
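The manual header edit described above could also be scripted. The following is only a sketch: `drop_rg` is a hypothetical helper name, `OLD_RG_ID` is a placeholder for whichever read group ID should be removed, and `header.fixed.txt` is an assumed output name. It filters the text header and does not touch the reads themselves.

```shell
# drop_rg HEADER_FILE RG_ID
# Print the SAM header with the @RG line whose ID field equals RG_ID removed.
# All other lines (including other @RG lines and @PG lines) pass through unchanged.
drop_rg() {
    awk -F'\t' -v id="$2" '
        $1 == "@RG" {
            # Skip this line if any tab-separated field is exactly ID:<id>
            for (i = 2; i <= NF; i++) if ($i == "ID:" id) next
        }
        { print }
    ' "$1"
}

# Usage (OLD_RG_ID is a placeholder, not a value from this thread):
# drop_rg header.txt OLD_RG_ID > header.fixed.txt
```

Matching on the exact `ID:` field, rather than a plain text search, avoids accidentally removing a line whose other tags happen to contain the same string.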

I reindexed it from scratch, and now I get this error.

I've used addreplacerg and HaplotypeCaller successfully in the past. However, I never removed the older read groups.

samtools quickcheck -v het.bam 

This returns no error, so I'm at a loss here. Do BAM files typically have GZIP headers?

Thanks for any help or insight.
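On the GZIP question: BAM files are compressed with BGZF, a gzip-compatible block format, so every BGZF block starts with the gzip magic bytes `1f 8b`, followed by `08` (deflate) and `04` (extra-field flag). A quick way to inspect this by hand is sketched below. The helper names `bgzf_magic` and `is_bgzf` are made up for illustration, and the check only looks at the first four bytes, so it cannot prove that the whole file is intact.

```shell
# Print the first four bytes of a file as lowercase hex with no separators.
bgzf_magic() {
    head -c 4 "$1" | od -An -tx1 | tr -d ' \n'
}

# Succeed (exit status 0) if the file starts with the BGZF/gzip magic 1f 8b 08 04.
is_bgzf() {
    [ "$(bgzf_magic "$1")" = "1f8b0804" ]
}

# Usage (het.bam is the file name from the post above):
# is_bgzf het.bam && echo "starts with a BGZF header" || echo "no BGZF header"
```

If the first bytes are anything else, the file that GATK is opening is probably not a BGZF-compressed BAM, which would be consistent with an "Invalid GZIP header" message.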
