HaplotypeCaller Error: SAM/BAM/CRAM Invalid GZIP header

SV_Wrangler (Member)
edited April 2018 in Ask the GATK team

These are my GATK (3.5-0-g36282e4) program arguments:

-T HaplotypeCaller 
-R human_g1k_v37.22.fasta
-nct 16
-I ref.22.500x.bwamem.sorted.bqsr.bam
-I somatic_sim_af20_500x.bwamem.bqsr.bam
-I somatic_sim_het_500x.bwamem.sorted.bqsr.bam
-D All_20170403.vcf
-L 22
-o somatic_sim.hpcaller.22.vcf

This is the error message:

##### ERROR MESSAGE: SAM/BAM/CRAM file somatic_sim_het_500x.bwamem.sorted.bqsr.bam is malformed. .... Error details: Invalid GZIP header

The command worked fine the first time I ran it. However, I goofed and used the same ID in the read group for both the het and af20 BAM files.

I ran

samtools addreplacerg ...
samtools view -H new_rg.bam >header.txt

Then I manually removed the old read group from header.txt, since I've noticed in the past that GATK will otherwise emit null genotypes for that sample.

samtools reheader -P header.txt new_rg.bam >het.bam
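
The manual header edit described above could also be scripted. This is only a sketch: drop_rg is a hypothetical helper name, OLD_ID stands in for the read-group ID that was removed (the post does not give it), and header.fixed.txt is an illustrative output name. It assumes a POSIX shell with awk and a tab-separated SAM header as written by samtools view -H.

```shell
# drop_rg OLD_ID HEADER_FILE
# Prints the header with any @RG line whose ID field equals OLD_ID removed.
# OLD_ID is a placeholder; the post does not say which ID was removed.
drop_rg() {
  awk -F'\t' -v tag="ID:$1" '
    $1 == "@RG" {
      # skip this line if any of its fields is the matching ID tag
      for (i = 2; i <= NF; i++) if ($i == tag) next
    }
    { print }   # keep every other header line unchanged
  ' "$2"
}

# Example (header.fixed.txt is a hypothetical file name):
# drop_rg OLD_ID header.txt > header.fixed.txt
```

Matching on the whole ID field, rather than grepping for a substring, avoids dropping a read group whose ID merely contains the old one.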

I reindexed it from scratch, and now I get this error.

I've used addreplacerg and HaplotypeCaller successfully in the past. However, I never removed the older read groups.

samtools quickcheck -v het.bam 

Returns no error, so I'm at a loss here. Do BAM files typically have GZIP headers?

Thanks for any help or insight.
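
On the GZIP question: per the SAM/BAM specification, BAM files are compressed with BGZF, a gzip-compatible block format, so every block (including the first) begins with a gzip header whose first four bytes are 1f 8b 08 04. A rough way to see whether a file at least starts with such a header is sketched below. It assumes a POSIX shell with od and tr; is_bgzf is a hypothetical helper name, not a samtools command, and it inspects only the start of the file, not every block.

```shell
# is_bgzf FILE
# Succeeds (exit status 0) if FILE starts with the BGZF/gzip block header
# bytes 1f 8b 08 04 (gzip magic, deflate method, FEXTRA flag set).
is_bgzf() {
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')   # first 4 bytes as hex
  [ "$magic" = "1f8b0804" ]
}

# Example, using the file name from the post:
# is_bgzf het.bam && echo "starts with a BGZF header" || echo "no BGZF header"
```

A file that fails this check is not BGZF-compressed at its first byte, which is one situation in which a reader could report an invalid GZIP header. Passing it does not prove the rest of the file is intact.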
