HaplotypeCaller Error: SAM/BAM/CRAM Invalid GZIP header
These are my GATK (3.5-0-g36282e4) arguments:
-T HaplotypeCaller -R human_g1k_v37.22.fasta -nct 16 -I ref.22.500x.bwamem.sorted.bqsr.bam -I somatic_sim_af20_500x.bwamem.bqsr.bam -I somatic_sim_het_500x.bwamem.sorted.bqsr.bam -D All_20170403.vcf -L 22 -o somatic_sim.hpcaller.22.vcf
This is the error message:
##### ERROR MESSAGE: SAM/BAM/CRAM file somatic_sim_het_500x.bwamem.sorted.bqsr.bam is malformed. .... Error details: Invalid GZIP header
The command worked fine the first time I ran it. However, I goofed and used the same ID in the read group for the af20 BAM file, so I fixed it with addreplacerg and dumped the new header:
samtools addreplacerg ...
samtools view -H new_rg.bam >header.txt
Then I manually removed the old read group, since I've noticed in the past that GATK will omit null genotypes for that sample, and applied the edited header:
samtools reheader -P header.txt new_rg.bam >het.bam
I reindexed it from scratch, and now I get this error.
I've used addreplacerg and HaplotypeCaller successfully in the past. However, I never removed the older read groups before.
samtools quickcheck -v het.bam
Returns no error, so I'm at a loss here. Do BAM files typically have GZIP headers?
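For context on that last question: BAM is stored as BGZF, a blocked, gzip-compatible format, so a well-formed BAM file begins with the gzip magic bytes 1f 8b followed by 08 04. A minimal sketch for checking this by hand, assuming a POSIX shell with od and tr available; the helper names first4_hex and looks_like_bgzf are hypothetical, not part of samtools or GATK:

```shell
# Hypothetical helper: print the first four bytes of a file as lowercase hex.
first4_hex() {
    od -An -tx1 -N 4 "$1" | tr -d ' \n'
}

# Hypothetical helper: succeed only if the file starts with the BGZF magic
# (gzip ID bytes 1f 8b, deflate method 08, FEXTRA flag 04).
looks_like_bgzf() {
    [ "$(first4_hex "$1")" = "1f8b0804" ]
}

# Example usage on the reheadered file from this post:
# looks_like_bgzf het.bam && echo "BGZF magic present" || echo "no BGZF magic at byte 0"
```

This only inspects the start of the first block. A damaged block header further into the file would not be detected by it.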
Thanks for any help or insight.