HaplotypeCaller Error: SAM/BAM/CRAM Invalid GZIP header
These are my GATK (3.5-0-g36282e4) arguments:
-T HaplotypeCaller -R human_g1k_v37.22.fasta -nct 16 -I ref.22.500x.bwamem.sorted.bqsr.bam -I somatic_sim_af20_500x.bwamem.bqsr.bam -I somatic_sim_het_500x.bwamem.sorted.bqsr.bam -D All_20170403.vcf -L 22 -o somatic_sim.hpcaller.22.vcf
This is the error message:
##### ERROR MESSAGE: SAM/BAM/CRAM file somatic_sim_het_500x.bwamem.sorted.bqsr.bam is malformed. .... Error details: Invalid GZIP header
The command worked fine the first time I ran it. However, I goofed and used the same ID in the read group for the af20 BAM file.
samtools addreplacerg ...
samtools view -H new_rg.bam >header.txt
Then I manually removed the old read group from header.txt, since I've noticed in the past that GATK will emit null genotypes for that sample.
samtools reheader -P header.txt new_rg.bam >het.bam
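For anyone trying to reproduce this, the manual header edit could be scripted roughly as below. This is only a sketch of the step, not exactly what I ran: `drop_read_group` is a hypothetical helper name, and it assumes the header lines are tab-separated with the read group ID in an `ID:` field.

```shell
# Drop the @RG header line whose ID equals $1; pass every other line through.
# Reads a SAM header on stdin and writes the filtered header to stdout.
# (drop_read_group is a hypothetical helper, not part of samtools.)
drop_read_group() {
    awk -F '\t' -v id="$1" '
        $1 == "@RG" {
            for (i = 2; i <= NF; i++)
                if ($i == "ID:" id) next   # skip the unwanted read group
        }
        { print }
    '
}
```

It would slot into the steps above as `samtools view -H new_rg.bam | drop_read_group OLD_ID >header.txt`, where `OLD_ID` is a placeholder for the read group ID being removed.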
I reindexed it from scratch, and now I get this error.
I've used addreplacerg and HaplotypeCaller successfully in the past. However, I never removed the older read groups before.
samtools quickcheck -v het.bam
It returns no error, so I'm at a loss here. Do BAM files typically have GZIP headers?
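As far as I understand, BAM is BGZF-compressed, which is a gzip-compatible block format, so a valid BAM should start with the gzip magic bytes `1f 8b`. A minimal sketch for checking that by hand is below. `has_gzip_magic` is a hypothetical helper name, and it only inspects the first two bytes of the file, so it says nothing about later blocks or about the index.

```shell
# Succeeds (exit status 0) if the file given as $1 starts with the
# gzip magic bytes 1f 8b, and fails otherwise.
# (has_gzip_magic is a hypothetical helper, not part of samtools.)
has_gzip_magic() {
    # Dump the first two bytes as hex, then strip spaces and newlines.
    magic="$(od -An -tx1 -N2 "$1" | tr -d ' \n')"
    [ "$magic" = "1f8b" ]
}
```

For example: `has_gzip_magic het.bam && echo "gzip magic present"`.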
Thanks for any help or insight.