As discussed in a blog post, GATK4 removes the realignment step:
As announced in the GATK v3.6 highlights, variant calling workflows that use HaplotypeCaller or MuTect2 now omit indel realignment. This change does not apply to workflows that call variants with UnifiedGenotyper or the original MuTect. We still recommend indel realignment for these legacy workflows.
I understand that HaplotypeCaller and MuTect2 do their own internal realignment, but I would like to examine the BAMs manually or feed them to other variant callers. It's nice to have the cleanest possible version. Technically, HaplotypeCaller can output a realigned BAM, but as the documentation states:
The assembled haplotypes and locally realigned reads will be written as BAM to this file if requested. Really for debugging purposes only. Note that the output here does not include uninformative reads so that not every input read is emitted to the bam.
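For reference, here is a minimal sketch of how I would request that BAM from GATK4, written as a small Python helper that only builds the command line. The file names are hypothetical placeholders, and I am assuming the standard `gatk` wrapper is on the PATH; `-bamout` is the short form of the HaplotypeCaller argument described in the documentation quoted above.

```python
import shlex


def haplotypecaller_bamout_cmd(reference, input_bam, output_vcf, bamout):
    """Build a GATK4 HaplotypeCaller command that also writes the
    assembled haplotypes and locally realigned reads to a BAM file."""
    return [
        "gatk", "HaplotypeCaller",
        "-R", reference,    # reference FASTA
        "-I", input_bam,    # input reads
        "-O", output_vcf,   # variant calls
        "-bamout", bamout,  # realigned reads (debugging output, not every read is emitted)
    ]


# Hypothetical file names, for illustration only.
cmd = haplotypecaller_bamout_cmd(
    "ref.fasta", "sample.bam", "sample.vcf.gz", "sample.bamout.bam"
)
print(shlex.join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` to actually run the tool. Since that output leaves out uninformative reads, it is not a drop-in replacement for a fully realigned BAM, which is why I am asking.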
Is there a recommended approach going forward? I am guessing you may have had an internal discussion about this. Should I keep running the realignment step in GATK3 and move the other steps to GATK4? That seems terribly inelegant and will probably start causing issues eventually.