We're looking to call variants with GATK across a collection of bacterial genomes. However, there are several highly variable regions that we don't want to analyse, such as phage and transposons. At which step of the GATK variant-calling pipeline would you recommend masking these regions? I've seen several discussions online where people say you should not mask the reference genome before aligning with bwa, because this can inadvertently cause reads to align to a region they otherwise wouldn't. So is it possible to mask specific regions in the BAM files after alignment and before variant calling?
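To make concrete what we mean by masking after alignment rather than editing the reference, here is a minimal sketch in Python. It assumes the regions to exclude are known as BED-style (0-based, half-open) coordinates on a single contig, and it computes the complementary intervals that we would still want analysed. The function name `keep_intervals`, the contig length, and the coordinates are all hypothetical and only for illustration; this is not a GATK feature, just a picture of the interval arithmetic involved.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end), as in BED


def keep_intervals(contig_length: int, masked: List[Interval]) -> List[Interval]:
    """Return the parts of a contig that are NOT covered by any masked interval."""
    kept: List[Interval] = []
    cursor = 0  # start of the next region that might be kept
    for start, end in sorted(masked):
        # Clamp each masked interval to the contig boundaries.
        start = min(max(start, 0), contig_length)
        end = min(max(end, 0), contig_length)
        if start > cursor:
            kept.append((cursor, start))
        # Overlapping or nested masked intervals only ever move the cursor forward.
        cursor = max(cursor, end)
    if cursor < contig_length:
        kept.append((cursor, contig_length))
    return kept


if __name__ == "__main__":
    # Hypothetical example: a 1000 bp contig with a prophage and a transposon masked.
    masked_regions = [(100, 200), (150, 300), (900, 1000)]
    for start, end in keep_intervals(1000, masked_regions):
        print(f"contig1\t{start}\t{end}")
```

The resulting intervals are the regions in which we would want variants called or retained; the reference used for alignment is left untouched.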