Pooled sequencing: indel realignment creates different results in different pools
In the attached IGV screenshot, there are 6 BAM panels. The top three (wider) panels are three pools after indel realignment. The bottom three (thinner) panels are the same three pools just before indel realignment (but after deduplication).
I'm looking for SNPs that show large allele frequency differences between these three pools. The pools are at about 20x depth and contain 20-40 individuals.
As you can see, in the realigned pools this looks like a very promising SNP on paper: the High and Ref pools have the reference A allele, and the Low pool is nearly fixed for a T allele at that position. However, the bottom three panels make it clear that this is a false positive caused by indel realignment behaving differently among the three pools.
I examined 38 highly differentiated SNPs by eye in IGV, and 13 of them (about a third) were clear false positives caused by indel realignment. Has this been observed before, as far as you know? Is there any a priori reason to avoid performing indel realignment on pooled sequencing data?
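
For anyone who wants to screen for this without checking every site in IGV, here is a rough sketch of the comparison I was doing by eye. It assumes you already have per-pool allele counts at a site from both the pre-realignment and post-realignment BAMs. The counts below are made up for illustration, and the function names and thresholds (`min_diff`, `max_before`) are my own arbitrary choices, not anything from GATK.

```python
def allele_freq(counts, allele):
    """Frequency of `allele` given a dict of allele -> read count."""
    total = sum(counts.values())
    return counts.get(allele, 0) / total if total else 0.0


def max_freq_diff(pool_counts, allele):
    """Largest pairwise difference in allele frequency across pools."""
    freqs = [allele_freq(c, allele) for c in pool_counts.values()]
    return max(freqs) - min(freqs)


def looks_like_realignment_artifact(before, after, allele,
                                    min_diff=0.8, max_before=0.3):
    """Flag a site whose differentiation appears only after realignment.

    The thresholds are arbitrary assumptions; tune them for your data.
    """
    return (max_freq_diff(after, allele) >= min_diff
            and max_freq_diff(before, allele) <= max_before)


# Hypothetical counts at one site (about 20 reads per pool).
before = {
    "High": {"A": 20},
    "Ref": {"A": 19, "T": 1},
    "Low": {"A": 18, "T": 2},
}
after = {
    "High": {"A": 20},
    "Ref": {"A": 20},
    "Low": {"A": 1, "T": 19},
}

flagged = looks_like_realignment_artifact(before, after, "T")
print(flagged)
```

A site that is flagged this way is strongly differentiated only in the realigned data, which is the pattern in the screenshot. It does not say which alignment is correct, so flagged sites still need a look.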