Calling whole-genome haplotypes for chloroplast-captured pooled samples
I'm trying to call whole-chloroplast-genome haplotypes for a pooled, chloroplast-captured DNA sample from a non-model organism (no well-established variants). The reads are Illumina 100 bp paired-end reads that have already been clipped (adapter trimming and quality control) and aligned to a reference genome. The pool represents 20 individuals. Is there a way in GATK to estimate the frequency of whole-genome haplotypes in the pool, or, failing that, does another existing tool do this? If necessary, I can generate a panel of known haplotypes.
Currently, I am using HaplotypeCaller to call SNPs and then filtering them by hand in Excel. I have already tried increasing the maximum active region size to more than the length of the whole reference genome (~150,000 bp), with a corresponding increase in the maximum reads per sample, but this does not seem to produce whole-genome haplotypes.
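For context, the hand filtering I do in Excel could probably be scripted. Below is a minimal sketch of that step, not my actual workflow: it reads VCF lines and keeps biallelic SNPs above a quality and depth cutoff. The function name `filter_snps` and the thresholds `min_qual=30.0` and `min_depth=10` are assumptions chosen for illustration, and it relies on the `DP` tag being present in the INFO column.

```python
def filter_snps(vcf_lines, min_qual=30.0, min_depth=10):
    """Return biallelic SNP records that pass the QUAL and depth cutoffs.

    Thresholds are illustrative assumptions, not recommended values.
    Each result is a tuple: (chrom, pos, ref, alt, qual, depth).
    """
    kept = []
    for line in vcf_lines:
        # Skip meta-information and header lines.
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt, qual, _filter, info = fields[:8]

        # Keep only single-base substitutions with one alternate allele.
        if len(ref) != 1 or len(alt) != 1:
            continue
        if ref not in "ACGT" or alt not in "ACGT":
            continue

        # A missing QUAL is written as "." in VCF; skip such records.
        if qual == ".":
            continue
        qual_value = float(qual)

        # Parse key=value pairs from INFO; flag-only entries are ignored.
        info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
        depth = int(info_map.get("DP", 0))

        if qual_value >= min_qual and depth >= min_depth:
            kept.append((chrom, int(pos), ref, alt, qual_value, depth))
    return kept
```

It can be applied to an uncompressed VCF by passing an open file handle, since the function only iterates over lines. This only reproduces the per-site filtering; it does not phase sites into whole-genome haplotypes.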