Using GATK on Arabidopsis data after EMS mutagenesis
I have been using GATK for a while, but until now I have only analyzed human samples, and the current analysis is of Arabidopsis thaliana data. Since I do not have databases of known indels and SNPs for this species, I am following the suggested workflow without known sites.
I have only two samples in the analysis: a wild-type (WT) parental strain and a pool of 50 plants that underwent EMS mutagenesis. The treatment induces a large number of mutations, and each of the 50 plants in the pool can carry different variants. The goal of the experiment is to find the one strong homozygous mutation that is common to the mutated plants and absent from the parental strain.
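To make the goal concrete, this is roughly the filter I have in mind once the variants are called. It is only a sketch: the `Site` record, the function names and every threshold are my own placeholders, not anything from GATK, and I am assuming the reference and alternate read depths would be taken from the per-sample allele depths (AD) in the VCF.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record: one biallelic variant site with per-sample read depths."""
    chrom: str
    pos: int
    ref: str
    alt: str
    wt_ref_depth: int
    wt_alt_depth: int
    pool_ref_depth: int
    pool_alt_depth: int


def alt_fraction(ref_depth, alt_depth):
    """Fraction of reads supporting the alternate allele (0.0 when there is no coverage)."""
    total = ref_depth + alt_depth
    return alt_depth / total if total else 0.0


def candidate_sites(sites, min_depth=10, min_pool_fraction=0.9, max_wt_fraction=0.05):
    """Keep sites that look fixed in the mutant pool and absent in the WT parent.

    All three thresholds are illustrative guesses, not recommended values.
    """
    kept = []
    for s in sites:
        # Skip sites where either sample has too little coverage to judge.
        if s.wt_ref_depth + s.wt_alt_depth < min_depth:
            continue
        if s.pool_ref_depth + s.pool_alt_depth < min_depth:
            continue
        pool_frac = alt_fraction(s.pool_ref_depth, s.pool_alt_depth)
        wt_frac = alt_fraction(s.wt_ref_depth, s.wt_alt_depth)
        if pool_frac >= min_pool_fraction and wt_frac <= max_wt_fraction:
            kept.append(s)
    return kept


# Made-up example sites, for illustration only.
example_sites = [
    Site("1", 1000, "G", "A", 30, 0, 1, 39),   # near-fixed in pool, absent in WT
    Site("2", 2000, "C", "T", 28, 0, 30, 10),  # low frequency in pool
    Site("3", 3000, "G", "A", 0, 25, 0, 40),   # also present in WT (background variant)
    Site("4", 4000, "C", "T", 3, 0, 0, 4),     # too little coverage in both samples
]

for site in candidate_sites(example_sites):
    print(site.chrom, site.pos, site.ref, site.alt)
```

With the made-up sites above, only the first one passes. Background variants shared with the parent and the many low-frequency mutations private to individual plants in the pool are both dropped.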
Since this data is somewhat different from anything I have worked with before, I would like to know whether the standard workflow (running indel realignment without known sites, running HC, filtering for high-confidence SNPs, then running BQSR and HC again) is also recommended in this case. If so, should I apply different cutoffs to obtain the high-confidence SNP set? And should I use the variants found in both samples to create the high-confidence SNP file? I ask because the mutagenized sample will contain many mutations, across a wide range of frequencies, that are not present in the parental strain at all.
Thank you very much,