If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
RNA-seq variant calling and merging of sample replicates
Hello, and thanks for making all the GATK tools! I have recently started to try my hand at variant calling of my RNA-seq data, following the GATK Best Practices more or less verbatim, only excluding indel alignment (because I am only interested in SNPs at this point) and the BQSR (partly because I have very high quality data, but mostly because I couldn't get it to work in the workflow).
I have three replicates for each of my samples, and my question is where, if at all, I should merge the data from them. I am not sure if I can (or even should!) merge the FASTQ files before the alignment step, or merge the aligned BAM files, or something else. I read that for aligners such as BWA the options are (more or less) equivalent, but seeing as the RNA-seq Best Practice workflow using STAR... What would be the "correct" way to do it, if at all? How would merging (at some level) affect the speed of the workflow, and can I optimise that somehow?
If it's a bad idea to do merging, how would I determine the "true" variant from my three resulting VCF-files at the end, for cases where they differ?