Regarding ploidy in HaplotypeCaller for multiple replicates of pooled RNAseq
I am a little confused about the best practices for running HaplotypeCaller to call variants given the pooled nature of my study. Any feedback is super appreciated!
I have 10 replicates of pooled RNAseq data for each of two samples (10 replicates for Sample A, 10 replicates for Sample B). By pooled I mean each replicate contains mRNA from 20 individuals, all mixed together with no barcoding (population genetics study).
I had planned to just merge the BAM files of these replicates, which have RGSMs of SampleA and SampleB, and simply run HaplotypeCaller once for Sample A and once for Sample B. However, each merged sample would then represent 10 x 20 = 200 individuals, so I would have to set ploidy = 2 x 200. This seems very high!
Would it be better to run HaplotypeCaller on each replicate separately, without merging the BAM files, setting ploidy = 2 x 20, and then use a tool such as CombineVariants to combine my VCF files into two samples for downstream comparisons?
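To make the two options concrete, here is a minimal sketch of the ploidy arithmetic, assuming a diploid organism and the pool sizes described above. The function name `pooled_ploidy` and its parameters are hypothetical and only for illustration; the result is the number I would pass as the sample ploidy to HaplotypeCaller in each case.

```python
def pooled_ploidy(individuals_per_pool: int,
                  pools_merged: int = 1,
                  organism_ploidy: int = 2) -> int:
    """Chromosome copies represented by one called sample.

    Assumes every individual contributes `organism_ploidy` copies and
    that merged pools contain distinct individuals (assumption).
    """
    return organism_ploidy * individuals_per_pool * pools_merged


# Option 1: merge all 10 replicates of a sample into one BAM.
merged_ploidy = pooled_ploidy(individuals_per_pool=20, pools_merged=10)

# Option 2: call each replicate (one pool of 20 individuals) separately.
per_replicate_ploidy = pooled_ploidy(individuals_per_pool=20)

print(f"merged: {merged_ploidy}, per replicate: {per_replicate_ploidy}")
```

So the choice is between one call per sample at ploidy 400 and ten calls per sample at ploidy 40.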