Which intermediate files should I keep, and which can I delete?
I am running an RNA-seq project, and per the Best Practices I have at least the following intermediate files:
1. BAMs after MarkDuplicates,
2. The output of SplitNCigarReads using (1) as input,
3. BQSR tables, one per sample per pass, and
4. The BAM file after applying the second-pass BQSR table.
My institution's storage is extremely limited, so I need to delete some of these files. Which of them do you think I should keep? Personally, I would keep at least the last one (4), since both HaplotypeCaller and MuTect2 take it as input, but what else?
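
To put numbers on the trade-off, here is a minimal Python sketch that tallies how much disk space each of the four file types takes. The filename suffixes (`.dedup.bam`, `.split.bam`, `.recal.table`, `.recal.bam`) and the function name `tally_by_stage` are hypothetical placeholders, not anything GATK produces by default, so they would need to be adjusted to the actual naming scheme.

```python
from pathlib import Path

# Hypothetical filename suffixes (assumption): change these to match
# however your pipeline actually names its outputs.
STAGE_SUFFIXES = {
    ".dedup.bam": "MarkDuplicates BAM",
    ".split.bam": "SplitNCigarReads BAM",
    ".recal.table": "BQSR table",
    ".recal.bam": "Recalibrated BAM",
}


def tally_by_stage(root, stage_suffixes=STAGE_SUFFIXES):
    """Return the total size in bytes of the files under `root`, per stage."""
    totals = {stage: 0 for stage in stage_suffixes.values()}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Count each file once, under the first suffix it matches.
        for suffix, stage in stage_suffixes.items():
            if path.name.endswith(suffix):
                totals[stage] += path.stat().st_size
                break
    return totals
```

Calling `tally_by_stage("/path/to/project")` returns a dictionary of byte counts per file type, which should show at a glance how much space deleting (1) and (2) would free up compared with the small BQSR tables in (3).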