Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Parameters for running GenomicsDB import
I have a system with about 8GB RAM. I've run HaplotypeCaller (-ERC GVCF) on specific genes of my interest using a .list file and have 109 **.g.vcf.gzs **of about 5-10 GB each. What would be the most optimal way to run GenomicsDBImport on these samples for Joint Calling ? Will I need to further subset these files into specific intervals or set a batch size ?
GATK version - 4.0.11, Java version-1.8
Optimal = Avoid errors, Maximise input samples, minimise computational load and minimise time in that order.