Parameters for running GenomicsDB import
I have a system with about 8 GB of RAM. I've run HaplotypeCaller (-ERC GVCF) on specific genes of interest using a .list file, and I now have 109 **.g.vcf.gz** files of about 5-10 GB each. What would be the best way to run GenomicsDBImport on these samples for joint calling? Will I need to further subset these files into specific intervals, or set a batch size?
GATK version: 4.0.11; Java version: 1.8
Optimal = avoid errors, maximise input samples, minimise computational load, and minimise time, in that order.