Reducing memory footprint for GenotypeGVCFs (gatk 126.96.36.199). TMP_DIR an issue?
I'm running GenotypeGVCFs 188.8.131.52 on a SLURM cluster, and I'm having great difficulty determining how much memory is needed -- I constantly run out. If someone could provide a predictor for the memory footprint, that would be useful. Moderate amounts such as 120GB or 140GB of RAM are insufficient, even though my jobs' working sets should be in the hundreds-of-MBs ballpark. Asking for more RAM causes long job scheduling delays.
My setup: Following GATK best practices, I first run HaplotypeCaller in GVCF mode for each sample, then import my ~2400 samples in batches of 200 into a GenomicsDB over a 1 Mbp region. The last step, GenotypeGVCFs, just blows up in RAM usage. I'm working with sunflower DNA (>3 Gbp genome), diploid. The data has been aligned from paired-end Illumina sequencing, filtered, markdup'd, and sorted, at 5x coverage on average. It's naturally messy, however.
Things I've tried to reduce the memory footprint:
- I've tried limiting the Java heap size with -Xmx to 4GB less than my allocation limit, e.g. if I ask for a 140GB job, I'll give 136GB to Java. I figured that would be a very conservative buffer to take OOMKILL out of the picture.
- Reducing the working set, i.e. splitting the region of interest of each unit of work into progressively finer intervals. I'm down to 1 Mbp regions now, which is already very inconvenient.
- I'm not even using any of the -nt options. Just using the default single data processing thread.
- I haven't tried --use-new-qual yet, but I plan to (and I'll report back).
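For concreteness, the heap-size buffer and the interval splitting described in the list above can be sketched as below. This is only an illustration of the arithmetic, not part of my actual pipeline: the function names, the contig name chr01, and the 2.5 Mbp region are made up for the example, and the 4GB headroom is my own guess rather than a documented GATK requirement.

```python
from typing import List


def heap_size_gb(job_mem_gb: int, headroom_gb: int = 4) -> int:
    """Return the heap size (in GB) to pass to Java via -Xmx.

    headroom_gb is memory left for everything outside the Java heap
    (JVM overhead, native libraries, and so on). 4 is an assumed value.
    """
    if job_mem_gb <= headroom_gb:
        raise ValueError("job allocation must be larger than the headroom")
    return job_mem_gb - headroom_gb


def split_region(contig: str, start: int, end: int,
                 chunk_bp: int = 1_000_000) -> List[str]:
    """Split a 1-based, inclusive region into contig:start-end strings.

    Each returned interval covers at most chunk_bp bases.
    """
    intervals = []
    pos = start
    while pos <= end:
        chunk_end = min(pos + chunk_bp - 1, end)
        intervals.append(f"{contig}:{pos}-{chunk_end}")
        pos = chunk_end + 1
    return intervals


if __name__ == "__main__":
    # A 140GB SLURM allocation leaves a 136GB Java heap.
    print(f"-Xmx{heap_size_gb(140)}g")
    # Hypothetical 2.5 Mbp region split into 1 Mbp units of work.
    for interval in split_region("chr01", 1, 2_500_000):
        print(interval)
```

Each printed interval would become one unit of work, so halving chunk_bp doubles the number of jobs to schedule, which is why going much finer than 1 Mbp is inconvenient.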
It's possible something outside Java might be eating up RAM. Can someone confirm or deny whether GenotypeGVCFs with GenomicsDB inputs writes to typically-RAM-backed filesystems? Writing to tmpfs (such as /tmp) or /dev/shm counts towards my job's memory limit, so that should be avoided. The documentation isn't clear as to what exactly --TMP_DIR achieves, or even if it's used at all. Maybe there are other Java -D defines I could set?
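To at least rule out a bad choice of temporary directory, a candidate path can be checked against the mount table. Below is a minimal, Linux-only sketch that assumes /proc/mounts is readable on the compute node. The function names are hypothetical, and it says nothing about where GATK or GenomicsDB actually write; it only reports whether a given directory sits on tmpfs or ramfs.

```python
import os

# Filesystem types whose contents live in RAM.
RAM_BACKED_FS = {"tmpfs", "ramfs"}


def filesystem_type(path: str, mounts_text: str) -> str:
    """Return the filesystem type of the mount point that contains path.

    mounts_text is the content of /proc/mounts. path should be absolute
    with symlinks already resolved. The longest matching mount point wins,
    and on ties the later entry wins, since it is mounted on top.
    """
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_ram_backed(path: str, mounts_text: str) -> bool:
    """Return True if path is on a filesystem that is backed by RAM."""
    return filesystem_type(path, mounts_text) in RAM_BACKED_FS


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        mounts = handle.read()
    for candidate in ("/tmp", "/dev/shm"):
        resolved = os.path.realpath(candidate)
        print(candidate, filesystem_type(resolved, mounts),
              is_ram_backed(resolved, mounts))
```

If the directory passed as the temporary directory comes back as RAM-backed, pointing it at node-local or network scratch instead would keep those writes out of the job's memory accounting.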