Getting out-of-memory errors while running the workflow for germline short variant discovery

I'm trying to run the WDL posted on the gatk-workflows GitHub page, under the gatk4-germline-snps-indels repository. The WDL is "haplotypecaller-gvcf-gatk4.wdl".

I'm attempting to run this WDL locally on my computer. The script uses GATK in Docker containers to execute tools such as HaplotypeCaller and MergeVcfs. I'm using Cromwell in "run mode" to run the WDL script, with the exact inputs listed in the haplotypecaller-gvcf-gatk4.hg38.wgs.inputs.json file.

The BAM file is NA12878_24RG_small.hg38.bam, which is about 5 GB in size.
The FASTA file is Homo_sapiens_assembly38.fasta, which is about 3 GB in size.

Anytime I run this I eventually get out-of-memory errors. It seems like 50 GATK Docker containers get spun up and run HaplotypeCaller in parallel. I think this is due to the number of interval lists declared in hg38_wgs_scattered_calling_intervals.txt?
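
To put rough numbers on the concern above, here is a minimal back-of-the-envelope sketch. The 7 GB per-container figure and the 4 GB held back for the OS are assumptions for illustration only, not values read from the WDL; the real per-task number is whatever the task's runtime block requests.

```python
# Rough memory arithmetic for scattered HaplotypeCaller shards on one machine.
# per_shard_gb and reserve_gb are ASSUMED figures, not taken from the WDL.

def max_concurrent_shards(total_ram_gb: float, per_shard_gb: float,
                          reserve_gb: float = 4.0) -> int:
    """How many shards fit in RAM at once, keeping reserve_gb for the OS."""
    usable_gb = total_ram_gb - reserve_gb
    if usable_gb <= 0 or per_shard_gb <= 0:
        return 0
    return int(usable_gb // per_shard_gb)

total_ram_gb = 32    # machine RAM from the post
shards = 50          # containers observed running in parallel
per_shard_gb = 7.0   # assumption: memory each container may use

demand_gb = shards * per_shard_gb
limit = max_concurrent_shards(total_ram_gb, per_shard_gb)
print(f"All {shards} shards at once would want ~{demand_gb:.0f} GB; "
      f"{total_ram_gb} GB fits about {limit} at a time.")
```

Under those assumptions only a handful of shards fit at once, so the question becomes how to cap how many run concurrently. As far as I can tell from the Cromwell docs, the backend `concurrent-job-limit` setting is the option meant for that, but treat that as a pointer to check rather than a confirmed fix.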

I'm running it on a machine with 32 GB of RAM and 512 GB of disk space. My questions are basically:

  1. How much RAM is needed to run this workflow?
  2. Should I set a limit on how much memory each docker container can use in the Cromwell configuration file, and if so, how much should I set it to?
  3. What should the Java heap size be set to?
  4. It looks like it is using the "scatter-gather" technique for parallelization. Does this require me to set up a cluster of servers to run the workflow? I'm not sure if I can run it like this on just my local computer.

Any insight would be greatly appreciated. Thank you!

Answers

  • bhanuGandham (Cambridge MA) Member, Administrator, Broadie, Moderator

    Hi @cgazzola

    This is a question for the FireCloud team. I am going to pass this on to them and they will help you out with it.

    Regards
    Bhanu

  • SChaluvadi Member, Broadie, Moderator

    @cgazzola - I will check on the detailed memory limits etc. and get back to you ASAP!

  • cgazzola Member

    Thank you so much!

  • SChaluvadi Member, Broadie, Moderator

    @cgazzola
    I hope you were able to resolve your issue, but since we haven't heard from you, we will be closing this ticket. However, if you find that you need further assistance, please reach out and we will be more than happy to take another look.