
DepthOfCoverage memory usage

igor (New York, Member ✭✭)
edited November 2015 in Ask the GATK team

Is there a way to manage DepthOfCoverage memory usage? I am having problems when I give it a large intervals file. I can successfully run other tools like RealignerTargetCreator, IndelRealigner, and BaseRecalibrator, which seem like they would be more memory-intensive. I can also run DepthOfCoverage with --omitIntervalStatistics --omitLocusTable --omitDepthOutputAtEachBase. However, running it with just --omitDepthOutputAtEachBase gives me a memory error:
##### ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program. See the -Xmx JVM argument to adjust the maximum heap size provided to Java
Is there any way to optimize that?


  • tommycarstensen (United Kingdom, Member ✭✭✭)

    You can give GATK 4GB of memory like this:

    java7 -Xmx4000m -jar GATK.jar ...
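    For context, a fuller invocation might look like the sketch below. This is not a command from the thread; the jar name, reference, BAM, and intervals file are placeholders for your own files, and the tool syntax assumed here is GATK 3.x.

```shell
# Hypothetical GATK 3.x DepthOfCoverage run with a 4 GB Java heap.
# GenomeAnalysisTK.jar, ref.fasta, sample.bam, and targets.intervals
# are placeholders.
java -Xmx4g -jar GenomeAnalysisTK.jar \
    -T DepthOfCoverage \
    -R ref.fasta \
    -I sample.bam \
    -L targets.intervals \
    -o sample_coverage \
    --omitDepthOutputAtEachBase
```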
  • igor (New York, Member ✭✭)

    I already assign a set amount of memory to all of the tools, which is much larger than the BAM sizes, so I can do that. My question is why DepthOfCoverage is the only one that has any problems, and only with certain parameters.

  • Sheila (Broad Institute; Member, Broadie ✭✭✭✭✭)


    I suspect you have some regions of very high coverage. DepthOfCoverage performs badly on those regions because it does not apply any downsampling; the other tools handle the same regions without trouble because they do.
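To illustrate why downsampling bounds memory, here is a toy sketch (plain Python, not GATK's actual implementation) of capping the number of reads kept at a single locus via reservoir sampling. The cap of 250 is an assumption chosen only for illustration.

```python
import random

def downsample_locus(reads, cap=250, seed=0):
    """Keep at most `cap` reads at a locus (reservoir sampling), so
    per-locus memory is bounded no matter how deep the raw pileup is."""
    rng = random.Random(seed)
    kept = []
    for i, read in enumerate(reads):
        if len(kept) < cap:
            kept.append(read)
        else:
            # Replace a kept read with probability cap / (i + 1).
            j = rng.randrange(i + 1)
            if j < cap:
                kept[j] = read
    return kept

# A 100,000x pileup is reduced to the cap; shallow loci are untouched.
assert len(downsample_locus(list(range(100_000)))) == 250
assert len(downsample_locus(list(range(10)))) == 10
```

Without a cap, a pathological 100,000x region forces the walker to hold every read at that locus at once, which is exactly the kind of spike that exhausts the heap.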


  • igor (New York, Member ✭✭)

    I explicitly turn off downsampling on all tools. Also, DepthOfCoverage works, but only with certain parameters. That is the troubling part for me.

  • pdexheimer (Member ✭✭✭✭)

    So what you're really saying is that asking DoC to calculate Interval Statistics and/or Locus Tables is taking too much memory? This seems reasonable to me. The help for --omitLocusTable says that you're deciding whether to calculate "per-sample per-depth counts of loci". You've not described your data at all, but it's not hard to imagine that computing a depth histogram for every sample simultaneously might require some memory.
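To make the "per-sample per-depth counts of loci" concrete, here is a toy sketch (plain Python, not GATK internals) of the kind of table that --omitLocusTable skips building. The sample names and depths are made up; the point is only that memory grows with samples times distinct depth values observed.

```python
from collections import Counter

def locus_table(depths_by_sample):
    """For each sample, count how many loci were observed at each depth.
    One Counter entry per (sample, depth) pair, so the table grows with
    the number of samples times the number of distinct depths seen."""
    return {sample: Counter(depths)
            for sample, depths in depths_by_sample.items()}

table = locus_table({"sampleA": [0, 5, 5, 30], "sampleB": [12, 12, 12]})
assert table["sampleA"][5] == 2   # sampleA: two loci at depth 5
assert table["sampleB"][12] == 3  # sampleB: three loci at depth 12
```

With a single sample and modest depths this is tiny, but across a large intervals file with very deep regions, the set of distinct depths (and hence the table) can grow substantially.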

  • igor (New York, Member ✭✭)

    I am only using one sample at a time.
