
Retrieving metadata for large workflows

ernfrid (Saint Louis) — Member

When requesting metadata for large workflows (thousands of executions), I repeatedly get server timeouts after 20s. I believe this is a server-side timeout from Cromwell, and it seems likely that the workflow's metadata is too large to be returned within that window. I found https://github.com/broadinstitute/cromwell/issues/2519, which suggests this is a known problem. However, I'm curious whether there are settings that can be adjusted to extend the 20s timeout so that the metadata is eventually returned, if I'm patient.

Specifically, would adding something like the following to my Cromwell configuration be a good idea? Or is there a better way to retrieve the metadata? (Found at: https://github.com/broadinstitute/cromwell/blob/07a567179647abf9966b15e519280d0d5aebe13d/src/bin/travis/resources/tes_centaur.conf)

spray.can {
  server {
    request-timeout = 40s
  }
  client {
    request-timeout = 40s
    connecting-timeout = 40s
  }
}
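As an alternative to raising the server timeout, one workaround I'd consider is shrinking the response itself: Cromwell's REST metadata endpoint accepts `includeKey`/`excludeKey` query parameters to filter which metadata keys are returned. A minimal sketch, assuming a Cromwell server at a hypothetical `host` URL (note: verify that these query parameters are supported by your Cromwell version; they come from later REST API docs and may not exist in 26):

```python
# Sketch: fetch workflow metadata from Cromwell's REST endpoint, excluding
# bulky keys to shrink the payload, with a generous client-side timeout.
# The includeKey/excludeKey parameters are from the Cromwell REST API docs;
# confirm they are available in your server version before relying on them.
from urllib.parse import urlencode
from urllib.request import urlopen

def metadata_url(host, workflow_id, exclude=()):
    """Build the metadata endpoint URL, optionally excluding keys."""
    base = f"{host}/api/workflows/v1/{workflow_id}/metadata"
    if exclude:
        # Repeating excludeKey filters several keys out of the response.
        query = urlencode([("excludeKey", k) for k in exclude])
        return f"{base}?{query}"
    return base

def fetch_metadata(host, workflow_id, exclude=(), timeout=300):
    """Fetch metadata, allowing the server several minutes to respond."""
    url = metadata_url(host, workflow_id, exclude)
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

For example, excluding `executionEvents` (often the bulk of a large workflow's metadata) can make the response small enough to return before any timeout, at the cost of losing those fields.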

Compute Resources

  • Platform: Google Cloud Platform
  • Cromwell VM (version "26-22fe860-SNAP"): 4-core, 26 GB (n1-highmem-4), running Ubuntu 14.04.5 LTS.
  • MySQL VM: 2-core, 7.5 GB (n1-standard-2), running Ubuntu 16.04.2 LTS.
