Can I retry job submissions that sometimes fail?

mmah · Member, Broadie
edited April 2017 in Ask the WDL team

I am using Cromwell v26, running on SLURM. The SLURM system I am using currently suffers from an annoying bug where some job submissions fail due to a socket timeout. This error is transient, and retrying seems to always result in success.

The submit command in my backend configuration file looks like this:

submit = 
    sbatch --wrap "/bin/bash ${script}"

The sbatch command to SLURM is timing out.

The latest blog entry seems to indicate that Cromwell will retry some operations.
https://software.broadinstitute.org/wdl/blog?id=9362

Can the job submission command itself be configured to retry when it fails?
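
Independently of whatever retry support Cromwell itself offers, one possible workaround for a transient submission failure is to wrap the submit command in a small retry loop. The following is only a sketch under stated assumptions: the function name `retry_submit`, the attempt count, and the delay are hypothetical choices for illustration, not part of Cromwell or SLURM. It also assumes that a failed `sbatch` call exits with a non-zero status and did not actually enqueue the job; if a timed-out submission can still reach the scheduler, retrying may create duplicate jobs.

```shell
#!/usr/bin/env bash
# retry_submit (hypothetical helper, not part of Cromwell or SLURM):
# run a command and retry it when it exits with a non-zero status.
# Usage: retry_submit MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_submit() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1
    while true; do
        # Run the wrapped command; its stdout is passed through untouched.
        if "$@"; then
            return 0
        fi
        # Stop once the allowed number of attempts has been used up.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "retry_submit: giving up after $attempt attempts" >&2
            return 1
        fi
        # Diagnostics go to stderr so they do not mix with the command's stdout.
        echo "retry_submit: attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

With this function saved in a script that the submit command can source or call, the submission above might become something like `retry_submit 5 10 sbatch --wrap "/bin/bash ${script}"`. The values 5 and 10 are arbitrary and would need tuning to how long the socket timeouts typically last.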

Issue · GitHub
by Geraldine_VdAuwera

Issue Number: 1986
State: open