
String vs. File mistake caused Cromwell 26 workflow to hang

gauthier Member, Broadie, Moderator, Dev admin

I wrote a bad WDL (it does validate successfully, BTW), and the resulting workflow stayed in the "running" state, according to the metadata from the Swagger UI, without generating any logs or output. After I ran out of patience, Megan the Magnificent worked some ssh magic and found an error in the log:

2017-07-26 18:44:43 [cromwell-system-akka.actor.default-dispatcher-3233] WARN  c.b.i.j.s.JesPollingActor - 1 failures (from 1 requests) fetching JES statuses: {"domain":"global","message":"Pipeline 7366868947779117636: Unable to evaluate parameters: parameter \"GenotypeWithOldQual.intervals-0\" has invalid value: chr5:43185398-43975195","reason":"badRequest"}

The workflow never failed -- I had to abort it. After reviewing the WDL, I found that I passed a string (that was not a filename) to a task that expected a File. Admittedly that's my fault and I can even understand why it validated, but the fact that the failure above didn't get passed on to fail the workflow kept me waiting for quite a while. After fixing the type in the task (File intervals -> String intervals), the workflow runs successfully.
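For anyone who hits the same thing, here is roughly what the corrected task looks like. This is a sketch: only the intervals declaration differs from the failing task in this post, and everything else is copied from it unchanged.

```wdl
task GenotypeWithOldQual {
  File input_vcf
  File input_vcf_index
  File ref_fasta
  File ref_fasta_index
  # Was "File intervals". The value passed in is an interval string such as
  # "chr5:43185398-43975195", not a path, so it is declared as a String.
  String intervals

  String output_vcf_name

  command <<<
    ./gatk-launch GenotypeGVCFs -V ${input_vcf} -R ${ref_fasta} \
    -L ${intervals} -O ${output_vcf_name} 2> ${output_vcf_name}.log
  >>>
  output {
    File output_vcf = "${output_vcf_name}"
    File output_vcf_index = "${output_vcf_name}.tbi"
    File output_log = "${output_vcf_name}.log"
  }
  runtime {
    docker: "broadinstitute/gatk"
    memory: "2 GB"
    cpu: 1
    disks: "local-disk 50 HDD"
    preemptible: 3
  }
}
```

The workflow block and the inputs json need no changes, since the workflow already declares intervals as a String.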

These shenanigans went down on the dsde-methods cromwell 26 via the Swagger UI. Here's a simplified version of the failing WDL:

task GenotypeWithOldQual {
  File input_vcf
  File input_vcf_index
  File ref_fasta
  File ref_fasta_index
  # The mistake: the workflow passes an interval string here, not a file path
  File intervals

  String output_vcf_name

  command <<<
    ./gatk-launch GenotypeGVCFs -V ${input_vcf} -R ${ref_fasta} \
    -L ${intervals} -O ${output_vcf_name} 2> ${output_vcf_name}.log
  >>>
  output {
    File output_vcf = "${output_vcf_name}"
    File output_vcf_index = "${output_vcf_name}.tbi"
    File output_log = "${output_vcf_name}.log"
  }
  runtime {
    docker: "broadinstitute/gatk"
    memory: "2 GB"
    cpu: 1
    disks: "local-disk 50 HDD"
    preemptible: 3
  }
}

workflow CompareQualRuntimes {
  File ref_fasta
  File ref_fasta_index
  File input_vcf
  File input_vcf_index
  String intervals
  String output_vcf_basename

  call GenotypeWithOldQual as GTOld1 {
    input:
      input_vcf = input_vcf,
      input_vcf_index = input_vcf_index,
      ref_fasta = ref_fasta,
      ref_fasta_index = ref_fasta_index,
      intervals = intervals,
      output_vcf_name = "${output_vcf_basename}.oldQual.trial1.vcf.gz"
  }
}

...and its json:

{
  "CompareQualRuntimes.ref_fasta": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta",
  "CompareQualRuntimes.output_vcf_basename": "newQualBenchmarking_fastShard.hg38",
  "CompareQualRuntimes.ref_fasta_index": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.fai",
  "CompareQualRuntimes.input_vcf": "gs://broad-dsde-methods/louisb/genotypegvcfs_data/shard-314.vcf.gz",
  "CompareQualRuntimes.intervals": "chr5:43185398-43975195",
  "CompareQualRuntimes.input_vcf_index": "gs://broad-dsde-methods/louisb/genotypegvcfs_data/shard-314.vcf.gz.tbi"
}

Thanks!
Laura
