String vs. File mistake caused Cromwell 26 workflow to hang

gauthier Member, Broadie, Moderator, Dev admin

I wrote a bad WDL (that does successfully validate, BTW) that persisted in the "running" state according to the metadata from the Swagger UI without generating logs or output. After I ran out of patience, Megan the Magnificent worked some ssh magic and found an error in the log:

2017-07-26 18:44:43 [cromwell-system-akka.actor.default-dispatcher-3233] WARN  c.b.i.j.s.JesPollingActor - 1 failures (from 1 requests) fetching JES statuses: {"domain":"global","message":"Pipeline 7366868947779117636: Unable to evaluate parameters: parameter \"GenotypeWithOldQual.intervals-0\" has invalid value: chr5:43185398-43975195","reason":"badRequest"}

The workflow never failed -- I had to abort it. After reviewing the WDL, I found that I had passed a string (that was not a filename) to a task that expected a File. Admittedly, that's my fault, and I can even understand why it validated, but because the failure above was never passed on to fail the workflow, I was kept waiting for quite a while. After fixing the type in the task (File intervals -> String intervals), the workflow runs successfully.

These shenanigans went down on the dsde-methods Cromwell 26 server via the Swagger UI. Here's a simplified version of the failing WDL:

task GenotypeWithOldQual {
  File input_vcf
  File input_vcf_index
  File ref_fasta
  File ref_fasta_index
  File intervals

  String output_vcf_name

  command <<<
    ./gatk-launch GenotypeGVCFs -V ${input_vcf} -R ${ref_fasta} \
    -L ${intervals} -O ${output_vcf_name} 2> ${output_vcf_name}.log    
  >>>
  output {
    File output_vcf = "${output_vcf_name}"
    File output_vcf_index = "${output_vcf_name}.tbi"
    File output_log = "${output_vcf_name}.log"
  }
  runtime {
    docker: "broadinstitute/gatk"
    memory: "2 GB"
    cpu: 1
    disks: "local-disk 50 HDD"
    preemptible: 3
  }
}

workflow CompareQualRuntimes {
  File ref_fasta
  File ref_fasta_index
  File input_vcf
  File input_vcf_index
  String intervals
  String output_vcf_basename

  call GenotypeWithOldQual as GTOld1 {
    input:
      input_vcf = input_vcf,
      input_vcf_index = input_vcf_index,
      ref_fasta = ref_fasta,
      ref_fasta_index = ref_fasta_index,
      intervals = intervals,
      output_vcf_name = "${output_vcf_basename}.oldQual.trial1.vcf.gz"
  }
}

...and its json:

{
  "CompareQualRuntimes.ref_fasta": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta",
  "CompareQualRuntimes.output_vcf_basename": "newQualBenchmarking_fastShard.hg38",
  "CompareQualRuntimes.ref_fasta_index": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.fai",
  "CompareQualRuntimes.input_vcf": "gs://broad-dsde-methods/louisb/genotypegvcfs_data/shard-314.vcf.gz",
  "CompareQualRuntimes.intervals": "chr5:43185398-43975195",
  "CompareQualRuntimes.input_vcf_index": "gs://broad-dsde-methods/louisb/genotypegvcfs_data/shard-314.vcf.gz.tbi"
}
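
For comparison, here is a sketch of the task with the fix described above applied (File intervals -> String intervals). Everything else is copied from the failing version, and the comments are added only for illustration. With this declaration, the same inputs json should work unchanged, since "chr5:43185398-43975195" is then handed to the command as plain text instead of being treated as a file to localize.

```wdl
task GenotypeWithOldQual {
  File input_vcf
  File input_vcf_index
  File ref_fasta
  File ref_fasta_index
  # Was "File intervals". The value is a genomic interval
  # (e.g. chr5:43185398-43975195), not a path, so declare it as a String.
  String intervals

  String output_vcf_name

  command <<<
    ./gatk-launch GenotypeGVCFs -V ${input_vcf} -R ${ref_fasta} \
    -L ${intervals} -O ${output_vcf_name} 2> ${output_vcf_name}.log
  >>>
  output {
    File output_vcf = "${output_vcf_name}"
    File output_vcf_index = "${output_vcf_name}.tbi"
    File output_log = "${output_vcf_name}.log"
  }
  runtime {
    docker: "broadinstitute/gatk"
    memory: "2 GB"
    cpu: 1
    disks: "local-disk 50 HDD"
    preemptible: 3
  }
}
```

The workflow block does not need to change: it already declares intervals as a String, so after the fix the types match on both sides of the call.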

Thanks!
Laura
