
[WDL][Cromwell] Submitting a workflow with a subworkflow to the cloud.


I am having trouble working out how to submit a WDL workflow to the cloud when a sub-workflow is involved, and I have not been able to find anything in the forums or the spec to explain it. I am attempting to run a variant calling workflow across a large cohort of whole-exome data. The workflow is constructed as a scatter-gather over the individual samples. However, to parallelize the variant calling step with HaplotypeCaller (over intervals), I needed a nested scatter-gather, so I invoked a sub-workflow.
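For context, the structure is roughly the following (all file, workflow, and task names in this sketch are illustrative, not my actual ones): a main workflow that scatters over samples and, inside that scatter, calls an imported sub-workflow which itself scatters over intervals.

```wdl
# Hypothetical sketch of the layout; names are placeholders.
# --- main.wdl ---
import "haplotypecaller_sub.wdl" as HaplotypeCaller

workflow VariantCallingCohort {
  Array[File] sample_bams

  scatter (bam in sample_bams) {
    # ... per-sample alignment/BQSR tasks ...
    call HaplotypeCaller.HaplotypeCallerAndGatherVCFs {
      input:
        input_bam = bam
        # ... remaining inputs ...
    }
  }
}

# --- haplotypecaller_sub.wdl ---
workflow HaplotypeCallerAndGatherVCFs {
  File input_bam
  Array[File] scattered_calling_intervals

  # Inner scatter: parallelize HaplotypeCaller over interval lists.
  scatter (intervals in scattered_calling_intervals) {
    call HaplotypeCallerTask {
      input: bam = input_bam, interval_list = intervals
    }
  }
  call GatherVCFs { input: vcfs = HaplotypeCallerTask.output_vcf }
}
```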

Currently, I am submitting the jobs with the following command:

gcloud alpha genomics pipelines run \
  --pipeline-file wdl_pipeline.yaml \
  --zones us-east1-b \
  --logging gs://dfci-testgenomes/logging \
  --inputs-from-file  \
  --inputs-from-file \
  --inputs-from-file \
  --inputs WORKSPACE=gs://dfci-testgenomes/workspace \
  --inputs OUTPUTS=gs://dfci-testgenomes/outputs
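For reference, my understanding of the wdl_runner example that wdl_pipeline.yaml comes from is that it expects three file inputs; the intended form is something like the following, where the local file names are placeholders for my actual files:

```shell
gcloud alpha genomics pipelines run \
  --pipeline-file wdl_pipeline.yaml \
  --zones us-east1-b \
  --logging gs://dfci-testgenomes/logging \
  --inputs-from-file WDL=main.wdl \
  --inputs-from-file WORKFLOW_INPUTS=main.inputs.json \
  --inputs-from-file WORKFLOW_OPTIONS=main.options.json \
  --inputs WORKSPACE=gs://dfci-testgenomes/workspace \
  --inputs OUTPUTS=gs://dfci-testgenomes/outputs
```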

The following files are located in the same directory from which the above command is invoked. File 4) is my sub-workflow; in my main workflow, it is imported and called as follows:

import "" as HaplotypeCaller
call HaplotypeCaller.HaplotypeCallerAndGatherVCFs {
    input:
        input_bam = ApplyBQSR.recalibrated_bam,
        input_bam_index = ApplyBQSR.recalibrated_bam_index,
        ref_fasta = ref_fasta,
        ref_fasta_index = ref_fasta_index,
        ref_dict = ref_dict,
        gvcf_basename = inputs[1],
        scattered_calling_intervals = scattered_calling_intervals
}

However, I get an error from Cromwell upon submission that reads as:

2017-03-06 22:30:52,742 ERROR - WorkflowManagerActor: Workflow failed submission: Workflow input processing failed.
Unable to load namespace from workflow: /wdl_runner/
cromwell.engine.workflow.MaterializeWorkflowDescriptorActor$$anonfun$receive$1$$anon$1: Workflow input processing failed.
Unable to load namespace from workflow: /wdl_runner/
    at cromwell.engine.workflow.MaterializeWorkflowDescriptorActor$$anonfun$receive$1.applyOrElse(MaterializeWorkflowDescriptorActor.scala:69) ~[cromwell.jar:0.19]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:467) ~[cromwell.jar:0.19]
    at cromwell.engine.workflow.MaterializeWorkflowDescriptorActor.aroundReceive(MaterializeWorkflowDescriptorActor.scala:59) ~[cromwell.jar:0.19]
    at [cromwell.jar:0.19]
    at [cromwell.jar:0.19]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) [cromwell.jar:0.19]
    at [cromwell.jar:0.19]
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) [cromwell.jar:0.19]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec( [cromwell.jar:0.19]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask( [cromwell.jar:0.19]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker( [cromwell.jar:0.19]
    at [cromwell.jar:0.19]

It seems the issue is in identifying where the sub-workflow is located. What would be the appropriate way to submit this workflow to gcloud so that the sub-workflow can be found? Please let me know if there is any further information I can provide.
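For reference, my understanding (which may be wrong) is that newer Cromwell releases can resolve imports from a zip bundle when run directly, along these lines (file names here are placeholders):

```shell
# Bundle the sub-workflow so the engine can resolve the import statement.
zip haplotypecaller_sub.wdl

# Newer Cromwell releases accept an imports bundle on the command line.
java -jar cromwell.jar run main.wdl \
  --inputs main.inputs.json \
  --imports
```

But I do not see a way to pass such a bundle through wdl_pipeline.yaml.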

-- Derrick DeConti


Best Answer


  • deconti (DFCI Member)

    Thanks, Geraldine.

    Hopefully that support will come about soon. Meanwhile, I think I can get around the issue by mapping the files to intervals, then scattering over a TSV with the files and the intervals.
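    As a sketch of what I mean (names here are illustrative): build a manifest pairing each BAM with its interval list, then use a single scatter over the manifest rows, avoiding the nested scatter, and hence the sub-workflow, entirely.

    ```wdl
    # Hypothetical sketch: flatten the nested scatter into one scatter over a TSV.
    # manifest.tsv has one row per (bam, interval_list) pair, e.g.:
    #   gs://bucket/sample1.bam    gs://bucket/intervals/chr1.list
    workflow FlattenedCalling {
      File manifest

      # Each row is [bam, interval_list]; one scatter replaces the nested pair.
      scatter (row in read_tsv(manifest)) {
        call HaplotypeCallerTask {
          input:
            input_bam = row[0],
            interval_list = row[1]
        }
      }
    }
    ```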
