Adding GCloud labels to WDL / Cromwell tasks.
I am running my WDL on Google Cloud via the gcloud alpha genomics pipelines run command:
    gcloud alpha genomics pipelines run \
      --pipeline-file haplotypecaller.yaml \
      --logging gs://my_bucket/logging \
      --inputs-from-file WDL=haplotypecaller.wdl \
      --inputs-from-file WORKFLOW_INPUTS=inputs.json \
      --inputs-from-file WORKFLOW_OPTIONS=options.json \
      --inputs WORKSPACE=gs://my_bucket/workspace \
      --inputs OUTPUTS=gs://my_bucket/GATK_HaplotypeCaller/output \
      --label project=sample
The YAML looks like this:
    name: WDL Runner
    description: Run a workflow defined by a WDL file

    inputParameters:
    - name: WDL
      description: Workflow definition
    - name: WORKFLOW_INPUTS
      description: Workflow inputs
    - name: WORKFLOW_OPTIONS
      description: Workflow options
    - name: WORKSPACE
      description: Cloud Storage path for intermediate files
    - name: OUTPUTS
      description: Cloud Storage path for output files

    docker:
      imageName: gcr.io/broad-dsde-outreach/wdl_runner
      cmd: >
        /wdl_runner/wdl_runner.sh

    resources:
      minimumRamGb: 1
Everything runs fine, and I get my expected output. However, I would like to track the cost of each run. The labels passed to the genomics pipeline only track the VM usage of the Cromwell head node, not the worker VMs that it launches. Is there a way to apply Google Cloud labels to the VMs spawned by wdl_runner so that their usage can be tracked as well?