Adding GCloud labels to WDL / Cromwell tasks.
I am running my WDL on Google Cloud via the gcloud alpha genomics pipelines command:
gcloud alpha genomics pipelines run \
  --pipeline-file haplotypecaller.yaml \
  --logging gs://my_bucket/logging \
  --inputs-from-file WDL=haplotypecaller.wdl \
  --inputs-from-file WORKFLOW_INPUTS=inputs.json \
  --inputs-from-file WORKFLOW_OPTIONS=options.json \
  --inputs WORKSPACE=gs://my_bucket/workspace \
  --inputs OUTPUTS=gs://my_bucket/GATK_HaplotypeCaller/output \
  --label project=sample
The YAML looks like this:
name: WDL Runner
description: Run a workflow defined by a WDL file

inputParameters:
- name: WDL
  description: Workflow definition
- name: WORKFLOW_INPUTS
  description: Workflow inputs
- name: WORKFLOW_OPTIONS
  description: Workflow options
- name: WORKSPACE
  description: Cloud Storage path for intermediate files
- name: OUTPUTS
  description: Cloud Storage path for output files

docker:
  imageName: gcr.io/broad-dsde-outreach/wdl_runner

  cmd: >
    /wdl_runner/wdl_runner.sh

resources:
  minimumRamGb: 1
Everything runs fine, and I get my expected output. However, I would like to track the cost of each run. The labels passed to the genomics pipelines command are applied only to the VM for the Cromwell head node, not to the underlying VMs that Cromwell launches. Is there a way to apply Google Cloud labels to the VMs spawned by wdl_runner so that their usage can be tracked as well?