Very slow download of files from Google Cloud Storage
We're using Cromwell with Google Cloud Storage (GCS) and Google's Pipeline API. Transferring files to GCS once a task has written its outputs is extremely fast (~13 seconds for 978 files). By contrast, transferring those files to a new task (and its associated new VM) is extremely slow, about 532 seconds. This appears to be due to the way Cromwell copies files from GCS: it issues a separate gsutil cp command for each and every file.
An example copy command for a single file:
sudo gsutil -q -m cp gs://test-bucket/wdl_runner/work/cs/16738fac-5146-4a3c-9cfa-d5ded7f199fc/call-demultiplex_and_sample_prep/glob-9c1244b6ebf22abec57cd494340f8c79/CL101_invASISTR_segment_0.fasta /mnt/local-disk/test-bucket/wdl_runner/work/cs/16738fac-5146-4a3c-9cfa-d5ded7f199fc/call-demultiplex_and_sample_prep/glob-9c1244b6ebf22abec57cd494340f8c79/CL101_invASISTR_segment_0.fasta
The -m flag for multi-threaded copying is enabled, which is great, but it has no effect because each command copies only a single file. Is there any way to change the copy command so that it downloads an entire bucket at once? Or is there some other way to make the file transfer more efficient?
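To illustrate the kind of batching I have in mind, here is a minimal sketch. It relies on gsutil cp's -I option, which reads the list of source URIs from stdin, so -m can actually parallelise across objects. The helper name batch_download and the URI list file are hypothetical and are not anything Cromwell generates; I also assume all files can land in one flat local directory, since -I does not recreate the source directory layout.

```shell
#!/usr/bin/env bash
# Sketch only: copy many objects with a single gsutil invocation.
# batch_download is a hypothetical helper, not part of Cromwell.

# Usage: batch_download URI_LIST_FILE DEST_DIR
#   URI_LIST_FILE: text file with one gs:// URI per line (assumed to exist)
#   DEST_DIR:      local directory that receives every file (flat layout)
batch_download() {
  local uri_list="$1" dest_dir="$2"
  # Make sure the destination directory exists before copying into it.
  mkdir -p "$dest_dir"
  # -m parallelises across objects; -I makes cp read source URIs from stdin.
  gsutil -q -m cp -I "$dest_dir/" < "$uri_list"
}

# Example call (paths are placeholders):
#   batch_download uris.txt /mnt/local-disk/inputs
```

When every input sits under one prefix, as with the glob directory above, a single wildcard copy of the form gsutil -m cp "gs://BUCKET/PREFIX/*" DEST_DIR/ should achieve the same thing without a list file. What I don't know is whether Cromwell can be configured to emit either form.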