Predefined vs Custom Machine types

ernfrid, Saint Louis, Member
edited September 2018 in Ask the Cromwell + WDL Team

I've recently updated to Cromwell 34 with the v2alpha pipelines API and I now see that all of my tasks are being submitted as custom machine types.

For example, I recently ran a workflow where one of my tasks requests 16 cores and 15GB of RAM. This resulted in a custom machine type with 16 cores and 14848 MiB of RAM when it could/should have easily been an n1-highcpu-16 type. In fact, I have yet to observe a predefined machine type being spawned since I updated.

Is this expected behavior? If I'd like to spawn tasks using the predefined types, what would I need to do?



  • Ruchi, Member, Broadie, Moderator, Dev admin

    Hey @ernfrid,

    Cromwell purposefully requests custom machine types with the v2alpha API. The reason is described in the migration doc here:

    Standard machine type names (for example, n1-standard-1) or custom types (for example, custom-1-4096) can be used. If a custom type is specified that maps directly onto a cheaper standard type, Compute Engine will use the standard type automatically.

    By always requesting custom machines, we hope to get the best-priced machines possible. If the CPU/memory requirements of a custom machine match those of a standard type, the Pipelines API uses the standard machine automatically. Let me know if that helps. Thanks!

  • ernfrid, Saint Louis, Member

    Thanks @Ruchi. I've yet to see this happen, but let me try it directly and also test different units. I suspect I'm falling prey to misspecifying the requirements such that they are very close to a predefined type, but not exactly identical.

  • ernfrid, Saint Louis, Member

    Hi @Ruchi,
    I managed to test this today and the results are uneven. It seems it is not always possible to obtain a predefined machine type, due either to rounding in Cromwell or to issues in the underlying Pipelines API. Details on my tests are below. Is there a way to specify the memory requirements in Cromwell such that they won't be adjusted and miss what is needed to use a predefined type?



    I was able to list the exact memory requirements in MiB of the predefined machine types here: https://cloud.google.com/compute/docs/reference/rest/v1/machineTypes/list . I then attempted to target various ones using both Cromwell and the gcloud command-line interface to the pipelines API.

    1. g1-small - When specifying either 1.7 GiB or 1740 MiB, Cromwell rounded this up to a request for 1792.0 MiB and thus I got a custom machine type. The gcloud CLI submitted an operation to the Pipelines API, but this operation failed with the error: "Execution failed: creating instance: inserting instance: Invalid value for field 'resource.machineType': 'zones/us-central1-f/machineTypes/custom-1-1740'. Memory should be a multiple of 256MiB in zone us-central1-f for custom machine type, while 1740MiB is requested." It appears to me that it isn't currently possible to specify a g1-small instance via the Pipelines API.
    2. n1-standard-1 - Worked successfully in both cases (as 3.7 GiB or 3840 MiB). The logs seem to indicate a custom machine, but the machine type of the spawned VMs is, in fact, n1-standard-1. This also worked with gcloud (specifying --memory 3.840).
    3. n1-highcpu-4 - Failed to grab a predefined instance with Cromwell (3.6 GiB or 3686 MiB), but succeeded with gcloud (--memory 3.686). For Cromwell, it looks like the memory request was rounded to 3840.0 MiB in both cases. Note that precision matters here. With gcloud, specifying 3.6 generates an error.
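
The rounding seen in the three tests above is consistent with rounding the request up to the next multiple of 256 MiB, the granularity named in the API error message. A minimal sketch under that assumption follows; it is not Cromwell's actual code, and the function and constant names are hypothetical.

```python
import math

# Granularity taken from the error message quoted in test 1; assumed to be
# the step Cromwell rounds to, which this thread does not confirm.
CUSTOM_MEMORY_STEP_MIB = 256


def round_up_to_step(memory_mib, step=CUSTOM_MEMORY_STEP_MIB):
    """Round a memory request (in MiB) up to the next multiple of `step`."""
    return math.ceil(memory_mib / step) * step


# Requests from the three tests: g1-small, n1-standard-1, n1-highcpu-4.
for requested in (1740, 3840, 3686):
    print(requested, "MiB ->", round_up_to_step(requested), "MiB")
```

Under this assumption, only a predefined type whose memory is already a multiple of 256 MiB (3840 MiB here) survives the rounding unchanged, which matches the results reported for the three tests.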

    A few notes and oddities for anyone else coming across this:
    1. The memory specifications for the predefined types are actually in mebibytes (MiB). You likely will need to figure out the exact number of MiB if you want this to work.
    2. The gcloud command line appears to simply multiply the memory request by 1000 to generate a requested number of MiB (for example, --memory 3.840 becomes 3840 MiB).
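
To check a value before submitting, one could compare the resulting MiB figure against the predefined sizes. The sketch below assumes the gcloud --memory value is scaled by a factor of 1000 to obtain MiB, as the 3.840 to 3840 MiB example suggests. The table holds only the three sizes reported in this thread, the function names are hypothetical, and CPU counts are ignored.

```python
# Memory sizes in MiB as reported in this thread; not an authoritative list.
PREDEFINED_MEMORY_MIB = {
    "g1-small": 1740,
    "n1-standard-1": 3840,
    "n1-highcpu-4": 3686,
}


def gcloud_memory_to_mib(memory_flag):
    """Assumed conversion: the --memory value times 1000, read as MiB."""
    return round(memory_flag * 1000)


def matching_predefined(memory_mib):
    """Return the predefined types in the table with exactly this memory."""
    return [name for name, mib in PREDEFINED_MEMORY_MIB.items() if mib == memory_mib]


for flag in (3.840, 3.686, 3.6):
    mib = gcloud_memory_to_mib(flag)
    print(flag, "->", mib, "MiB ->", matching_predefined(mib) or "no exact match")
```

A value such as 3.6 maps to 3600 MiB, which matches no entry in the table and is not a multiple of 256 MiB either, which may explain the error reported above for that value.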
