Setting Up a Cromwell VM


My group is interested in setting up a dedicated Cromwell VM and would like advice on storage, CPU, and memory recommendations for it. We'll be running a number of different pipelines on this VM, including a GATK pipeline. Would anyone be able to make a recommendation on any of the following?

OS: RHEL 6 vs 7? Are there any issues running Cromwell on RHEL 7?

How many CPUs? We can request up to 24.

How much memory? We can request up to 32 GB.

How much storage space? We can have from 80 GB to 500 GB.

Obviously we don't want more of all of these things just for the sake of having more, but we want enough to run pretty much any pipeline we develop without problems. Any advice would be appreciated.



  Sheila Broad Institute

    @[email protected]

    I just moved your question to the WDL team. First, let me ask if it is possible for you to work on FireCloud? Setting up a dedicated Cromwell VM will cause unnecessary work if you can just do your work in the cloud.


  [email protected] Member

    We've set up our own Cromwell instances before, but on regular machines. We do plan to eventually move to the cloud, but we aren't ready to do that just yet.

  Geraldine_VdAuwera Cambridge, MA

    Hi @[email protected].orge, to be frank your question is a bit too open-ended for our team to be able to answer it effectively. The answer depends so much on the volume and size of the analyses that you plan to run, and it's not something we're geared toward since we're now very much focused on cloud execution, where you just request what you need at any given time.

    That being said, our collaborators at Intel have been developing reference architectures to tackle exactly this problem. Let me find out if they could provide you with some assistance. I'll be in touch with you over email shortly.

  [email protected] Member
    edited June 2017

    Thank you, Geraldine. We got some advice from someone on the Cromwell team to go with 2 CPUs and 8 GB of RAM, so we put in the request for that. We can always adjust if you do find out something different.

  Geraldine_VdAuwera Cambridge, MA

    Oh, I just realized that I misunderstood your question -- I thought you were looking for recommendations on provisioning enough machines for executing the workflows, but I realize now you were just looking to spec out the machine that only runs Cromwell. My bad! We get the first form of the question a lot so I jumped to conclusions... The advice you got sounds reasonable.

