Processing samples successively

Hi guys,

I looked in the docs and in the forum and found nothing about running different samples successively. What is the way to go?

Writing a bash script that starts the jobs one after another? Running WDL in server mode and limiting the number of concurrent WDL pipelines? Limiting the number of concurrent processes and starting multiple samples at the same time? The problem is that I only have limited resources and want to squeeze the best out of them. Also, I don't want to start every job by hand.

Thank you!

Greetings EADG

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
    We don't really have recommendations for that because it depends so much on your use case. One simple option is to use subworkflows, which are now supported in Cromwell. I'm not sure about the best way to allocate resources; you might need to experiment a bit and dig into Cromwell's options. We don't have many docs on this yet, as far as I know, but @KateVoss may have some pointers when she comes back from vacation.
  • EADG, Kiel (Member)

    Hi @KateN and @Geraldine_VdAuwera,

    Since I have a nearly linear workflow, I decided to run in server mode and it is amazing, I'm loving it :blush: . Some tips for other users who want to try out server mode:

    Edit application.conf and add:

        system.max-concurrent-workflows = 2
        system.new-workflow-poll-rate = 120s
        system.max-workflow-launch-count = 2
    

    You can alter the numbers based on your machine. With this configuration the Cromwell server starts up to 2 workflows at a time until it reaches 2 concurrent workflows, and it looks for new workflows every 2 minutes. (I only have a small server :wink: ).

    Start Cromwell in server mode:

    java -Dconfig.file=/path/to/application.conf -jar cromwell.jar server
    

    The default port is 8000 (you can change it in the application.conf file or on the command line: java -Dwebservice.port=8080 -jar cromwell.jar server )

    Submit a workflow via curl (usually already installed on Debian):

    curl -v "localhost:8000/api/workflows/v1" -F wdlSource=@/path/to/your/workflow.wdl -F workflowInputs=@/path/to/your/input.json
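
    To avoid starting every sample by hand, the same curl call can be wrapped in a loop. This is only a minimal sketch: the function name submit_all, the CROMWELL_URL variable and the layout of one inputs JSON per sample in a single directory are assumptions for illustration, not something Cromwell prescribes.

    ```shell
    # Sketch: submit one workflow per inputs file to a running Cromwell server.
    # CROMWELL_URL and submit_all are hypothetical names chosen for this example.
    CROMWELL_URL="${CROMWELL_URL:-http://localhost:8000}"

    submit_all() {
        wdl="$1"         # path to the workflow file
        inputs_dir="$2"  # directory holding one inputs JSON per sample
        for inputs in "$inputs_dir"/*.json; do
            # Skip the unexpanded pattern when the directory has no JSON files
            [ -e "$inputs" ] || continue
            echo "Submitting $inputs"
            curl -s "$CROMWELL_URL/api/workflows/v1" \
                -F "wdlSource=@$wdl" \
                -F "workflowInputs=@$inputs"
            echo
        done
    }
    ```

    Calling submit_all /path/to/your/workflow.wdl with a directory of inputs files would hand all samples to the server in one go; the server should then work through them within the max-concurrent-workflows limit set in application.conf.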
    

    You can check the status of your workflow in a browser:

    ipToYourMachine:8000/api/workflows/v1/Workflow-Id/status
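
    The same status check can be scripted instead of done in a browser. The sketch below assumes the versioned path /api/workflows/v1/Workflow-Id/status and a JSON response containing a "status" field; CROMWELL_URL, POLL_SECONDS and the two function names are made up for this example, and the sed-based parsing is deliberately naive.

    ```shell
    # Sketch: poll the status endpoint until the workflow reaches a final state.
    # CROMWELL_URL, POLL_SECONDS and both function names are assumptions.
    CROMWELL_URL="${CROMWELL_URL:-http://localhost:8000}"
    POLL_SECONDS="${POLL_SECONDS:-60}"

    # Read a status response such as {"id":"...","status":"Running"} on stdin
    # and print only the status value.
    extract_status() {
        sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
    }

    # Block until the workflow with the given ID has finished.
    # Returns 0 for Succeeded and 1 for Failed or Aborted.
    wait_for_workflow() {
        id="$1"
        while :; do
            status=$(curl -s "$CROMWELL_URL/api/workflows/v1/$id/status" | extract_status)
            case "$status" in
                Succeeded) echo "$status"; return 0 ;;
                Failed|Aborted) echo "$status"; return 1 ;;
            esac
            # Not finished yet (or the server did not answer): try again later
            sleep "$POLL_SECONDS"
        done
    }
    ```

    A call like wait_for_workflow Workflow-Id could then be used in a script to wait for one sample before doing something with its output.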
    

    Another nice feature is the Gantt chart, which helps to identify bottlenecks in your workflow:

    ipToYourMachine:8000/api/workflows/v1/Workflow-Id/timing
    

    You can get the workflow ID from the server's response when you submit a workflow, or with the query function in your browser:

    http://194.94.189.57:8000/api/workflows/v1/query
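
    The query endpoint can also be called from a script, for example to list the IDs of workflows in a given state. This is a rough sketch: it assumes the response contains a "results" array whose entries each carry an "id" field, and that the endpoint accepts a status filter as a query parameter. CROMWELL_URL and the function names are hypothetical, and a real JSON tool such as jq would be more robust than the tr and sed parsing used here.

    ```shell
    # Sketch: list workflow IDs known to the server via the query endpoint.
    # CROMWELL_URL and both function names are assumptions for this example.
    CROMWELL_URL="${CROMWELL_URL:-http://localhost:8000}"

    # Read a query response on stdin and print one workflow ID per line.
    list_workflow_ids() {
        # Put every JSON field on its own line, then keep the values of "id" keys
        tr ',{}' '\n\n\n' | sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
    }

    # Print the IDs of all workflows in the given state, e.g. Running.
    ids_with_status() {
        curl -s "$CROMWELL_URL/api/workflows/v1/query?status=$1" | list_workflow_ids
    }
    ```

    For instance, ids_with_status Running | wc -l would give a quick count of how many workflows the server is currently working on.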
    

    It may also help to set labels on your workflows for easier identification.

    For docs, see https://github.com/broadinstitute/cromwell#rest-api (where I got most of this from).

    Greetings EADG
