Best way to pass a parameter into a method

I'm setting up a method that is based on an executable that has multiple functions. When run from the command line, the user chooses the function by passing in a string. For example: bamUtil trimBam -L 30 tells bamUtil to trim 30 bases from each read in a bam file.

What is the best way to provide these instructions to the executable in FireCloud? I could hardwire them into the docker image, but then we'd need a new image each time we wanted to perform a different function or make a minor adjustment. Alternatively, the parameters could be included as attributes of the sample (since the bam file name will be an attribute of the sample). Then we would need to change the sample attributes each time we wanted to run a variation.

My method configuration template has a line that looks like this:
myMethod.bamUtilTask.tool:(String) Expression
because I've built the docker image to accept the function as a variable called "tool".

Ideally, I'd like to set the value of the "tool" string here in the method configuration. But when I try to do that I get an error message.

Thank you

Answers

  • esalinas (Broad) Member, Broadie ✭✭✭

    Hi Dave

    Passing the bamUtil function (for example "trimBam") in as a string seems like a good idea. You haven't shared much detail about your WDL or your docker image, so it's hard to tell what is and isn't working.

    It is possible to do something like what you want. For example, I made a small WDL, "dynaWDL" (pasted below), in which arbitrary commands can be run. The command itself is a method config input.

    You mention docker. Note that Cromwell/FireCloud places some requirements on the docker image: specifically, a shell must be invokable, and the ENTRYPOINT/CMD of the image must not prevent the shell from being invoked. These posts might be helpful:

    http://gatkforums.broadinstitute.org/wdl/discussion/7959/docker-usable-by-cromwell-what-are-the-requirements-bin-bash-nor-bin-sh-usable#latest

    http://gatkforums.broadinstitute.org/wdl/discussion/8054/image-entrypoint-requirement#latest

    One thing I've done in the past to make a tool runnable is to start a new Dockerfile/image based on the tool's image ("FROM THE_TOOL"), where the only other line in the file is either a CMD line or an ENTRYPOINT line that makes a shell invokable/usable by FireCloud.

    -eddie

    ==> dynaWDL.wdl <==
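    task dynaTask {
        # Full shell command to run, supplied as a method config input
        String dynaCmd

        command <<<
            ${dynaCmd}
        >>>

        runtime {
            docker: "ubuntu"
        }
    }

    workflow dynaWorkflow {
        String dynaCmd

        # Pass the workflow-level input straight through to the task
        call dynaTask {
            input:
                dynaCmd = dynaCmd
        }
    }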
    
  • esalinas (Broad) Member, Broadie ✭✭✭

    I want to mention to any interested party, such as Chet Birger or Jason Neff, who may monitor these boards: I tried to upload the WDL file instead of pasting it inline as I've done, but the forum system would not let me.

  • esalinas (Broad) Member, Broadie ✭✭✭

    When I tried to upload "dynaWDL" I got a message : "(dynaWDL.wdl) Uploaded file type is not allowed."

  • daveM (Boston) Member

    Thanks, Eddie. I'm very new to this, so I'm going to have to work on it for a while to figure out whether your example helps. In the meantime I can add my WDL and input JSON file. It works fine in Cromwell using the JSON file; I just don't know how to provide the input on the FireCloud platform (using a method config). I can't attach a file to the forum post, so I'm adding them inline below. My docker image is a plain CentOS image with an executable called "bam" in a location on $PATH. (I still need to add the image to dockerhub; I haven't gotten that far yet.)
    --------------------wdl-------------------
    task bamUtilTask {
        File input_bam
        # bamUtil function to run, e.g. trimBam
        String tool
        # Extra arguments for that function, e.g. -L 30
        String param
        String output_bam

        command <<<
            bam ${tool} ${input_bam} ${output_bam} ${param}
        >>>

        output {
            File out = "${output_bam}"
        }

        runtime {
            docker: "bam-util:latest"
        }
    }

    workflow SomaticVariantCaller {
        call bamUtilTask
    }
    ----------------------json---------------------------
    {
        "SomaticVariantCaller.bamUtilTask.input_bam": "HCC1143-21.bam",
        "SomaticVariantCaller.bamUtilTask.tool": "trimBam",
        "SomaticVariantCaller.bamUtilTask.param": "-L 30",
        "SomaticVariantCaller.bamUtilTask.output_bam": "HCC1143-21.trim.bam"
    }
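
As a rough illustration of what the command section in the WDL above does with these JSON inputs, here is a minimal Python sketch. It only mimics the string interpolation: it relies on the coincidence that WDL's ${name} placeholders look like Python's string.Template placeholders, and the helper name render_command is made up for this example. In a real Cromwell or FireCloud run the File input would be replaced by a localized path rather than the bare file name.

```python
from string import Template

# Inputs copied from the JSON file in the post above.
inputs = {
    "SomaticVariantCaller.bamUtilTask.input_bam": "HCC1143-21.bam",
    "SomaticVariantCaller.bamUtilTask.tool": "trimBam",
    "SomaticVariantCaller.bamUtilTask.param": "-L 30",
    "SomaticVariantCaller.bamUtilTask.output_bam": "HCC1143-21.trim.bam",
}

# The command line from bamUtilTask, written as a Python template.
COMMAND_TEMPLATE = "bam ${tool} ${input_bam} ${output_bam} ${param}"


def render_command(template, qualified_inputs):
    """Drop the 'workflow.task.' prefix from each key and fill in the template.

    Hypothetical helper for illustration only; this is not how Cromwell
    evaluates a command section internally.
    """
    short_names = {
        key.rsplit(".", 1)[-1]: value for key, value in qualified_inputs.items()
    }
    return Template(template).substitute(short_names)


command = render_command(COMMAND_TEMPLATE, inputs)
print(command)
# prints: bam trimBam HCC1143-21.bam HCC1143-21.trim.bam -L 30
```

The point of keeping "tool" and "param" as separate String inputs is visible here: changing the bamUtil function or its arguments only changes the input values, not the WDL or the docker image.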

  • esalinas (Broad) Member, Broadie ✭✭✭

    Hi Dave,

    You've provided some helpful additional information.

    You must push the docker image to dockerhub before attempting to run it in FireCloud. If it's not in dockerhub, then FireCloud can't pull it to run the command in it. Furthermore, if the image is set as "Private" there, then "firecloud" must be added as a "collaborator" for it to be readable/pullable by FireCloud.

    -eddie

  • daveM (Boston) Member

    I haven't tried to run it yet - just trying to get things set up. But sure, I'll go ahead and push it.

  • esalinas (Broad) Member, Broadie ✭✭✭

    I would recommend giving the full path of the "bam" executable.

    Also, you wrote:

        "Ideally, I'd like to set the value of the "tool" string here in the method configuration. But when I try to do that I get an error message."

    But you did not say what the error message is.

    Your WDL seems valid, however.

    -eddie

  • daveM (Boston) Member

    Yes, the WDL works in Cromwell. Sorry, I didn't include the error message earlier. Below, I've copied and pasted from the Method Configuration page so you can see the error. When I first import the method to my workspace, the default Method Configuration has an error message on each line saying: Failed at line 1, column 1: string matching regex `^\".*\"$' expected but `'' found.

    I added "bam" to the line for input_bam based on this understanding: the root entity for the method is a sample. I loaded a data file for samples with an attribute (column) called "bam", and I put the google-bucket URL for the bam file in the bam column of the sample file. But how do I designate that the tool should be "trim"? Does that have to go in the sample file too?

    This seems like a very formal way to do it - am I over-complicating it?

    Root Entity Type
    sample
    Inputs
    SomaticVariantCaller.bamUtilTask.input_bam: (File)this.bam
    SomaticVariantCaller.bamUtilTask.tool: (String)'trimBam'
        Failed at line 1, column 1: string matching regex `^\".*\"$' expected but `'' found
    SomaticVariantCaller.bamUtilTask.param: (String)expression
        Failed at line 1, column 1: string matching regex `^\".*\"$' expected but `e' found
    SomaticVariantCaller.bamUtilTask.output_bam: (String)expression
        Failed at line 1, column 1: string matching regex `^\".*\"$' expected but `e' found

  • esalinas (Broad) Member, Broadie ✭✭✭

    You can also supply inputs via the data entity model, as you do for the bam, which seems fine to me. For the tool you could use the data entity model too, but it might be better to supply it either as an immediate (literal) value in the method config or as a workspace annotation key/value.

  • daveM (Boston) Member

    OK! That was the missing piece. I can put the string literal into the Method Config, but it has to be in double quotes. I tried without quotes and with single quotes, but I hadn't tried double quotes. That eliminates the "Failed at line 1, column 1: string matching regex `^\".*\"$' expected but `e' found" error.

    I'll have more issues to get through, but I will put them in a separate thread.

    Thanks!!
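
To make the double-quote requirement concrete, here is a small Python sketch that checks candidate method config values against the pattern that appears in the error message, ^".*"$. This is only a model of the string-literal rule quoted in this thread, not FireCloud's actual validator, and the function name looks_like_string_literal is invented for the example. Expressions such as this.bam are accepted by FireCloud through a different rule that this sketch does not cover.

```python
import re

# Pattern taken from the error message quoted in this thread:
# a value counts as a string literal only if it starts and ends
# with a double quote.
STRING_LITERAL = re.compile(r'^".*"$')


def looks_like_string_literal(value):
    """Return True if value is wrapped in double quotes (illustrative check only)."""
    return STRING_LITERAL.match(value) is not None


# The three attempts described in the thread, plus a quoted parameter string.
attempts = ["trimBam", "'trimBam'", '"trimBam"', '"-L 30"']
results = {value: looks_like_string_literal(value) for value in attempts}

for value, ok in results.items():
    print(value, "accepted" if ok else "rejected")
```

Under this model the bare and single-quoted values are rejected at the first character, which matches the "Failed at line 1, column 1" position in the error, while the double-quoted values pass.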

  • esalinas (Broad) Member, Broadie ✭✭✭

    You're welcome!
