Problem running GATK workflow (locally) from your tutorial

dodausp (Denmark, Member)
edited August 1 in Ask the GATK team

Hi there,
I am familiar with the GATK Best Practices, but I have only now started actually running your code, while also learning the Unix/Linux command line. So I apologize for the basic question: how do I run the workflows from your tutorial on running GATK workflows from the git repository?
I have installed Docker and it seems to be working fine (I followed your setup guides). Now, when I try to run the workflow with

java -jar cromwell-33.1.jar run ./seq-format-validation/validate-bam.wdl --inputs ./seq-format-validation/validate-bam.inputs.json

I get some warning messages and one error message, as follows:

[warn] SingleWorkflowRunnerActor: received unexpected message: Done in state RunningSwraData
[warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[warn] Local [ae4b78d6]: Key/s [memory, cpu, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[error] WorkflowManagerActor Workflow ae4b78d6-cf4c-4896-90aa-a88c1b2d2c87 failed (during ExecutingWorkflowState): cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor$$anonfun$1$$anon$1: Call input and runtime attributes evaluation failed for ValidateBAM:
java.nio.file.NoSuchFileException: /Desktop/gatk_example_github/inputs/wgs_bam_NA12878_24RG_hg38_NA12878_24RG_small.hg38.bam
    at cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor$$anonfun$1.applyOrElse(JobPreparationActor.scala:65)

I tried troubleshooting it by reading some of the posts on the forum (i.e. Q1 and Q2), but I am still struggling to identify and fix the issue.
My strongest guess is that I have to set the arguments for ValidateBamsWf.ValidateBAM.gatk_path_override and ValidateBamsWf.gatk_docker_override, but what should I add there?

Any help is greatly appreciated.

Thanks!

PS: (1) I have mounted the volume containing the folders, sub-folders and files into my broadinstitute/gatk:4.1.2.0 Docker container, at the preset folder my_data;
(2) I have also tried running the workflow from inside the container (/gatk).


Answers

  • bshifaw (admin; Member, Broadie, Moderator)

    Hi @dodausp
    Mind posting the contents of your json file?
    Also can you confirm that /Desktop/gatk_example_github/inputs/wgs_bam_NA12878_24RG_hg38_NA12878_24RG_small.hg38.bam exists and the path is correct?
    What operating system are you running this on?

  • dodausp (Denmark, Member)
    edited August 5

    Hi @bshifaw
    Thanks for the feedback. As you can see, the file and path do exist:

    ~$ ls -l Desktop/gatk_example_github/inputs/
    total 4885096
    -rw-rw-r-- 1 doc doc 5002333638 Jul 30 14:50 wgs_bam_NA12878_24RG_hg38_NA12878_24RG_small.hg38.bam

    And here is the content of the json file:

    {
      "##Comment1":"Input",
      "ValidateBamsWf.bam_array": [
            "/Desktop/gatk_example_github/inputs/wgs_bam_NA12878_24RG_hg38_NA12878_24RG_small.hg38.bam"],
      "##Comment2":"Parameter",
      "ValidateBamsWf.ValidateBAM.validation_mode": "SUMMARY",
    
      "##Comment3":"Runtime - uncomment the lines below and supply a valid docker container to override the default",
      "ValidateBamsWf.ValidateBAM.machine_mem_gb": "1 GB",
      "ValidateBamsWf.ValidateBAM.disk_space_gb": "100",
      "##ValidateBamsWf.ValidateBAM.gatk_path_override": "String (optional)",
      "##ValidateBamsWf.gatk_docker_override": "String (optional)"
    }
    

    Thanks again.
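
The path question raised above can also be checked programmatically. What follows is a minimal sketch, assuming Python 3 is available; the helper names `missing_inputs` and `home_relative_guess` are hypothetical and not part of GATK or Cromwell. It lists every entry of `ValidateBamsWf.bam_array` that does not exist as a file. One possible reading, which is an assumption and not something the thread confirms: the `ls` listing above was run from the home directory (`~$`) with a relative path, while the JSON uses `/Desktop/...`, which is resolved from the filesystem root. The second helper shows where the file would be if the path was meant to be relative to the home directory.

```python
import json
from pathlib import Path


def missing_inputs(inputs_text, key="ValidateBamsWf.bam_array"):
    """Return the paths listed under `key` that are not existing files."""
    inputs = json.loads(inputs_text)
    return [p for p in inputs.get(key, []) if not Path(p).is_file()]


def home_relative_guess(path):
    """Reinterpret a root-anchored path as relative to the home directory.

    This is only a guess at what the author of the path intended.
    """
    return Path.home() / path.lstrip("/")
```

To use it, read the inputs file into a string, pass it to `missing_inputs`, and for each reported path test whether `home_relative_guess(path).is_file()` is true.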

  • bshifaw (admin; Member, Broadie, Moderator)
    edited August 9

    That's strange. The error you are getting, java.nio.file.NoSuchFileException, looks like Cromwell attempting to stream in data from the cloud, which shouldn't be the case since you are providing a local directory path and the executed command doesn't include a configuration file pointing to the cloud.

    My strongest guess is that I have to set the arguments for ValidateBamsWf.ValidateBAM.gatk_path_override and ValidateBamsWf.gatk_docker_override, but what should I add there?

    You shouldn't need to worry about this unless you want to use a different GATK Docker version, e.g. broadinstitute/gatk:4.1.1.0.

    I just noticed the PS statements. You shouldn't be spinning up a Docker container to execute the Cromwell command; Cromwell will do this automatically for the workflow. It also wouldn't work, because Cromwell needs to spin up Docker containers to run this workflow, and running Docker within Docker is not possible.

    Try working with a bare-bones version of the JSON file. You should only need the following:

    {
      "ValidateBamsWf.bam_array": ["/Desktop/gatk_example_github/inputs/wgs_bam_NA12878_24RG_hg38_NA12878_24RG_small.hg38.bam"],
      "ValidateBamsWf.ValidateBAM.validation_mode": "SUMMARY"
    }
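
Hand-edited JSON is easy to break with a stray comma. As a minimal sketch, assuming Python 3 is available, the inputs can instead be built as a dictionary and serialized with the standard `json` module, which always produces syntactically valid JSON. The helper names `build_inputs` and `write_inputs` are hypothetical; the keys are the ones from the inputs file in this thread, and the optional override key only matters when a different GATK image is wanted.

```python
import json


def build_inputs(bam_paths, validation_mode="SUMMARY", docker_override=None):
    """Build the inputs mapping for the ValidateBamsWf workflow as a dict."""
    inputs = {
        "ValidateBamsWf.bam_array": list(bam_paths),
        "ValidateBamsWf.ValidateBAM.validation_mode": validation_mode,
    }
    if docker_override is not None:
        # Optional: only set this to run a different GATK Docker image.
        inputs["ValidateBamsWf.gatk_docker_override"] = docker_override
    return inputs


def write_inputs(inputs, out_path):
    """Serialize the inputs to `out_path`; json.dump never emits trailing commas."""
    with open(out_path, "w") as handle:
        json.dump(inputs, handle, indent=2)
        handle.write("\n")
```

For example, `write_inputs(build_inputs(["/path/to/sample.bam"]), "validate-bam.inputs.json")` writes a two-key file equivalent to the one above, where `/path/to/sample.bam` is a placeholder for the real BAM location.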
    

    If you get an error, put the entire terminal output, including the error, into a file and post it to this forum thread. Also attach stderr and stdout if available.
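
One way to collect the full terminal output in a file is sketched below, assuming Python 3 is available. The helper name `run_and_log` and the log file name are hypothetical; the command list simply repeats the Cromwell invocation from the original question. Standard output and standard error are merged so that warnings and errors appear in the order they were printed.

```python
import subprocess

# The Cromwell invocation from the original question, split into arguments.
CROMWELL_COMMAND = [
    "java", "-jar", "cromwell-33.1.jar", "run",
    "./seq-format-validation/validate-bam.wdl",
    "--inputs", "./seq-format-validation/validate-bam.inputs.json",
]


def run_and_log(command, log_path):
    """Run `command`, writing its combined stdout and stderr to `log_path`.

    Returns the exit code of the command.
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode
```

Calling `run_and_log(CROMWELL_COMMAND, "cromwell_run.log")` from the directory that holds the jar would leave a single file that can be attached to the thread.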

  • dodausp (Denmark, Member)

    Thank you again, @bshifaw
    I tried your cleaned-up suggestion, and it did not work. I am pretty lost.

    Please see the full error message attached. The file contains:
    1. checking the directory structure;
    2. showing the content of "validate-bam.inputs.json" file; and
    3. the error message

    Thank you a lot for the help!

  • bshifaw (admin; Member, Broadie, Moderator)
    edited August 9 Accepted Answer

    There was an extra comma after "Summary" in the input json. I removed it from my previous comment; please remove the "," after "Summary" in your input.json and try again.

    Also, just to check, please cd into the directory holding your input bam, enter pwd to print the working directory, and post the result.
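
A trailing comma like the one described above can be caught before launching Cromwell. The following is a minimal sketch, assuming Python 3 is available; the helper name `check_json` is hypothetical. It relies on the standard `json` module, which rejects a comma placed directly before a closing brace, and reports where parsing failed.

```python
import json


def check_json(text):
    """Return None if `text` is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Same key as in the thread, once without and once with a trailing comma.
GOOD = '{"ValidateBamsWf.ValidateBAM.validation_mode": "SUMMARY"}'
BAD = '{"ValidateBamsWf.ValidateBAM.validation_mode": "SUMMARY",}'
```

`check_json(GOOD)` returns `None`, while `check_json(BAD)` returns a message pointing at the offending position. The same check is available from the command line as `python3 -m json.tool validate-bam.inputs.json`.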

  • dodausp (Denmark, Member)

    @bshifaw said:
    There was an extra comma after "Summary" in the input json. I removed it from my previous comment; please remove the "," after "Summary" in your input.json and try again.

    Also, just to check, please cd into the directory holding your input bam, enter pwd to print the working directory, and post the result.

    Many thanks, @bshifaw!
    The problem seemed to lie in the extra comma.
    I ran the test code just now, and it is producing output, although I am still getting those warning messages.

    As a follow-up, I will post another question about getting started with the workflows. More specifically, I am trying to use your "gatk4-cnn-variant-filter" workflow with BAM and BAI files as input.
    It will come shortly.
    And thanks again for the help!