Parameters in workflow input json not being passed to workflow WDL?

amr@broadinstitute.org Member, Broadie

Hi,

I'm trying to put together a GATK WDL and can't figure out what I'm doing wrong. Here's the error:

  "failures": [
    {
      "causedBy": [
        {
          "causedBy": [],
          "message": "Required workflow input gatk.samples_file not specified."
        },
        {
          "causedBy": [],
          "message": "Required workflow input gatk.reference not specified."
        }
      ],

Here is my workflow input:

{
  "samples_file": "/cil/shed/sandboxes/amr/dev/gatk_pipeline/data/small/input.tsv",
  "picard_path": "/cil/shed/apps/external/picard/current/bin/picard.jar",
  "reference": "/cil/shed/sandboxes/amr/dev/gatk_pipeline/data/small/CneoH99_supercont2.1_200k_300k.fa",
  "output_dir": "/cil/shed/sandboxes/amr/dev/gatk_pipeline/output/small_test",
  "overwrite": "true"
}

and here is my partial WDL:

# GATK WDL V0
# Subset 1

task IndexReference {
    File reference # from JSON
    String picard_path # from JSON
    String out_file

    command {
    use BWA
    bwa index ${reference}
    java -jar ${picard_path} CreateSequenceDictionary REFERENCE=${reference} O=${out_file}
    use Samtools
    samtools faidx ${reference}
    }
}

task MakeDirs {
    String output_dir # from JSON
    String sample_name # from csv

    command {
        mkdir -p ${output_dir}/${sample_name}
    }
    output {
        String sample_dir = "${output_dir}/${sample_name}/"
    }
}

task SamToFastq {
    String picard_path # from JSON
    String in_bam # from TSV
    String fq1 = sub(in_bam, "\\.bam$", ".1.fastq") # first end fastq
    String fq2 = sub(in_bam, "\\.bam$", ".2.fastq") # second end fastq

    command {
        java -Xmx12G -jar ${picard_path} SamToFastq INPUT=${in_bam} FASTQ=${fq1} SECOND_END_FASTQ=${fq2}
    }
    output {
        Array[String] fq_set = ["${fq1}", "${fq2}"]
    }
}

task AlignBAM {
    String reference
    String sample_dir
    String sample_name

    command {
        use BWA
        bwa mem -t 8 ${reference} SamtToFastq.fq_set[0] SamToFastq.fq_set[1] > ${sample_dir}${sample_name}.sam # removed read group, can add back later
    }
}

workflow gatk {
    File samples_file
    File reference

    call IndexReference {
        input: reference=reference,
        picard_path=picard_path,
        out_file = sub(reference, "\\.fa*$", ".dict")
    }
    # This scatter block
    scatter(sample in read_tsv(samples_file)) {

        call MakeDirs {
            input: output_dir = output_dir,
            sample_name = sample[0]
        }
        call SamToFastq {
            input: picard_path = picard_path,
            in_bam = sample[1],
        }

        call AlignBAM {
            input: reference = reference,
            sample_dir = MakeDirs.sample_dir,
            sample_name = sample[0]
        }
    }
}

I thought that because 'samples_file' has the same name in my input JSON as in the workflow, WDL would pick up the value from the JSON, but that seems not to be the case. I also thought that having input: in the calls would tell WDL to take those values from the JSON, but I'm missing something here.
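
A note on what the error message suggests: it names the missing inputs as gatk.samples_file and gatk.reference, so Cromwell appears to be looking up each input by its fully qualified name (workflow name, a dot, then the variable name) rather than by the bare variable name. A minimal sketch of an input JSON written that way follows. It assumes picard_path and output_dir are also declared as String inputs at the top of the gatk workflow, which the partial WDL above does not yet do, and it leaves out "overwrite" because nothing in the partial WDL declares it.

```json
{
  "gatk.samples_file": "/cil/shed/sandboxes/amr/dev/gatk_pipeline/data/small/input.tsv",
  "gatk.reference": "/cil/shed/sandboxes/amr/dev/gatk_pipeline/data/small/CneoH99_supercont2.1_200k_300k.fa",
  "gatk.picard_path": "/cil/shed/apps/external/picard/current/bin/picard.jar",
  "gatk.output_dir": "/cil/shed/sandboxes/amr/dev/gatk_pipeline/output/small_test"
}
```

Under that reading, the input: sections in the calls only wire workflow-level values into each task. They do not read from the JSON themselves, so every value passed that way has to exist as a workflow declaration or as another call's output.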
