Problem in a task with piped command

Dear all in the WDL team,
I am trying to build a workflow of three tasks, as recommended in the GATK Best Practices, that goes from FASTQ files to an aligned BAM file. I ran the steps separately on the command line first and they worked, but when I combine them into one WDL script I keep getting an error from the last step (the piped one). I suspect the cause is that I don't know how to use /dev/stdout and /dev/stdin correctly.
The error message is: "Could not process output, file not found: /home/projects/cu_10111/data/Test/TOY_piped.bam
java.lang.RuntimeException: Could not process output, file not found: /home/projects/cu_10111/data/Test/TOY_piped.bam"
The strange thing is that the missing file (TOY_piped.bam) should be the final output, so I don't understand why it is expected to exist already. I tried adding an options file to work around this, but it did not help.
The WDL script, the inputs.json, and the options.json are as follows (respectively):
script.wdl
workflow FromFastqToVCF {
  File FASTQ1
  File FASTQ2
  String SAMPLENAME
  File REFFASTA
  File REFINDEX
  File REFDICT

  call FastqToSam {
    input:
      FastqR1=FASTQ1,
      FastqR2=FASTQ2,
      SampleName=SAMPLENAME
  }
  call MarkIlluminaAdapters {
    input:
      SampleName=SAMPLENAME,
      uBAM=FastqToSam.uBAM
  }
  call AllignedBAM {
    input:
      mBAM=MarkIlluminaAdapters.mBAM,
      refFasta=REFFASTA,
      uBAM=FastqToSam.uBAM,
      SampleName=SAMPLENAME
  }
}

task FastqToSam {
  File FastqR1
  File FastqR2
  String SampleName
  command {
    gatk FastqToSam \
      --FASTQ "${FastqR1}" \
      --FASTQ2 "${FastqR2}" \
      --OUTPUT "/home/projects/cu_10111/data/Test/${SampleName}_fastqtosam.bam" \
      --SAMPLE_NAME "${SampleName}"
  }
  output {
    File uBAM = "/home/projects/cu_10111/data/Test/${SampleName}_fastqtosam.bam"
  }
}

task MarkIlluminaAdapters {
  File uBAM
  String SampleName
  command {
    gatk MarkIlluminaAdapters \
      --INPUT "${uBAM}" \
      --METRICS "/home/projects/cu_10111/data/Test/${SampleName}_markilluminaadapters_metrics.txt" \
      --OUTPUT "/home/projects/cu_10111/data/Test/${SampleName}_markilluminaadapters.bam"
  }
  output {
    File mBAM = "/home/projects/cu_10111/data/Test/${SampleName}_markilluminaadapters.bam"
  }
}

task AllignedBAM {
  File mBAM
  String SampleName
  File refFasta
  File refIndex
  File refDict
  File uBAM
  command {
    set -o pipefail
    gatk SamToFastq \
      --INPUT ${mBAM} \
      --FASTQ /dev/stdout \
      --CLIPPING_ATTRIBUTE XT --CLIPPING_ACTION 2 --INTERLEAVE true --INCLUDE_NON_PF_READS true \
      --TMP_DIR /home/projects/cu_10111/data/Test/temp
    | \
    bwa mem -M -t 31 -p ${refFasta} /dev/stdin \
    | \
    gatk MergeBamAlignment \
      --REFERENCE_SEQUENCE ${refFasta} \
      --UNMAPPED_BAM ${uBAM} \
      --ALIGNED_BAM /dev/stdin \
      --CREATE_INDEX true --ADD_MATE_CIGAR true --CLIP_ADAPTERS false --CLIP_OVERLAPPING_READS true \
      --INCLUDE_SECONDARY_ALIGNMENTS true --MAX_INSERTIONS_OR_DELETIONS -1 --PRIMARY_ALIGNMENT_STRATEGY MostDistant \
      --ATTRIBUTES_TO_RETAIN XS \
      --OUTPUT /home/projects/cu_10111/data/Test/${SampleName}_piped.bam
      --TMP_DIR /home/projects/cu_10111/data/Test/temp
  }
  output {
    File BAM = "/home/projects/cu_10111/data/Test/${SampleName}_piped.bam"
  }
}

inputs.json
{
  "FromFastqToVCF.AllignedBAM.refDict": "/home/projects/cu_10111/data/Test/Homo_sapiens_assembly19.dict",
  "FromFastqToVCF.AllignedBAM.refIndex": "/home/projects/cu_10111/data/Test/Homo_sapiens_assembly19.fasta.fai",
  "FromFastqToVCF.SAMPLENAME": "TOY",
  "FromFastqToVCF.REFINDEX": "/home/projects/cu_10111/data/Test/Homo_sapiens_assembly19.fasta.fai",
  "FromFastqToVCF.FASTQ2": "/home/projects/cu_10111/data/Test/TOY_S1_L001_R2_001.fastq.gz",
  "FromFastqToVCF.REFDICT": "/home/projects/cu_10111/data/Test/Homo_sapiens_assembly19.dict",
  "FromFastqToVCF.FASTQ1": "/home/projects/cu_10111/data/Test/TOY_S1_L001_R1_001.fastq.gz",
  "FromFastqToVCF.REFFASTA": "/home/projects/cu_10111/data/Test/Homo_sapiens_assembly19.fasta"
}

options.json
{
  "default_runtime_attributes": {
    "continueOnReturnCode": true
  },
  "workflow_failure_mode": "ContinueWhilePossible",
  "write_to_cache": true,
  "read_from_cache": true
}

I would appreciate any help, especially since I am not a Linux person and this is all new to me.
Best
Nawar

Answers

  • Thank you Chris. What you said was definitely part of the problem. The other part was how the bwa tool is used in WDL. I realized that I should have declared all the relevant index files (5 files), which one doesn't usually specify explicitly.
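
For readers who land on this thread, here is a small shell sketch of two things worth checking. In the AllignedBAM command as posted, the `--TMP_DIR` line of the SamToFastq stage and the `--OUTPUT` line of the MergeBamAlignment stage have no trailing backslash. The shell therefore ends the command at those lines, and the next line of the pipeline begins with `|`, which bash rejects as a syntax error, so the final BAM is most likely never written. Separately, `bwa mem` looks for the five files produced by `bwa index` (.amb, .ann, .bwt, .pac, .sa) next to the reference FASTA, so in WDL they presumably have to be declared as File inputs in order to be localized alongside it. The helper names (`missing_bwa_index`, `parses_ok`) and the temporary paths are hypothetical and only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and paths are hypothetical, not part of GATK, bwa, or Cromwell.

# Print the path of every bwa index file that is absent next to a reference FASTA.
# `bwa index ref.fasta` writes ref.fasta.{amb,ann,bwt,pac,sa}.
missing_bwa_index() {
    local ref="$1" ext
    for ext in amb ann bwt pac sa; do
        [ -e "${ref}.${ext}" ] || echo "${ref}.${ext}"
    done
}

# Return 0 if a snippet is syntactically valid bash, non-zero otherwise.
# `bash -n` only parses its input; nothing in the snippet is executed.
parses_ok() {
    printf '%s\n' "$1" | bash -n 2>/dev/null
}

# Without a trailing backslash the first line is a complete command,
# so the second line starts with `|` and cannot be parsed.
broken='echo reads
| sort'

# With the backslash the two lines form one pipeline.
fixed='echo reads \
| sort'

# Small demonstration in a scratch directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/ref.fasta" "$demo_dir/ref.fasta.amb" "$demo_dir/ref.fasta.ann"
echo "Missing index files:"
missing_bwa_index "$demo_dir/ref.fasta"

if parses_ok "$broken"; then echo "broken snippet parses"; else echo "broken snippet: syntax error"; fi
if parses_ok "$fixed"; then echo "fixed snippet parses"; else echo "fixed snippet: syntax error"; fi
rm -rf "$demo_dir"
```

Calling something like `missing_bwa_index "${refFasta}"` at the top of the task's command block would show whether the index files actually ended up next to the reference inside the execution directory.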