Overzealous call caching
For our analysis, we use an input file listing all our samples in the format
samplename /path/to/forward_reads.fastq.gz /path/to/reverse_reads.fastq.gz
I wanted to add support for spaces in filenames, which does not currently work. To test this, I made a folder with a space in its name and hardlinked the fastq files into it:
mkdir path\ with\ space
cd path\ with\ space
ln /path/to/*.fastq.gz .
I then updated the input file to include
samplename /path with space/forward_reads.fastq.gz /path with space/reverse_reads.fastq.gz
and ran the analysis with cromwell on this file.
To my surprise, the analysis completed successfully. However, when I look at the cromwell-execution folder, there are no inputs folders for any of the tasks, and when I look in the script file I see the inputs refer to /path/to/forward_reads.fastq.gz instead of /path with space/forward_reads.fastq.gz.
Is this the expected behaviour? I would expect cromwell to notice the sample input file has changed, which should invalidate all cached calls that inherit from that file. How can I make sure that cromwell doesn't re-use the results from an earlier version when I change the input file?
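One possible explanation, offered as a guess rather than a confirmed diagnosis: if the fastq paths reach the tasks as File inputs and the call cache keys those inputs on file content instead of on the path string, then a hardlinked copy under a new path would look identical to the original. The Python sketch below only illustrates that idea. The functions content_key and path_key are hypothetical stand-ins, not Cromwell's actual hashing code, and the file contents are made up.

```python
import hashlib
import os
import shutil
import tempfile


def content_key(path):
    """Hypothetical cache key derived from file contents only (MD5 here)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def path_key(path):
    """Hypothetical cache key derived from the absolute path string only."""
    return hashlib.md5(os.path.abspath(path).encode("utf-8")).hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for /path/to/forward_reads.fastq.gz (dummy contents).
    original = os.path.join(workdir, "forward_reads.fastq.gz")
    with open(original, "wb") as handle:
        handle.write(b"@read1\nACGT\n+\nIIII\n")

    # Stand-in for the folder with a space in its name.
    spaced_dir = os.path.join(workdir, "path with space")
    os.mkdir(spaced_dir)
    linked = os.path.join(spaced_dir, "forward_reads.fastq.gz")
    try:
        os.link(original, linked)  # hardlink, like `ln` in the post
    except OSError:
        # Some filesystems refuse hardlinks; a plain copy has the same bytes.
        shutil.copyfile(original, linked)

    same_content_key = content_key(original) == content_key(linked)
    same_path_key = path_key(original) == path_key(linked)

print("content-based keys match:", same_content_key)
print("path-based keys match:", same_path_key)
```

Under that assumption, the content-based key is the same for both locations while the path-based key differs, which would be consistent with the cached calls being reused. If this is what is happening, setting read_from_cache to false in the workflow options for a run should keep that run from reusing earlier results.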