Recursive folder creation when running the "Data Pre-Processing" workflow

Hello,

I tried to run the Data Pre-Processing workflow from the GATK4 Best Practices locally (both the WDL and JSON files were downloaded from Github/gatk-workflows/gatk4-data-processing), but it failed during the "SamToFastqAndBwaMem" task.

Here is a part of the log file indicating the error:

[2018-10-12 11:00:10,28] [warn] Localization via hard link has failed: /ngs/projects/3_GATK4/test_local_data-processing/cromwell-executions/PreProcessingForVariantDiscov
ery_GATK4/f2c86fe5-752b-469b-b891-2dbd5a63852a/call-SamToFastqAndBwaMem/shard-0/inputs/1214441233/hg19.fa.amb -> /ngs/references/genomes/hg19/index/bwa-0.7.7_picard-tool
s-1.110_Samtools-0.1.19/hg19.fa.amb: Invalid cross-device link
[2018-10-12 11:00:10,28] [warn] Localization via hard link has failed: /ngs/projects/3_GATK4/test_local_data-processing/cromwell-executions/PreProcessingForVariantDiscov
ery_GATK4/f2c86fe5-752b-469b-b891-2dbd5a63852a/call-SamToFastqAndBwaMem/shard-1/inputs/1214441233/hg19.fa.amb -> /ngs/references/genomes/hg19/index/bwa-0.7.7_picard-tool
s-1.110_Samtools-0.1.19/hg19.fa.amb: Invalid cross-device link
[2018-10-12 11:00:10,29] [warn] Localization via hard link has failed: /ngs/projects/3_GATK4/test_local_data-processing/cromwell-executions/PreProcessingForVariantDiscov
ery_GATK4/f2c86fe5-752b-469b-b891-2dbd5a63852a/call-SamToFastqAndBwaMem/shard-0/inputs/1153145894/test_local_data-processing -> /ngs/projects/3_GATK4/test_local_data-pro
cessing: Operation not permitted
[2018-10-12 11:00:10,29] [warn] Localization via hard link has failed: /ngs/projects/3_GATK4/test_local_data-processing/cromwell-executions/PreProcessingForVariantDiscov
ery_GATK4/f2c86fe5-752b-469b-b891-2dbd5a63852a/call-SamToFastqAndBwaMem/shard-1/inputs/1153145894/test_local_data-processing -> /ngs/projects/3_GATK4/test_local_data-pro
cessing: Operation not permitted
[2018-10-12 11:00:15,33] [warn] Localization via copy has failed: /ngs/projects/3_GATK4/test_local_data-processing/cromwell-executions/PreProcessingForVariantDiscovery_G
ATK4/f2c86fe5-752b-469b-b891-2dbd5a63852a/call-SamToFastqAndBwaMem/shard-1/inputs/1153145894/test_local_data-processing.tmp/run/cromwell-executions/PreProcessingForVaria
ntDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GA
TK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-8
3f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-
8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/c
all-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastq
AndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shar
d-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/192
2373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tm
p/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-exe
cutions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PrePro
cessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVari
antDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_G
ATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-
83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871
-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/
call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFast
qAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/sha
rd-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598/run.tmp/cromwell-executions/PreProcessingForVariantDiscovery_GATK4/d0914f33-83f1-4596-a871-8ea8555087a9/call-SamToFastqAndBwaMem/shard-0/inputs/1922373598: File name too long

So it appears that Cromwell is recursively creating folders until the path becomes too long.
This is really strange, and I have no idea what could cause this behavior.

I first thought it might be related to file permissions, so I made sure every file was readable.
Then, following an answer found in another topic, I also tried changing the Cromwell localization settings using this configuration:

backend {
  default="Local"
  providers {
    Local {
      config {
        filesystems {
          local {
            localization: [
              "soft-link", "copy", "hard-link"
            ]
            caching {
              duplication-strategy: [
                "soft-link", "copy", "hard-link"
              ]
            }
          }
        }
      }
    }
  }
}

None of my attempts solved the problem, so I would be very grateful if you could help! :)

For information:
- I am running Cromwell 33.1 on a CentOS7 machine.
- The WDL file has been kept unchanged, except that the calls to Docker have been removed, since Docker is not installed and our server has no internet access.
- All reference files are given in the JSON file with absolute paths.

Thank you in advance!

Answers

  • hkeward Member

    It looks like you've specified a directory as an input to the workflow (/ngs/projects/3_GATK4/test_local_data-processing).

    The problem is that this directory is also the one you are launching the workflow from. When Cromwell localizes your inputs, it sees the directory you specified and tries to copy its contents into cromwell-executions. But cromwell-executions is itself inside that directory, because that is where you started the workflow, so the copy recurses into its own destination until the path becomes too long ("File name too long").

    Just try running the workflow from a different directory and it should work.
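
    To catch this before launching, something like the following pre-flight check might help. It is only a sketch, not part of Cromwell: the function name find_risky_inputs is made up for illustration, and it assumes the inputs JSON is a flat object whose values are plain strings or lists of strings. It flags any input that is a directory equal to, or a parent of, the directory you plan to run from.

```python
import json
import os
from pathlib import Path


def find_risky_inputs(inputs_json, run_dir):
    """Return (key, path) pairs whose value is a directory containing run_dir.

    Hypothetical helper: such inputs would make a local copy recurse into
    the cromwell-executions folder created under run_dir.
    """
    run_dir = Path(run_dir).resolve()
    with open(inputs_json) as handle:
        inputs = json.load(handle)

    risky = []
    for key, value in inputs.items():
        # Inputs may be single values or lists of values.
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            # Only absolute local paths are checked; everything else is skipped.
            if not isinstance(candidate, str) or not os.path.isabs(candidate):
                continue
            path = Path(candidate).resolve()
            if path.is_dir() and (path == run_dir or path in run_dir.parents):
                risky.append((key, candidate))
    return risky
```

    Calling find_risky_inputs with the path to your inputs JSON and os.getcwd() from the launch directory should return an empty list when it is safe to start the run; any pair it returns points at an input to change, or a reason to launch from somewhere else.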
