Should the cromwell-executions folder be this large?

Dear Cromwell Team,

I was running the GATK4 best-practice workflows (gatk4-data-preprocessing and germline-snps-indels) with Docker on my cPouta VM (24 vCPUs and 3 TB of storage), using the NA12878_24_RG_small sample data.
After the workflow finished, I found that my inputs folder is only about 40 GiB, while my cromwell-executions folder is about 500 GiB. That is really huge, and when I tried to run the workflow on the normal NA12878 sample, it reported that there was no space left on the device.

Is it normal to get such a huge executions folder, or did I do something wrong? I checked the executions folder, and it basically copies the inputs for each task call. I think that is the main reason for the huge size.
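To see how much of the folder is real duplication, it can help to compare the apparent size (every file counted) with the size after counting each inode only once, since hard-linked inputs share an inode and take no extra space. The following is a minimal sketch, not part of Cromwell; the function name `dir_usage` is made up for illustration, and the path passed on the command line is assumed to be the cromwell-executions folder.

```python
import os
import stat
import sys


def dir_usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes counts every file; unique_bytes counts each
    (device, inode) pair once, so hard links are not double counted.
    """
    seen = set()
    apparent = 0
    unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other special files
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique


if __name__ == "__main__" and len(sys.argv) > 1:
    total, real = dir_usage(sys.argv[1])
    print(f"apparent: {total / 2**30:.1f} GiB, unique: {real / 2**30:.1f} GiB")
```

If the two numbers are nearly equal, the inputs are probably being copied rather than hard-linked.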

I did notice many warnings like "[warn]: Localization via hard link has failed". Is that why Cromwell copied the inputs again for every task call? How can I fix it?
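One possible cause worth ruling out is that the inputs folder and the cromwell-executions folder sit on different filesystems, because a hard link cannot cross a filesystem boundary. The following is a minimal sketch for checking this, not anything from Cromwell itself; the function names are made up for illustration, and the two directories are whatever paths you pass in.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """True if both paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def can_hard_link(src_dir, dst_dir):
    """Try to hard-link a throwaway file from src_dir into dst_dir."""
    fd, src = tempfile.mkstemp(dir=src_dir)
    os.close(fd)
    dst = os.path.join(dst_dir, os.path.basename(src) + ".linktest")
    try:
        os.link(src, dst)
        os.remove(dst)
        return True
    except OSError:
        # Raised, for example, when the two directories are on different devices.
        return False
    finally:
        os.remove(src)
```

Calling `can_hard_link("/path/to/inputs", "/path/to/cromwell-executions")` with your own paths needs write access to both directories. A `False` result would be consistent with the hard-link warnings, although it does not prove that this is the only cause.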

Best Answer


  • Sorry, a small clarification: inside the executions folder, I think it didn't copy the inputs from the original inputs folder. Instead, it created an inputs folder for every task and generated all the inputs that task needs inside it. Is this normal, or did I forget to define some attribute or parameter? This way, Cromwell generates a huge executions folder full of these intermediate files.

