Automation beyond cromwell

Background: I work in food safety. We type our isolates using MLST and then perform a SNP analysis within each sequence type to determine whether we have related isolates. For the SNP analysis, we also include retrospective isolates of that sequence type that we found previously.

I am looking for a way to automate this process. Right now, it's quite a lot of work to extract all isolates with a certain sequence type, get the appropriate reference, run the wdltool inputs, make sure all settings are correct, and run cromwell. And then once we get new data, we have to do it all over again with one (or more) additional sample(s). Of course, call caching helps with the computational time, but not with the 'manual' time we have to spend setting up each analysis.

Are there any tools available that can help queue up cromwell runs for each MLST type? I imagine re-running a data set with one extra sample is something that is pretty common.

Answers

  • Hi @kshakir,

    Thanks for your reply. I ended up creating a custom program that keeps track of all isolates and their MLST type, and fires off a new cromwell analysis if it encounters a new isolate of a certain type. That way, cromwell keeps track of the jobs that have already been run, as you suggested.
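
    The approach described above could be sketched roughly as follows. This is not the actual program, only a minimal illustration under stated assumptions: the state file layout, the function names, and the `snp_analysis.*` input keys are all hypothetical, and the `submit` callable stands in for whatever mechanism starts a cromwell run.

```python
import json
from pathlib import Path


def load_state(path):
    """Return {sequence_type: [isolate_id, ...]} from a JSON file, or {} if absent."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def find_updated_types(known, isolates):
    """Merge (isolate_id, sequence_type) pairs into `known` in place.

    Returns the set of sequence types that gained at least one new isolate.
    Sequence types are stored as strings so they survive a JSON round trip.
    """
    updated = set()
    for isolate_id, st in isolates:
        members = known.setdefault(str(st), [])
        if isolate_id not in members:
            members.append(isolate_id)
            updated.add(str(st))
    return updated


def build_inputs(st, members, references):
    """Build a workflow inputs dict. The key names are hypothetical and
    would have to match the inputs of the real WDL workflow."""
    return {
        "snp_analysis.sequence_type": st,
        "snp_analysis.samples": sorted(members),
        "snp_analysis.reference": references[st],
    }


def process_batch(state_path, isolates, references, submit):
    """Record new isolates and call `submit(inputs)` once per affected type.

    Each submission contains every isolate of that type, old and new, so
    call caching can skip the steps already run for the older samples.
    Returns the sorted list of sequence types that were submitted.
    """
    known = load_state(state_path)
    submitted = []
    for st in sorted(find_updated_types(known, isolates)):
        submit(build_inputs(st, known[st], references))
        submitted.append(st)
    # Persist the updated isolate registry for the next batch.
    Path(state_path).write_text(json.dumps(known, indent=2))
    return submitted
```

    In this sketch `submit` is left as a parameter on purpose: it could write the inputs to a JSON file and launch cromwell from the command line, or send them to a running cromwell server. A sequence type is only resubmitted when it gains an isolate that has not been seen before, which is the "one extra sample" case from the original question.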
