If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
Creating a ParallelShellJobRunner for Queue
I'm working on trying to get a parallel version of the
ShellJobRunner working in Queue, which would allow us to parallelize some parts of our workflows that are running single core on a full node using the ShellJobRunner and thus are wasting a lot of resources. I thought that I'd made some rather nice progress, until I noticed that if I tried to use it for any job running longer than about 5 minutes the job runner would exit saying that it's job failed, while in reality the job keeps running (so it obviously it did not fail, and Queue doesn't kill it either).
The code I've come up with so far is available here: https://gist.github.com/johandahlberg/a9b7ac61c3aa2c654899 (And as you can see it's mostly stolen from the regular ShellJobRunner, which with some Scala future stuff mixed in)
I'm guessing that the problems comes from me abusing the ProcessController (and admittedly there are warnings in the source for it for not being thread safe), but I'm not sure if there is any way that I can get around it. Any pointers here would be extremely appreciated - also if there is any general interest in this feature I'd be happy to clean up the code a bit and submit a pull request on this upstream.