How to Initiate Scatter Gather on One Machine
The HaplotypeCaller documentation recommends using Queue to parallelize HaplotypeCaller instead of -nct, so I've been attempting to do that. However, I can't get Queue to do any kind of parallel processing. I'm working on a machine with 8 cores, and Queue runs consistently, but only ever single-threaded. I don't have access to a distributed computing environment, but I don't see why Queue wouldn't be able to parallelize on one machine with multiple cores, and I see no documentation indicating that threading by Queue is only available in distributed computing environments.
What I've done is minimally modify the ExampleUnifiedGenotyper.scala script so that it runs HaplotypeCaller. I ran it a couple of times with just the reference file and mapping file as input, and a couple of times with an intervals file listing each of the chromosomes as a separate interval. Every time, it ran single-threaded.
I've found several articles and comments indicating that Queue should be used to Scatter/Gather a job, some of which even explain how Scatter/Gather works, so I assumed that this is simply what Queue does and that it would use multi-core systems to their full advantage. That has not been my experience, and I don't see anything in the documentation to explain why. If someone could explain either how I'm running the command wrong or why Queue can't be used to parallelize on one machine, I would be very grateful.
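
To make concrete what I mean by Scatter/Gather on one machine, here is a minimal conceptual sketch in Python. It is not Queue's API and says nothing about how Queue is implemented. The names `process_interval` and `scatter_gather` are hypothetical, and the per-interval work is a placeholder rather than a real HaplotypeCaller invocation.

```python
from concurrent.futures import ThreadPoolExecutor


def process_interval(interval):
    """Hypothetical stand-in for one scattered job on a single interval."""
    # A real job would launch an external caller process for this interval
    # and wait for it; here we only return a placeholder result.
    return f"{interval}: done"


def scatter_gather(intervals, max_workers=8):
    """Run one job per interval concurrently, then collect the results."""
    # Scatter: hand each interval to a pool of workers. Threads are enough
    # when each worker mostly waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so they line up with intervals.
        results = list(pool.map(process_interval, intervals))
    # Gather: the ordered per-interval results would be merged into one output.
    return results


if __name__ == "__main__":
    print(scatter_gather(["chr1", "chr2", "chr3", "chr4"]))
```

With one interval per chromosome and `max_workers` set to the number of cores, this is the kind of behavior I expected from Queue on an 8-core machine.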