Scatter/Gather utilizing multiple cores in the same node/job
At the Minnesota Supercomputing Institute, our environment requires that jobs on our high-performance clusters reserve an entire node. I have implemented my own Torque Manager/Runner for our environment, based on the Grid Engine Manager/Runner. To get this working, I set nCoresRequest for the scatter/gather method to eight, the minimum our environment requires. My understanding is that for IndelRealigner, for example, each job reserves a node with eight cores but uses only one. That means our users' compute time allocations are consumed eight times faster than necessary.
What I am wondering is whether I am missing an option that lets several scatter/gather requests be grouped into a single job submission. If I were writing this as a PBS script for our environment and wanted to use 16 cores in a scatter/gather implementation, I would write two jobs, each with eight commands. Each job would look something like the following:
    #PBS Job Configuration stuff
    pbsdsh -n 0 java -jar ... &
    pbsdsh -n 1 java -jar ... &
    pbsdsh -n 2 java -jar ... &
    pbsdsh -n 3 java -jar ... &
    pbsdsh -n 4 java -jar ... &
    pbsdsh -n 5 java -jar ... &
    pbsdsh -n 6 java -jar ... &
    pbsdsh -n 7 java -jar ... &
    wait
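To make the grouping concrete, here is a rough sketch of how such scripts could be generated by hand, outside of Queue. It is not a Queue feature. The function name make_grouped_jobs, the job_N.pbs file names, and the input format (a text file with one scatter command per line) are all made up for illustration, and the `#PBS -l nodes=1:ppn=...` line is only a guess at what the job configuration would look like on a given cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Queue or Torque): pack scatter commands
# into PBS job scripts, one script per node, up to CORES commands per script.
# Usage: make_grouped_jobs COMMANDS_FILE OUT_DIR [CORES]
make_grouped_jobs() {
    local commands_file=$1 out_dir=$2 cores=${3:-8}
    local slot=0 job=0 script="" cmd
    mkdir -p "$out_dir"
    while IFS= read -r cmd || [ -n "$cmd" ]; do
        [ -z "$cmd" ] && continue              # skip blank lines
        if [ "$slot" -eq 0 ]; then             # start a new job script
            job=$((job + 1))
            script="$out_dir/job_$job.pbs"
            {
                echo '#!/bin/bash'
                # Assumed resource request; adjust to the local configuration.
                echo "#PBS -l nodes=1:ppn=$cores"
            } > "$script"
        fi
        # One command per slot, run in the background as in the script above.
        echo "pbsdsh -n $slot $cmd &" >> "$script"
        slot=$((slot + 1))
        if [ "$slot" -eq "$cores" ]; then      # script is full: close it
            echo 'wait' >> "$script"
            slot=0
        fi
    done < "$commands_file"
    # Close the last script if it was only partly filled.
    if [ "$slot" -ne 0 ]; then
        echo 'wait' >> "$script"
    fi
    echo "$job"                                # number of scripts written
}
```

With 16 commands in the input file and the default of eight per script, this writes two job scripts, which could then be submitted with qsub. The part I do not know how to do is make Queue's job runner perform this grouping itself.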
Has anyone done something similar in Queue? Any pointers? Thanks in advance!