At the Minnesota Supercomputing Institute, our environment requires that jobs on our high-performance clusters reserve an entire node. I have implemented my own Torque Manager/Runner for our environment, based on the Grid Engine Manager/Runner. To get this working, I set nCoresRequest for the scatter/gather method to the required minimum of eight. My understanding is that for the IndelRealigner, for example, each job reserves a node with eight cores but only uses one. That means our users would have their compute time allocation consumed eight times faster than necessary.
Am I missing an option that lets some number of the scatter/gather requests be grouped into a single job submission? If I were writing this as a PBS script for our environment and wanted to use 16 cores in a scatter/gather implementation, I would write two jobs, each with eight commands. Each job would look something like the following:
    #PBS Job Configuration stuff
    pbsdsh -n 0 java -jar ... &
    pbsdsh -n 1 java -jar ... &
    pbsdsh -n 2 java -jar ... &
    pbsdsh -n 3 java -jar ... &
    pbsdsh -n 4 java -jar ... &
    pbsdsh -n 5 java -jar ... &
    pbsdsh -n 6 java -jar ... &
    pbsdsh -n 7 java -jar ... &
    wait
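To make the grouping concrete, here is a rough sketch of what I mean. It is only an illustration of the batching idea, not something Queue does as far as I know. The function name `make_batches`, the input file of one command per line, and the `batch_N.pbs` output names are all hypothetical, and the job configuration line is left as a placeholder because it is site-specific.

```shell
#!/bin/bash
# Sketch only: split a list of commands into PBS job scripts, each holding
# up to CORES commands launched in parallel with pbsdsh.
# make_batches and the batch_N.pbs naming are hypothetical, for illustration.

make_batches() {
    local cmd_file=$1     # input: one scatter command per line
    local out_dir=$2      # directory for the generated job scripts
    local cores=${3:-8}   # commands per job (cores per reserved node)
    local slot=0 batch=0 script="" cmd

    while IFS= read -r cmd || [ -n "$cmd" ]; do
        [ -z "$cmd" ] && continue   # skip blank lines
        if [ "$slot" -eq 0 ]; then
            # Start a new job script for the next group of commands.
            script="$out_dir/batch_$batch.pbs"
            printf '#!/bin/bash\n#PBS Job Configuration stuff\n' > "$script"
        fi
        # Run each command on its own slot, in the background.
        printf 'pbsdsh -n %s %s &\n' "$slot" "$cmd" >> "$script"
        slot=$((slot + 1))
        if [ "$slot" -eq "$cores" ]; then
            # Group is full: wait for all background commands, then move on.
            echo 'wait' >> "$script"
            slot=0
            batch=$((batch + 1))
        fi
    done < "$cmd_file"

    # Close a final, partially filled job script.
    if [ "$slot" -ne 0 ]; then
        echo 'wait' >> "$script"
    fi
}
```

With 16 commands in the input file and a group size of eight, `make_batches commands.txt jobs 8` would write `jobs/batch_0.pbs` and `jobs/batch_1.pbs`, each of which could then be submitted with `qsub`. What I don't know is where this kind of grouping would hook into Queue's job submission.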
Has anyone done something similar in Queue? Any pointers? Thanks in advance!