Scatter/Gather utilizing multiple cores in the same node/job

jesseerdmann Posts: 2 Member
edited April 2013 in Ask the GATK team

At the Minnesota Supercomputing Institute, our environment requires that jobs on our high performance clusters reserve an entire node. I have implemented my own Torque Manager/Runner for our environment, based on the Grid Engine Manager/Runner. To get this working, I set nCoresRequest for the scatter/gather functions to eight, the minimum our environment requires. My understanding is that for IndelRealigner, for example, each job reserves a node with eight cores but only uses one. That means our users would have their compute time allocation consumed eight times faster than necessary.

Are there options I am missing that would let some number of the scatter/gather requests be grouped into a single job submission? If I were writing this as a PBS script for our environment and wanted to use 16 cores in a scatter/gather implementation, I would write two jobs, each with eight commands. They would look something like the following:

#PBS Job Configuration stuff
pbsdsh -n 0 java -jar ... &
pbsdsh -n 1 java -jar ... &
pbsdsh -n 2 java -jar ... &
pbsdsh -n 3 java -jar ... &
pbsdsh -n 4 java -jar ... &
pbsdsh -n 5 java -jar ... &
pbsdsh -n 6 java -jar ... &
pbsdsh -n 7 java -jar ... &
wait
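
To make the grouping idea concrete, here is a minimal sketch of how a list of scattered commands could be split into PBS scripts shaped like the one above. It is not a Queue feature: `write_batches`, the file prefix, and the batch size are all hypothetical, and the `#PBS` header line is left as the same placeholder used above.

```shell
# Hypothetical helper, not part of Queue or Torque.
# Reads one command per line on stdin and writes PBS scripts that each
# launch at most PER_JOB commands via pbsdsh, followed by a final wait.
# Usage: write_batches PER_JOB PREFIX < commands
# Prints the number of scripts written (PREFIX_0.pbs, PREFIX_1.pbs, ...).
write_batches() {
  per_job=$1
  prefix=$2
  slot=0      # position of the next command inside the current script
  index=0     # number of completed scripts so far
  out=""
  while IFS= read -r cmd || [ -n "$cmd" ]; do
    # Skip blank lines in the command list.
    if [ -z "$cmd" ]; then
      continue
    fi
    # Start a new script when the current batch is empty.
    if [ "$slot" -eq 0 ]; then
      out="${prefix}_${index}.pbs"
      printf '%s\n' "#PBS Job Configuration stuff" > "$out"
    fi
    printf '%s\n' "pbsdsh -n ${slot} ${cmd} &" >> "$out"
    slot=$((slot + 1))
    # Close the script once it holds PER_JOB commands.
    if [ "$slot" -eq "$per_job" ]; then
      printf '%s\n' "wait" >> "$out"
      slot=0
      index=$((index + 1))
    fi
  done
  # Close a final, partially filled script if there is one.
  if [ "$slot" -gt 0 ]; then
    printf '%s\n' "wait" >> "$out"
    index=$((index + 1))
  fi
  printf '%s\n' "$index"
}
```

With a placeholder command list `commands.txt`, running `write_batches 8 realign < commands.txt` would write `realign_0.pbs`, `realign_1.pbs`, and so on, each of which could then be submitted with `qsub`.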

Has anyone done something similar in Queue? Any pointers? Thanks in advance!

Post edited by Geraldine_VdAuwera

Answers

  • Geraldine_VdAuwera Posts: 6,467 Administrator, GATK Developer admin

    Hi there,

    Unfortunately, for tools like IndelRealigner that cannot be multi-threaded, there is nothing we can do for you, since the nCoresRequest parameter is a property of your cluster software. There is no built-in option to do what you want within Queue. Perhaps other users may suggest a workaround...

    Geraldine Van der Auwera, PhD

  • ryanabashbash Texas A&M University Posts: 9 Member

    I seem to have run into something similar when trying to run a QScript on a cluster. When writing the QScript, I tested it on a single machine that was running SGE, and it happily ran many Scatter/Gather jobs on the single node in parallel. However, once I moved over to the cluster (running UGE), it's happy to run multiple BWA jobs from the QScript on a single node, but it only launches one GATK job (e.g. IndelRealigner) on a single node.

    Here is the class I use for BWA alignment that has no problem launching multiple instances on a single node in the cluster.

    class BWAalignment(BWApath: File, BWAnumThreads: Int, readGroup: String, BWAindex: File, input: File, output: File) extends CommandLineFunction {
       @Input(doc="Input FASTQ")
       var inputFastq: File = input
       @Output(doc="Output SAM")
       var outputSAM: File = output
       this.jobName = "BWA_alignment"
       this.analysisName = "BWA_alignment"
       this.nCoresRequest = BWAnumThreads
       this.residentRequest = 6000
       this.jobEnvironmentNames = List("mpi " + BWAnumThreads)  //get smp_pe not valid errors without this.
    
       def commandLine = {
          BWApath + " mem " + " -t " + BWAnumThreads + " -R \"" + readGroup + "\" " + BWAindex + " " + inputFastq + " > " + outputSAM
       }
    }
    

    This is what I use a little later on in the script to define the IndelRealigner, but only one job ever launches on a single node.

    var indelRealignment = new IndelRealigner
    indelRealignment.reference_sequence = referenceFile
    indelRealignment.input_file = List(PicardOutput)
    indelRealignment.targetIntervals = targetCreation.out
    indelRealignment.out = swapExt(PicardOutput, "bam", "realigned.bam")
    indelRealignment.scatterCount = GATKnumScatter
    indelRealignment.jobEnvironmentNames = List("mpi " + "1")  //get smp_pe not valid errors without this.
    

    Given that the CommandLineFunction is able to launch multiple instances on a single node (e.g. 4 instances of BWA, each consuming 4 processors on a 16 processor node), is there a way to launch multiple IndelRealigners on a single node (e.g. 16 instances of IndelRealigner on a 16 processor node)? It seems I'm fundamentally misunderstanding something since things worked fine on the single machine running SGE.
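
The packing arithmetic behind this question can be sketched as follows. This is only an illustration of the node counts involved, not something Queue or the scheduler exposes; `nodes_needed` and its parameters are hypothetical names.

```shell
# Hypothetical helper: how many whole nodes are needed to run N_JOBS
# instances when each instance uses CORES_PER_JOB cores and every node
# offers CORES_PER_NODE cores.
# Usage: nodes_needed N_JOBS CORES_PER_NODE CORES_PER_JOB
nodes_needed() {
  n_jobs=$1
  cores_per_node=$2
  cores_per_job=$3
  # Number of instances that fit side by side on one node.
  per_node=$((cores_per_node / cores_per_job))
  if [ "$per_node" -lt 1 ]; then
    echo "a single job needs more cores than a node provides" >&2
    return 1
  fi
  # Ceiling division: a partially filled node still counts as a node.
  echo $(( (n_jobs + per_node - 1) / per_node ))
}
```

For the cases described above, `nodes_needed 4 16 4` covers four 4-thread BWA instances on a 16-processor node, and `nodes_needed 16 16 1` covers sixteen single-core IndelRealigner pieces packed onto one node; passing 16 as the per-job core count instead models the observed behaviour of one GATK job per node.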
