Queue with Grid Engine

Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin
edited February 2014 in Pipelining with Queue

1. Background

Thanks to contributions from the community, Queue contains a job runner compatible with Grid Engine 6.2u5.

As of July 2011 this is the currently known list of forked distributions of Sun's Grid Engine 6.2u5. As long as they are JDRMAA 1.0 source compatible with Grid Engine 6.2u5, the compiled Queue code should run against each of these distributions. However, we have yet to receive confirmation that Queue works on any of these setups.

Our internal QScript integration tests run the same tests on both LSF 7.0.6 and a Grid Engine 6.2u5 cluster set up on older software released by Sun.

If you run into trouble, please let us know. If you would like to contribute additions or bug fixes, please create a fork in our GitHub repo where we can review and pull in the patch.

2. Running Queue with GridEngine

Try out the Hello World example with -jobRunner GridEngine.

java -jar dist/Queue.jar -S public/scala/qscript/examples/HelloWorld.scala -jobRunner GridEngine -run

If all goes well, Queue should dispatch the job to Grid Engine and wait until the status returns RunningStatus.DONE, and "hello world" should be echoed into the output file, possibly with other Grid Engine log messages.

See QFunction and Command Line Options for more info on Queue options.

3. Debugging issues with Queue and GridEngine

If you run into an error with Queue submitting jobs to GridEngine, first try submitting the HelloWorld example with -memLimit 2:

java -jar dist/Queue.jar -S public/scala/qscript/examples/HelloWorld.scala -jobRunner GridEngine -run -memLimit 2

Then try the following GridEngine qsub commands. They are based on what Queue submits via the API when running the HelloWorld.scala example with and without memory reservations and limits:

qsub -w e -V -b y -N echo_hello_world \
  -o test.out -wd $PWD -j y echo hello world

qsub -w e -V -b y -N echo_hello_world \
  -o test.out -wd $PWD -j y \
  -l mem_free=2048M -l h_rss=2458M echo hello world

One other thing to check is whether there is a memory limit on your cluster. For example, try submitting jobs that request up to 16G:

qsub -w e -V -b y -N echo_hello_world \
  -o test.out -wd $PWD -j y \
  -l mem_free=4096M -l h_rss=4915M echo hello world

qsub -w e -V -b y -N echo_hello_world \
  -o test.out -wd $PWD -j y \
  -l mem_free=8192M -l h_rss=9830M echo hello world

qsub -w e -V -b y -N echo_hello_world \
  -o test.out -wd $PWD -j y \
  -l mem_free=16384M -l h_rss=19661M echo hello world

If the above tests pass and GridEngine still will not dispatch jobs submitted by Queue, please report the issue to our support forum.
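The memory tests above follow a pattern that can be scripted. The sketch below only prints the same style of qsub command for a series of memory sizes; it does not submit anything. It assumes that h_rss is roughly 1.2 times mem_free, which is inferred from the values in the commands above rather than stated in this post. `h_rss_for` and `print_qsub_test` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers: print the qsub memory-test commands for several sizes.
# Assumption: h_rss is roughly 1.2 x mem_free (inferred from the examples above).

# Round 1.2 * MB to the nearest integer using integer arithmetic only.
h_rss_for() {
  echo $(( ($1 * 12 + 5) / 10 ))
}

# Print one test command; remove the leading "echo" to actually submit it.
print_qsub_test() {
  mem_mb=$1
  echo qsub -w e -V -b y -N echo_hello_world \
    -o test.out -wd "$PWD" -j y \
    -l "mem_free=${mem_mb}M" -l "h_rss=$(h_rss_for "$mem_mb")M" echo hello world
}

# Dry run over the sizes used in this post.
for mb in 2048 4096 8192 16384; do
  print_qsub_test "$mb"
done
```

Running the commands from smallest to largest should show at which request size, if any, the scheduler starts rejecting jobs.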

Geraldine Van der Auwera, PhD


  • delagoya Member Posts: 1

    You should use h_vmem instead of or along with mem_free for the qsub submission examples above. mem_free only checks memory usage at the time of first entering running status, which is OK for short-lived processes, but not for long-lived ones, where memory usage can grow over time.

    E.g. qsub -l h_vmem=16G,mem_free=16G ...

  • yfarjoun Broad Institute, Dev Posts: 55
    edited May 2013

    You mean to use public/scala/qscript/org/broadinstitute/sting/queue/qscripts/examples/HelloWorld.scala

  • redzengenoist Member Posts: 27
    edited February 2014

    Hello there,

    I've got an issue running scatter-gather on gridengine 6.2u5, redhat.

    When I first ran it, it reported missing, so I did a cluster-wide search and found the admins' version. That meant that I could finally run basic hello world scripts, such as the one below:

    `$    java$temp \
           -jar $queu -jobRunner GridEngine \
           -S $home/QUEUETools/newest/resources/ExampleUnifiedGenotyper.scala \
           -R $home/QUEUETools/newest/resources/exampleFASTA.fasta \
           -I $home/QUEUETools/newest/resources/exampleBAM.bam -run`
    INFO  18:31:07,505 QScriptManager - Compiling 1 QScript
    INFO  18:31:13,574 QScriptManager - Compilation complete
    INFO  18:31:13,697 HelpFormatter - ----------------------------------------------------------------------
    INFO  18:31:13,697 HelpFormatter - Queue v2.7-2-g6bda569, Compiled 2013/08/28 16:33:34
    INFO  18:31:13,697 HelpFormatter - Copyright (c) 2012 The Broad Institute
    INFO  18:31:13,697 HelpFormatter - For support and documentation go to
    INFO  18:31:13,698 HelpFormatter - Program Args: -jobRunner GridEngine -S /xxx/QUEUETools/newest/resources/ExampleUnifiedGenotyper.scala -R /xxx/QUEUETools/newest/resources/exampleFASTA.fasta -I /xxx/QUEUETools/newest/resources/exampleBAM.bam -run
    INFO  18:31:13,698 HelpFormatter - Date/Time: 2014/02/03 18:31:13
    INFO  18:31:13,698 HelpFormatter - ----------------------------------------------------------------------
    INFO  18:31:13,699 HelpFormatter - ----------------------------------------------------------------------
    INFO  18:31:13,708 QCommandLine - Scripting ExampleUnifiedGenotyper
    INFO  18:31:13,844 QCommandLine - Added 2 functions
    INFO  18:31:13,844 QGraph - Generating graph.
    INFO  18:31:13,872 QGraph - Generating scatter gather jobs.
    INFO  18:31:13,903 QGraph - Removing original jobs.
    INFO  18:31:13,907 QGraph - Adding scatter gather jobs.
    INFO  18:31:14,688 QGraph - Regenerating graph.
    INFO  18:31:14,706 QGraph - Running jobs.
    INFO  18:31:15,322 QGraph - 0 Pend, 0 Run, 0 Fail, 7 Done
    INFO  18:31:16,379 QCommandLine - Writing final jobs report...
    INFO  18:31:16,380 QJobsReporter - Writing JobLogging GATKReport to file /xxx/QUEUETools/Queue_2.7.2/resources/ExampleUnifiedGenotyper.jobreport.txt
    INFO  18:31:16,635 QJobsReporter - Plotting JobLogging GATKReport to file /xxx/QUEUETools/Queue_2.7.2/resources/ExampleUnifiedGenotyper.jobreport.pdf
    WARN  18:31:16,648 RScriptExecutor - Skipping: Rscript (resource)org/broadinstitute/sting/queue/util/queueJobReport.R /xxx/QUEUETools/Queue_2.7.2/resources/ExampleUnifiedGenotyper.jobreport.txt /xxx/QUEUETools/Queue_2.7.2/resources/ExampleUnifiedGenotyper.jobreport.pdf
    INFO  18:31:16,655 QCommandLine - Script completed successfully with 7 total jobs

    So, that's fine.

    However, when I try to run basically the same script on actual BAM files, I get this error:

    `$       java$temp \
           -jar $queu -jobRunner GridEngine \
           -S $home/QUEUETools/newest/resources/ExampleUnifiedGenotyper.scala \
           -R $dxfa \
           -I $gatr/bamlists/currentrecalbams.test2.list -run`  
    INFO  18:36:22,307 QGraph - Generating scatter gather jobs.
    INFO  18:36:22,338 QGraph - Removing original jobs.
    INFO  18:36:22,341 QGraph - Adding scatter gather jobs.
    INFO  18:36:23,164 QGraph - Regenerating graph.
    INFO  18:36:23,200 QGraph - Running jobs.
    INFO  18:36:27,499 FunctionEdge - Starting: LocusScatterFunction: List(/share/XFS0016/gata/bamlists/currentrecalbams.test2.list, /ifshk7/ST_PG/PMO/SZY11098/indx/GATKh19bundle/ucsc.hg19.fasta) > List(/ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_1_of_3/scatter.intervals, /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_2_of_3/scatter.intervals, /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_3_of_3/scatter.intervals)
    INFO  18:36:27,499 FunctionEdge - Output written to /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/scatter/scatter.out
    INFO  18:36:28,067 QGraph - 6 Pend, 1 Run, 0 Fail, 0 Done
    INFO  18:36:58,383 FunctionEdge - Done: LocusScatterFunction: List(/share/XFS0016/gata/bamlists/currentrecalbams.test2.list, /ifshk7/ST_PG/PMO/SZY11098/indx/GATKh19bundle/ucsc.hg19.fasta) > List(/ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_1_of_3/scatter.intervals, /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_2_of_3/scatter.intervals, /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_3_of_3/scatter.intervals)
    INFO  18:36:58,387 QGraph - Writing incremental jobs reports...
    INFO  18:36:58,388 QJobsReporter - Writing JobLogging GATKReport to file /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/ExampleUnifiedGenotyper.jobreport.txt
    INFO  18:36:58,610 FunctionEdge - Starting:  'java'  '-Xmx2048m'  '-XX:+UseParallelOldGC'  '-XX:ParallelGCThreads=4'  '-XX:GCTimeLimit=50'  '-XX:GCHeapFreeLimit=10'  ''  '-cp' '/xxx/QUEUETools/newest/Queue.jar'  'org.broadinstitute.sting.gatk.CommandLineGATK'  '-T' 'UnifiedGenotyper'  '-I' '/share/XFS0016/gata/bamlists/currentrecalbams.test2.list'  '-L' '/ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_1_of_3/scatter.intervals'  '-R' '/ifshk7/ST_PG/PMO/SZY11098/indx/GATKh19bundle/ucsc.hg19.fasta'  '-o' '/ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_1_of_3/currentrecalbams.test2.listunfiltered.vcf'
    INFO  18:36:58,611 FunctionEdge - Output written to /ifshk5/PC_HUMAN_AP/PMO/SZY11098_HUMbjjR/QUEUETools/Queue_2.7.2/resources/.queue/scatterGather/ExampleUnifiedGenotyper-1-sg/temp_1_of_3/currentrecalbams.test2.listunfiltered.vcf.out
    **ERROR** 18:36:58,890 Retry - Caught error during attempt 1 of 4.
    org.broadinstitute.sting.queue.QException: Unable to submit job: error: no suitable queues

    I know what value to enter in the queue field: the default queue test-command gives the same error:

                `qsub -w e -V -b y -N echo_hello_world -l vf=4G -o test.out -wd $PWD -j y echo hello world`
                Unable to run job: error: no suitable queues.

    Which I can thusly correct:

       `qsub -w e -V -b y -N echo_hello_world -l vf=5G -q st.q -P st_pg vf=4G -o test.out -cwd -j y echo hello world`
       Your job 990540 ("echo_hello_world") has been submitted

    My question is, how do I edit the default parameters of drmaa / queue, to use my desired -q parameter? I can't edit .so files, it seems.

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    Hi there,

    We don't work with DRMAA so I can't help you, but perhaps one of our resident superusers such as @pdexheimer or @Johan_Dahlberg will be able to jump in with an answer.

    Geraldine Van der Auwera, PhD

  • pdexheimer Member, Dev Posts: 543 ✭✭✭✭

    There's a global -jobQueue argument (i.e., java -jar Queue.jar -S script.scala -jobQueue st.q …), but it looks like the DRMAA runner never uses it. Unfortunately, I don't know anything about DRMAA either, so I don't know exactly how to make the fix.

  • thibault Broad Institute Member, Broadie, Dev Posts: 36 ✭✭

    As a workaround you can try Queue's --jobNative argument (or the equivalent QFunction property .jobNativeArgs) to pass arguments directly to DRMAA.

    Joel Thibault ~ Software Engineer ~ GSA ~ Broad Institute

  • Johan_Dahlberg Member Posts: 96 ✭✭✭

    Yes. I can second @thibault's solution. However, whether it will work depends on the DRMAA specification, since they seem to handle the jobNative arguments quite differently.

  • redzengenoist Member Posts: 27

    That sounds very promising, actually.

    I've narrowed it down, such that I actually will not need the qsub -q argument, all that I need is a -P argument (qsub -P st_pg).

    However, I'm not sure about the syntax for the native argument. When I write it like this:

    java$temp -jar $queu -jobRunner GridEngine -S ExampleCountReads.scala -R exampleFASTA.fasta -I exampleBAM.bam -jobNative -P st_pg -memLimit 2 -run

    I get this:

    INFO  14:20:11,924 QScriptManager - Compiling 1 QScript
    INFO  14:20:17,726 QScriptManager - Compilation complete
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR stack trace
    Argument with name 'P' isn't defined.
            at org.broadinstitute.sting.commandline.ParsingEngine.validate(
            at org.broadinstitute.sting.commandline.ParsingEngine.validate(
            at org.broadinstitute.sting.commandline.CommandLineProgram.start(
            at org.broadinstitute.sting.commandline.CommandLineProgram.start(
            at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
            at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
    ##### ERROR
    ##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ##### ERROR Visit our website and forum for extensive documentation and answers to
    ##### ERROR commonly asked questions
    ##### ERROR
    ##### ERROR MESSAGE: Argument with name 'P' isn't defined.
    ##### ERROR ------------------------------------------------------------------------------------------

    Can anybody guess what the argument format is supposed to be?

  • redzengenoist Member Posts: 27
    edited February 2014

    Ah - I continued to play with it, and I just had to format the argument as a string:

    -jobNative "-P st_pg -l vf=6G etc etc etc"

    Thanks to @thibault and @Johan_Dahlberg, you guys are brilliant. I've got queue working and the pertinent jobs submitted.
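The difference between the failing and the working invocation comes down to how the shell splits words before the program ever sees them. A minimal sketch, using a hypothetical `show_args` function as a stand-in for Queue: unquoted, -P and st_pg arrive as separate arguments, so an argument parser will try to interpret -P as one of its own options; quoted, the whole string arrives as the single value following -jobNative.

```shell
#!/bin/sh
# Hypothetical stand-in for a program: print each argument it receives on its own line.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Unquoted: the shell passes -P and st_pg as separate arguments (three in total).
show_args -jobNative -P st_pg

# Quoted: everything inside the quotes is a single argument (two in total).
show_args -jobNative "-P st_pg -l vf=6G"
```

The first call prints three bracketed lines and the second prints two, with the entire native specification inside one pair of brackets.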

  • DavidRies Member Posts: 15

    queue in general works fine for me on the GridEngine. There is a little performance tweak I would like to suggest.
    At the moment, the GridEngineJobRunner.scala forces "the remote environment to inherit local environment settings".
    That might be a good idea in general, to make sure the jobs get all they need, but with hundreds of clustered jobs, this unnecessarily slows down the system.
    I'm not much of a scala programmer (yet), so I don't see a way to turn the -V flag off, other than doing it manually in the source code and recompiling the whole thing.
    A nice thing would be the possibility to set the inheritance to false.
    Maybe @pdexheimer or @Johan_Dahlberg know a solution?

  • pdexheimer Member, Dev Posts: 543 ✭✭✭✭

    @DavidRies‌ - As you suggest, the -V parameter is always set for GridEngine jobs. You're right, at the moment you'd have to remove it in the code and recompile Queue to get rid of it.

    The solution would be to add another argument to QSettings, then conditionally add -V to nativeSpec depending on the contents of that argument. However, adding a runner-specific argument to the global QSettings wouldn't be great - it should really be something applicable to any runner in general. I'm not certain exactly what -V does (beyond what's in the comment, of course), so I'm not sure if it's an easily generalizable concept
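The idea of adding -V conditionally can be sketched outside Queue. The shell function below is purely illustrative: `build_native_spec` and its `inherit_env` argument are hypothetical names, not part of Queue, whose job runners are written in Scala. It only shows the shape of the logic, with -V included when the setting is "true" and left out otherwise.

```shell
#!/bin/sh
# Hypothetical illustration only; this is not Queue code.
# Build a native specification string, adding -V only when requested.
build_native_spec() {
  inherit_env=$1   # "true" or "false"
  extra_args=$2    # any other native arguments, e.g. "-P st_pg"
  spec=""
  if [ "$inherit_env" = "true" ]; then
    spec="-V"
  fi
  if [ -n "$extra_args" ]; then
    # Insert a separating space only if spec is already non-empty.
    spec="${spec:+$spec }$extra_args"
  fi
  printf '%s\n' "$spec"
}

build_native_spec true "-P st_pg"
build_native_spec false "-P st_pg"
```

Whether such a switch belongs in the global settings or in a runner-specific option is the design question raised in the comment above; the sketch does not settle it.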

  • mxqian Member Posts: 11

    @Geraldine_VdAuwera - When running Queue with LSF, how do I pass a parameter like "-n 3"? I tried -jobNative "-n 3", but that didn't seem to work. Is there any way to do that? Thanks.

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163 admin

    I'm not sure, @mxqian. I never use it that way. Hopefully someone else in this thread will jump in to help.

    Geraldine Van der Auwera, PhD

  • pdexheimer Member, Dev Posts: 543 ✭✭✭✭


    You would set the CommandLineFunction.nCoresRequest field. For example, in your case class for a particular job (like IndelRealigner or HaplotypeCaller), you would specify this.nCoresRequest = 3

    However, I would suggest that in practice it's generally better to increase the scatterCount than it is to run multi-threaded

  • mxqian Member Posts: 11

    @pdexheimer Great. Thank you so much.

  • cfrisupport Vancouver, BC, Canada Member Posts: 1

    Can confirm it works with Son of Grid Engine 8.1.8 on CentOS 6.7

  • DavidRies Member Posts: 15

    I can confirm it works with Univa Grid Engine 8.3.0.
