Bug Bulletin: we have identified a bug that affects indexing when producing gzipped VCFs. This will be fixed in the upcoming 3.2 release; in the meantime you need to reindex gzipped VCFs using Tabix.

Frequently asked questions about QScripts

Geraldine_VdAuweraGeraldine_VdAuwera Posts: 5,285Administrator, GSA Member admin
edited February 3 in Queue

1. Many of my GATK functions are setup with the same Reference, Intervals, etc. Is there a quick way to reuse these values for the different analyses in my pipeline?

Yes.

  • Create a trait that extends from CommandLineGATK.
  • In the trait, copy common values from your qscript.
  • Mix the trait into instances of your classes.

For more information, see the ExampleUnifiedGenotyper.scala or examples of using Scala's traits/mixins illustrated in the QScripts documentation.

2. How do I accept a list of arguments to my QScript?

In your QScript, define a var list and annotate it with @Argument. Initialize the value to Nil.

@Argument(doc="filter names", shortName="filter")
var filterNames: List[String] = Nil

On the command line specify the arguments by repeating the argument name.

-filter filter1 -filter filter2 -filter filter3

Then once your QScript is run, the command line arguments will be available for use in the QScript's script method.

  def script {
     var myCommand = new MyFunction
     myCommand.filters = this.filterNames
  }

For a full example of command line arguments see the QScripts documentation.

3. What is the best way to run a utility method at the right time?

Wrap the utility with an InProcessFunction. If your functionality is reusable code you should add it to Sting Utils with Unit Tests and then invoke your new function from your InProcessFunction. Computationally or memory intensive functions should NOT be implemented as InProcessFunctions, and should be wrapped in Queue CommandLineFunctions instead.

    class MySplitter extends InProcessFunction {
      @Input(doc="inputs")
      var in: File = _

      @Output(doc="outputs")
      var out: List[File] = Nil

      def run {
         StingUtilityMethod.quickSplitFile(in, out)
      }
    }

    var splitter = new MySplitter
    splitter.in = new File("input.txt")
    splitter.out = List(new File("out1.txt"), new File("out2.txt"))
    add(splitter)

See Queue CommandLineFunctions for more information on how @Input and @Output are used.

4. What is the best way to write a list of files?

Create an instance of a ListWriterFunction and add it in your script method.

import org.broadinstitute.sting.queue.function.ListWriterFunction
</pre>

<pre>
    val writeBamList = new ListWriterFunction
    writeBamList.inputFiles = bamFiles
    writeBamList.listFile = new File("myBams.list")
    add(writeBamList)

5. How do I add optional debug output to my QScript?

Queue contains a trait mixin you can use to add Log4J support to your classes.

Add the import for the trait Logging to your QScript.

import org.broadinstitute.sting.queue.util.Logging

Mixin the trait to your class.

class MyScript extends Logging {
...

Then use the mixed in logger to write debug output when the user specifies -l DEBUG.

logger.debug("This will only be displayed when debugging is enabled.")

6. I updated Queue and now I'm getting java.lang.NoClassDefFoundError / java.lang.AbstractMethodError

Try ant clean.

Queue relies on a lot of Scala traits / mixins. These dependencies are not always picked up by the scala/java compilers leading to partially implemented classes. If that doesn't work please let us know in the forum.

7. Do I need to create directories in my QScript?

No. QScript will create all parent directories for outputs.

8. How do I specify the -W 240 for the LSF hour queue at the Broad?

Queue's LSF dispatcher automatically looks up and sets the maximum runtime for whichever LSF queue is specified. If you set your -jobQueue/.jobQueue to hour then you should see something like this under bjobs -l:

RUNLIMIT
240.0 min of gsa3

9. Can I run Queue with GridEngine?

Queue GridEngine functionality is community supported. See here for full details: Queue with Grid Engine.

10. How do I pass advanced java arguments to my GATK commands, such as remote debugging?

The easiest way to do this at the moment is to mixin a trait.

First define a trait which adds your java options:

  trait RemoteDebugging extends JavaCommandLineFunction {
    override def javaOpts = super.javaOpts + " -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"
  }

Then mix in the trait to your walker and otherwise run it as normal:

  val printReadsDebug = new PrintReads with RemoteDebugging
  printReadsDebug.reference_sequence = "my.fasta"
  // continue setting up your walker...
  add(printReadsDebug)

11. Why does Queue log "Running jobs. ... Done." but doesn't actually run anything?

If you see something like the following, it means that Queue believes that it previously successfully generated all of the outputs.

INFO 16:25:55,049 QCommandLine - Scripting ExampleUnifiedGenotyper 
INFO 16:25:55,140 QCommandLine - Added 4 functions 
INFO 16:25:55,140 QGraph - Generating graph. 
INFO 16:25:55,164 QGraph - Generating scatter gather jobs. 
INFO 16:25:55,714 QGraph - Removing original jobs. 
INFO 16:25:55,716 QGraph - Adding scatter gather jobs. 
INFO 16:25:55,779 QGraph - Regenerating graph. 
INFO 16:25:55,790 QGraph - Running jobs. 
INFO 16:25:55,853 QGraph - 0 Pend, 0 Run, 0 Fail, 10 Done 
INFO 16:25:55,902 QCommandLine - Done 

Queue will not re-run the job if a .done file is found for the all the outputs, e.g.: /path/to/.output.file.done. You can either remove the specific .done files yourself, or use the -startFromScratch command line option.

Post edited by Geraldine_VdAuwera on

Geraldine Van der Auwera, PhD

Sign In or Register to comment.