Scala script for running Queue

rcholic (Denver, Member, Posts: 68)

I am trying to build Queue from the Sting package downloaded from GitHub, but the ant build always fails, each time with different errors. Is there an alternative way to build Queue? Also, is there a Scala script available that I can study or customize for automating GATK runs?


Best Answers


  • Geraldine_VdAuwera (Administrator, GATK Dev admin, Posts: 8,733)
  • Johan_Dahlberg (Member ✭✭✭, Posts: 94)

    I'll just chip in that there are some more examples on how to write qscripts available here:

    They contain some more practical applications, but they are a lot messier than the ones pointed to by Geraldine.

  • rcholic (Denver, Member, Posts: 68)

    Thanks to Geraldine and Johan! After reading some of your sample QScripts, I now kind of have an idea how to write the scripts. But I have one more question: say, with this sample script located at,

    1. How do I feed it a list of files as input?
    2. How do I name the output files by appending a suffix like "-sorted" to the input file names?

    Thanks again!
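
On the second question, one possible approach is plain file-name manipulation before the output is handed to the job. The following is a minimal sketch in standalone Scala using only `java.io.File`; the object name `SortedNames` and the helper `withSuffix` are hypothetical names chosen for illustration and are not part of Queue.

```scala
import java.io.File

object SortedNames {
  /** Inserts `suffix` before the last extension, e.g. file1.bam -> file1-sorted.bam.
    * Hypothetical helper for illustration; not a Queue API. */
  def withSuffix(input: File, suffix: String = "-sorted"): File = {
    val name = input.getName
    val dot = name.lastIndexOf('.')
    val renamed =
      if (dot > 0) name.substring(0, dot) + suffix + name.substring(dot)
      else name + suffix // no extension: just append the suffix
    // Keep the output next to the input (a null parent means the current directory)
    new File(input.getParentFile, renamed)
  }

  def main(args: Array[String]): Unit = {
    // A Seq[File] is built like any other Scala sequence
    val inputs: Seq[File] = Seq("file1.bam", "file2.bam").map(new File(_))
    inputs.map(withSuffix(_)).foreach(f => println(f.getPath))
  }
}
```

Inside a QScript, a helper like this could presumably be applied to each input file in turn, with one sorting job created per input.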

  • rcholic (Denver, Member, Posts: 68)
    edited October 2013

    I am trying to write my first QScript to sort BAM files with Picard. I have imported Queue.jar into my Scala-Eclipse IDE and written the following script, but there are type mismatches, as commented below. I need help choosing the correct data types. I don't have the Queue javadoc, because my computer cannot build the source (why not put it online?).

     import org.broadinstitute.sting.queue.QScript
     import org.broadinstitute.sting.queue.extensions.picard.SortSam

     class sortSamPicard extends QScript {
       @Input(doc = "Bam files to sort", shortName = "I")
       var bamFile: Seq[File] = Nil
       @Output(doc = "sorted Bam Files", shortName = "O")
       var sortedBamFile: File = _
       @Argument(doc = "sort order", fullName = "SORT_ORDER")
       var sortingOrder: String = "coordinate"  // type mismatch, what type should this be?

       def script() {
         val SortSam = new SortSam
         SortSam.input = bamFile
         SortSam.output = sortedBamFile
         SortSam.sortOrder = sortingOrder  // type mismatch
         SortSam.outputDirectories = "sorted_Bam/"  // type mismatch, what type is outputDirectories?
       }
     }

    After correcting the type mismatches, can I run the script with Queue.jar? I'm assuming that add(SortSam) will do all the magic for me. Thank you.

  • Geraldine_VdAuwera (Administrator, GATK Dev admin, Posts: 8,733)

    Although I would recommend using a name that's not exactly the same as the wrapper for your object -- we typically lower the case on the first letter, e.g. do val sortSam = new SortSam.

    Geraldine Van der Auwera, PhD

  • rcholic (Denver, Member, Posts: 68)

    Thank you so much, Johan! The script is working! But I don't know how to feed it multiple inputs (multiple BAM files), because I am not clear on how to construct a Seq[File] from the command line. Should I do it like this:

    java -jar $CLASSPATH/Queue-2.72/Queue.jar -S SortSamPicard.scala -I Seq[file1.bam, file2.bam, file3.bam] --SORT_ORDER coordinate

    thanks a lot!
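
List-typed arguments such as Seq[File] are normally filled by repeating the flag once per value (for example -I file1.bam -I file2.bam -I file3.bam) rather than by writing Scala syntax on the command line. The sketch below only illustrates that idea in standalone Scala; it is not Queue's actual argument parser, and the names `RepeatedFlag` and `collect` are hypothetical.

```scala
import java.io.File

object RepeatedFlag {
  /** Collects the value that follows every occurrence of `flag` in `args`.
    * Illustration only; Queue uses its own command-line parser. */
  def collect(args: Seq[String], flag: String): Seq[File] =
    args.sliding(2).collect {
      // Match adjacent pairs whose first element equals the flag
      case Seq(`flag`, value) => new File(value)
    }.toSeq

  def main(args: Array[String]): Unit = {
    val cli = Seq("-S", "SortSamPicard.scala",
                  "-I", "file1.bam", "-I", "file2.bam", "-I", "file3.bam")
    println(collect(cli, "-I").map(_.getName).mkString(", "))
  }
}
```

Each repeated -I contributes one element, so the script sees a three-element Seq[File] for the command line above.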

  • rcholic (Denver, Member, Posts: 68)

    Thanks Geraldine. That's what I tried after posting my last question, and it worked. awesome Queue :)

  • Geraldine_VdAuwera (Administrator, GATK Dev admin, Posts: 8,733)

    Glad it's working for you :)

    We're doing a half-day workshop on Queue on 21-22 October; if you can't make it here be sure to keep an eye out for the materials, which will be online shortly thereafter. We will provide brand new docs on Queue, from simple usage to advanced functions.

  • rcholic (Denver, Member, Posts: 68)

    Geraldine: I live in Colorado and cannot make it for the workshop, although I'd love to. I look forward to watching the videos online. Queue will make my work more efficient. Thanks to all of you guys for offering GATK and Queue and help.

  • rcholic (Denver, Member, Posts: 68)

    One more question, about the results of the sorted bam using the above QScript:

    -rw-r--r-- 1 root wheel 5.3G Oct 2 22:57 file1-MEM-PE.sorted.bam
    -rw-r--r-- 1 root wheel 6.9M Oct 2 22:57 file1-MEM-PE.sorted.bai
    -rw-r--r-- 1 root wheel 31G Oct 1 13:26 file1-MEM-PE.bam // this is the original file, before sorting

    You can see that the original BAM file is 31 GB, but the sorted BAM file is only 5.3 GB. This huge difference makes me wonder whether something went wrong with the run.

  • Geraldine_VdAuwera (Administrator, GATK Dev admin, Posts: 8,733)

    Hmm, that does seem a little extreme. Maybe the compression level of your original file was very low? I would try running SortSam on the file outside of Queue to compare results and check that nothing funky is going on.

  • rcholic (Denver, Member, Posts: 68)

    @Geraldine: I used BWA to do the alignment, which generated the original BAM files. I guess these files were not compressed at all. I'll try running SortSam with Picard directly.
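
One rough way to test the "not compressed" guess is to compare a file's on-disk size with the number of bytes it inflates to. The sketch below assumes that the JDK's `GZIPInputStream` can read the BAM container (BGZF, a series of concatenated gzip members); the names `CompressionRatio` and `ratio` are hypothetical, and the whole file is read, which will be slow for a 31 GB input.

```scala
import java.io.{BufferedInputStream, File, FileInputStream}
import java.util.zip.GZIPInputStream

object CompressionRatio {
  /** Returns inflated size divided by on-disk size for a gzip/BGZF file.
    * Hypothetical helper for illustration; reads the entire file. */
  def ratio(file: File): Double = {
    val in = new GZIPInputStream(new BufferedInputStream(new FileInputStream(file)))
    try {
      val buf = new Array[Byte](1 << 16)
      var total = 0L
      var n = in.read(buf)
      while (n != -1) { // -1 signals the end of the last gzip member
        total += n
        n = in.read(buf)
      }
      total.toDouble / file.length.toDouble
    } finally in.close()
  }

  def main(args: Array[String]): Unit =
    args.foreach(p => println(p + "\t" + ratio(new File(p))))
}
```

A ratio close to 1 would suggest the data is stored essentially uncompressed, while a clearly larger ratio would indicate real compression. Comparing the value for the original and the sorted file could help tell whether the size difference comes from compression alone.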
