
Scala script for running Queue

I am trying to build Queue from the Sting package downloaded from GitHub, but the ant build always fails with different errors. Is there an alternative way to build Queue? And is there a Scala script available that I can study or customize for automating GATK runs?

Answers

  • I'll just chip in that there are some more examples on how to write qscripts available here: https://github.com/johandahlberg/piper

    They cover more practical applications, but they are a lot messier than the ones Geraldine pointed to.

  • rcholic (Denver, Member)

    Thanks to Geraldine and Johan! After reading some of your sample QScripts, I now kind of have an idea how to write the scripts. But I have one more question: say, with this sample script located at https://github.com/broadgsa/gatk/blob/master/public/scala/qscript/org/broadinstitute/sting/queue/qscripts/examples/ExampleCustomWalker.scala,

    1. How do I feed it a list of files as input?
    2. How do I name the output files by appending a suffix like "-sorted" to the input file names?

    Thanks again!
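For the second question, the naming step can be sketched in plain Scala with no Queue dependency. This is only an illustration: the object name `OutputNames` and the helpers `withSuffix` and `withSuffixAll` are hypothetical, not part of Queue or GATK. Inside a QScript the same idea would presumably be applied to each element of a `Seq[File]` input field.

```scala
import java.io.File

// Hypothetical helper, not part of Queue: derives output names from input names.
object OutputNames {
  // Insert `suffix` before the extension: data/file1.bam -> data/file1-sorted.bam.
  // Names without an extension simply get the suffix appended.
  def withSuffix(in: File, suffix: String): File = {
    val name = in.getName
    val dot = name.lastIndexOf('.')
    val renamed =
      if (dot > 0) name.substring(0, dot) + suffix + name.substring(dot)
      else name + suffix
    // A null parent is accepted and yields a relative file in the current directory.
    new File(in.getParentFile, renamed)
  }

  // Apply the same renaming to a whole list of inputs.
  def withSuffixAll(ins: Seq[File], suffix: String): Seq[File] =
    ins.map(withSuffix(_, suffix))
}
```

With these definitions, `OutputNames.withSuffixAll(Seq(new File("a.bam"), new File("b.bam")), "-sorted")` yields files named `a-sorted.bam` and `b-sorted.bam`, each kept in the directory of its input.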

  • rcholic (Denver, Member)
    edited October 2013

    I am trying to write my first QScript to sort BAM files with Picard. I have imported Queue.jar into my Scala-Eclipse IDE and written the following script, but there are type mismatches, as commented below. I need help choosing the correct data types. I don't have the Queue javadoc, because my computer cannot compile/build the source (why not put it online???)

     import org.broadinstitute.sting.queue.QScript
     import org.broadinstitute.sting.queue.extensions.picard.SortSam
    
    
     class sortSamPicard extends QScript
     { 
       @Input(doc = "Bam files to sort", shortName="I")
       var bamFile: Seq[File] = Nil
    
       @Output(doc = "sorted Bam Files", shortName="O")
       var sortedBamFile: File = _
    
       @Argument(doc = "sort order", fullName="SORT_ORDER")
       var sortingOrder: String="coordinate"  // type mismatch, what type should this be?
    
       def script() {
        val SortSam = new SortSam
        SortSam.input = bamFile
        SortSam.output = sortedBamFile
        SortSam.sortOrder = sortingOrder //type mismatch
        SortSam.outputDirectories = "sorted_Bam/"      // type mismatch, what type is the outputDirectories?     
        add(SortSam)
       }
     }

    After correcting the type mismatches, can I run the script with Queue.jar? I'm assuming that add(SortSam) will do all the tricks/magic for me. Thank you

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Although I would recommend using a name that's not exactly the same as the wrapper for your object -- we typically lower the case on the first letter, e.g. do val sortSam = new SortSam.
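For readers following along, here is one way the type mismatches in the script above might be resolved. It is a sketch under assumptions rather than a confirmed answer from this thread: it assumes the Queue `SortSam` wrapper takes its sort order as the `net.sf.samtools.SAMFileHeader.SortOrder` enum rather than a String, that `input` is a `Seq[File]`, and that the `swapExt` helper is available from `QScript`. The `outputDirectories` assignment is dropped because the output location can be carried by the output file path itself. It needs Queue.jar on the classpath to compile.

```scala
import net.sf.samtools.SAMFileHeader.SortOrder
import org.broadinstitute.sting.queue.QScript
import org.broadinstitute.sting.queue.extensions.picard.SortSam

class SortSamPicard extends QScript {
  @Input(doc = "BAM files to sort", shortName = "I")
  var bamFiles: Seq[File] = Nil

  // Kept as a String on the command line and converted to the enum below.
  @Argument(doc = "Sort order", fullName = "SORT_ORDER", required = false)
  var sortingOrder: String = "coordinate"

  def script() {
    // One SortSam job per input file, so each input gets its own output.
    for (bam <- bamFiles) {
      val sortSam = new SortSam          // lower-case name, as recommended above
      sortSam.input :+= bam
      // Assumed helper: file1.bam -> file1.sorted.bam in the working directory.
      sortSam.output = swapExt(bam, ".bam", ".sorted.bam")
      // Assumed type: converts "coordinate" to SortOrder.coordinate.
      sortSam.sortOrder = SortOrder.valueOf(sortingOrder)
      add(sortSam)
    }
  }
}
```

List-valued arguments such as `bamFiles` are normally supplied by repeating the flag once per file, e.g. `-I file1.bam -I file2.bam`, and Queue only executes the jobs when `-run` is added to the command; without it the invocation is a dry run.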

  • rcholic (Denver, Member)

    Thank you so much Johan! The script is working! But I don't know how to feed it multiple inputs (multiple BAM files), because I'm not clear on how to construct a Seq[File]. Should I do it like this:

    java -Djava.io.tmpdir=tmp -jar $CLASSPATH/Queue-2.72/Queue.jar -S SortSamPicard.scala -I Seq[file1.bam, file2.bam, file3.bam] --SORT_ORDER coordinate

    thanks a lot!

  • rcholic (Denver, Member)

    Thanks Geraldine. That's what I tried after posting my last question, and it worked. awesome Queue :)

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Glad it's working for you :)

    We're doing a half-day workshop on Queue on 21-22 October; if you can't make it here be sure to keep an eye out for the materials, which will be online shortly thereafter. We will provide brand new docs on Queue, from simple usage to advanced functions.

  • rcholic (Denver, Member)

    Geraldine: I live in Colorado and cannot make it for the workshop, although I'd love to. I look forward to watching the videos online. Queue will make my work more efficient. Thanks to all of you guys for offering GATK and Queue and help.

  • rcholic (Denver, Member)

    One more question, about the results of the sorted bam using the above QScript:

    -rw-r--r-- 1 root wheel 5.3G Oct 2 22:57 file1-MEM-PE.sorted.bam
    -rw-r--r-- 1 root wheel 6.9M Oct 2 22:57 file1-MEM-PE.sorted.bai
    -rw-r--r-- 1 root wheel 31G Oct 1 13:26 file1-MEM-PE.bam // this is the original file, before sorting

    You can see that the original bam file is 31GB, but the sorted bam file is only 5.3GB. This huge difference made me wonder if there's anything wrong with the run?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hmm, that does seem a little extreme. Maybe the compression level of your original file was very low? I would try running SortSam on the file outside of Queue to compare results and check that nothing funky is going on.

  • rcholic (Denver, Member)

    @Geraldine: I used BWA to do the alignment, which generated the original BAM files. I guess these files were not compressed at all. I'll try running SortSam with Picard directly.
