outputDir - only applies to cohort.list in DataProcessingPipeline.scala

Johan_Dahlberg
edited August 2012 in Ask the GATK team

Looking at the DataProcessingPipeline script, I noticed that "outputDir" is only applied to the cohort list file and not to the processed BAM files themselves. This is inconsistent with the docs, which describe it as "Output path for the processed BAM files". Here is my fix:

diff --git a/public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala b/public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala
index 56f6460..6dd84b5 100755
--- a/public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala
+++ b/public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala
@@ -249,8 +249,12 @@ class DataProcessingPipeline extends QScript {
     // put each sample through the pipeline
     for ((sample, bamList) <- sampleBAMFiles) {

-      // BAM files generated by the pipeline
-      val bam        = new File(qscript.projectName + "." + sample + ".bam")
+      // BAM files generated by the pipeline      
+      val bam        = if(outputDir.isEmpty()) 
+                                      new File(qscript.projectName + "." + sample + ".bam")
+                                  else
+                                      new File(outputDir + qscript.projectName + "." + sample + ".bam")
       val cleanedBam = swapExt(bam, ".bam", ".clean.bam")
       val dedupedBam = swapExt(bam, ".bam", ".clean.dedup.bam")
       val recalBam   = swapExt(bam, ".bam", ".clean.dedup.recal.bam")
@@ -292,6 +296,15 @@ class DataProcessingPipeline extends QScript {
     add(writeList(cohortList, cohortFile))

+  // Override the normal swapExt method so that outputDir, if it is defined, is added to the file path by default.
+  override
+  def swapExt(file: File, oldExtension: String, newExtension: String) = {
+      if(outputDir.isEmpty())
+         new File(file.getName.stripSuffix(oldExtension) + newExtension)
+      else
+          swapExt(outputDir, file, oldExtension, newExtension);
+  }      


This, of course, puts all the files in the directory specified by outputDir, but I think that is more reasonable than putting them in the execution directory of the script.
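One caveat: `outputDir + qscript.projectName` is plain string concatenation, so it only gives the intended path when outputDir ends with a path separator. A minimal sketch of a separator-safe alternative follows, using the two-argument `java.io.File` constructor. It assumes outputDir is a plain String that is empty when unset. The names `OutputDirSketch`, `inOutputDir` and `swapExtInto` are hypothetical helpers made up for illustration; they are not part of Queue or the pipeline script.

```scala
import java.io.File

object OutputDirSketch {
  // Hypothetical helper: place fileName under outputDir, or keep it relative
  // to the working directory when outputDir is empty (assumed to mean "unset").
  def inOutputDir(outputDir: String, fileName: String): File =
    if (outputDir.isEmpty) new File(fileName)
    else new File(outputDir, fileName) // File inserts the separator itself

  // Hypothetical stand-in for the overridden swapExt: swap the extension on
  // the bare file name, then place the result under outputDir.
  def swapExtInto(outputDir: String, file: File, oldExt: String, newExt: String): File =
    inOutputDir(outputDir, file.getName.stripSuffix(oldExt) + newExt)
}
```

With this, passing "out" or "out/" as outputDir should resolve to the same file, so the result no longer depends on whether the user typed a trailing slash.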

