Queue: Best way to use filename prefix?

dklevebring Posts: 72 Member
edited March 3 in Ask the GATK team

I'm working on adding RSEM to our RNAseq pipeline, which uses Queue. RSEM takes a number of inputs on the command line, so I have a case class and override commandLine for this to work. Nothing special there.

However, RSEM wants a sample-name prefix for its outputs. If I give it sample_name, it will generate a whole bunch of files: sample_name.genes.results with expression values for genes, sample_name.isoforms.results with expression values for isoforms, and sample_name.genome.bam, sample_name.genome.sorted.bam and sample_name.genome.sorted.bam.bai with mappings, and so on.

What's the best way to handle this in terms of @Output?

Should I use (1):

case class rsem(inFq1: File, inFq2: File, prefix: String) extends ExternalCommonArgs {
   ...
   @Output val myPrefix = prefix
   ...

and then use the prefix in the downstream jobs? Or should I use (2):

case class rsem(inFq1: File, inFq2: File, prefix: String, bam: File, geneResults: File) extends ExternalCommonArgs {
   ...
   @Output val myBam = bam
   @Output val myGeneRes = geneResults
   ...

In (2), I would still use prefix in the def commandLine, of course.

Is there a preferred way to handle this in Queue?


Best Answer

Answers

  • dklevebring Posts: 72 Member
    edited March 3

    I'm using Queue as a dependency tracker and job scheduler, so I'm not looking to scatter/gather. The most important thing is that I get correct dependencies.

    In the case you describe, what do the case class parameters look like? prefix is there, I guess, but is myBam?

  • pdexheimer Posts: 385 Member, GSA Collaborator ✭✭✭

    No, I don't think so. Unless you can force rsem to use different values for the suffixes, I don't see a reason to allow them to change in your case class.
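
    One way to read this suggestion is to keep only the prefix as a case class parameter and derive every output file from it, since RSEM fixes the suffixes. Below is a minimal sketch in plain Scala, not a tested QScript. The name RsemOutputs and its field names are hypothetical, and the suffixes are the ones listed in the question. In an actual Queue function, these derived files would presumably be the fields annotated with @Output, so that downstream jobs depend on the files rather than on the prefix string.

```scala
import java.io.File

// Hypothetical helper (not part of Queue): derives RSEM's output files
// from the prefix alone. The suffixes are taken from the question; they
// are fixed by RSEM, so they are not exposed as constructor parameters.
case class RsemOutputs(prefix: String) {
  val geneResults: File    = new File(prefix + ".genes.results")
  val isoformResults: File = new File(prefix + ".isoforms.results")
  val genomeBam: File      = new File(prefix + ".genome.bam")
  val sortedBam: File      = new File(prefix + ".genome.sorted.bam")
  val sortedBamIndex: File = new File(prefix + ".genome.sorted.bam.bai")

  // All derived files, e.g. for declaring them as job outputs in one place.
  def all: Seq[File] =
    Seq(geneResults, isoformResults, genomeBam, sortedBam, sortedBamIndex)
}

object RsemOutputsDemo {
  def main(args: Array[String]): Unit = {
    val out = RsemOutputs("sample_name")
    // Print each derived output path, one per line.
    out.all.foreach(f => println(f.getPath))
  }
}
```

    A downstream job would then take, say, out.sortedBam as its input file, which is what lets a file-based dependency tracker order the two jobs correctly.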
