Queue: Best way to use filename prefix?

dklevebring Member
edited March 2014 in Ask the GATK team

I'm working on adding RSEM to our RNAseq pipeline, which uses Queue. RSEM takes a number of inputs on the command line, so I have a case class and override commandLine for this to work. Nothing special there.

However, RSEM wants a prefix for its output file names. If I give it sample_name, it will generate a whole bunch of files: sample_name.genes.results with expression values for genes, sample_name.isoforms.results with expression values for isoforms, sample_name.genome.bam, sample_name.genome.sorted.bam and sample_name.genome.sorted.bam.bai with mappings, and so on.

What's the best way to handle this in terms of @Output?

Should I use (1):

case class rsem(inFq1: File, inFq2: File, prefix: String) extends ExternalCommonArgs {
   ...
   @Output val myPrefix = prefix
   ...

and then use the prefix in the downstream jobs? Or should I use (2):

case class rsem(inFq1: File, inFq2: File, prefix: String, bam: File, geneResults: File) extends ExternalCommonArgs {
   ...
   @Output val myBam = bam
   @Output val myGeneRes = geneResults
   ...

In (2), I would still use prefix in the def commandLine, of course.

Is there a preferred way to handle this in Queue?

Answers

  • dklevebring Member
    edited March 2014

    I'm using Queue as a dependency tracker and job scheduler, so I'm not looking to scatter/gather. The most important thing is that I get correct dependencies.

    In the case you describe, what do the case class parameters look like? prefix is there, I guess, but is myBam?

  • pdexheimer Member ✭✭✭✭

    No, I don't think so. Unless you can force rsem to use different values for the suffixes, I don't see a reason to allow them to change in your case class.
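
    To illustrate the point about fixed suffixes, here is a minimal sketch of deriving every output file from the prefix alone. It deliberately does not extend any Queue class: RsemOutputs and its field names are hypothetical, and the suffixes are the ones listed in the question. In an actual Queue case class, the derived files would presumably be the fields that carry @Output, so that downstream jobs depend on real files rather than on a prefix string.

```scala
import java.io.File

// Hypothetical helper, not part of Queue: derives RSEM's output files
// from the sample prefix, using the fixed suffixes named in the question.
case class RsemOutputs(prefix: String) {
  val geneResults: File    = new File(prefix + ".genes.results")
  val isoformResults: File = new File(prefix + ".isoforms.results")
  val genomeBam: File      = new File(prefix + ".genome.bam")
  val sortedBam: File      = new File(prefix + ".genome.sorted.bam")
  val sortedBamIndex: File = new File(prefix + ".genome.sorted.bam.bai")

  // Every file a downstream job might need to depend on.
  def all: Seq[File] =
    Seq(geneResults, isoformResults, genomeBam, sortedBam, sortedBamIndex)
}
```

    With something like this, the case class only needs prefix as a parameter (plus the inputs), and a downstream job can take, for example, RsemOutputs("sample_name").sortedBam as its input file.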
