Nested CommandLineFunction?

I have something like this:

case class Command1(param1, param2) extends CommandLineFunction { println("This is Command1") }
case class Command2(param1, param2, param3) extends CommandLineFunction { Command1(param1, param2) }

The call to Command1 inside Command2 seems to be skipped. How should I do this properly? The motivation is that Command2 is too long and complicated, so I want to split the work into smaller steps.

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    I'm confused by the case statement -- what is the intention here in terms of program logic?

  • I've not been thinking in Scala lately and can't spend the time to refresh my memory right now, so I'll just assume that your invocation will somehow magically override commandLine in the case classes.

    But it would seem to me that the answer to your question is to generalize it out of the CommandLineFunction structure - after all, Queue was specifically written to manage long, complicated workflows. So just have a method in your QScript that add()s Command1 and Command2 as necessary, no nesting required.
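
A minimal sketch of that suggestion, written so that it runs without Queue on the classpath. The `CommandLineFunction` and `QScript` traits below are simplified stand-ins made up for this example, not Queue's real classes; `runBoth`, `queued`, and the flag names are placeholders too. `Command1` and `Command2` follow the question.

```scala
import scala.collection.mutable.ListBuffer

// Stand-in for Queue's CommandLineFunction: only commandLine is modelled here.
trait CommandLineFunction {
  def commandLine: String
}

// Stand-in for a QScript: add() just records functions in the order they are added.
trait QScript {
  private val functions = ListBuffer.empty[CommandLineFunction]
  def add(fs: CommandLineFunction*): Unit = functions ++= fs
  def queued: List[CommandLineFunction] = functions.toList
  def script(): Unit
}

// Placeholder flags; each class only describes its own command.
case class Command1(param1: String, param2: String) extends CommandLineFunction {
  def commandLine = s"cmd1 -param11 $param1 -param12 $param2"
}

case class Command2(param1: String, param2: String, param3: String) extends CommandLineFunction {
  def commandLine = s"cmd2 -param21 $param1 -param22 $param2 -param23 $param3"
}

object Pipeline extends QScript {
  // Hypothetical helper: adds both steps side by side, no nesting.
  def runBoth(p1: String, p2: String, p3: String): Unit = {
    add(Command1(p1, p2))
    add(Command2(p1, p2, p3))
  }

  def script(): Unit = runBoth("a", "b", "c")
}
```

In this sketch, writing `Command1(param1, param2)` inside the body of `Command2` would only construct an object and discard it; a command is recorded only when it is passed to `add()`, which is why the nested call in the question appears to be skipped.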

  • That's a good reminder, @pdexheimer! I did not add it to the QScript. I have dropped the nesting pattern for now and still keep the long version of Command2, where my command looks like this: `override def commandLine = cmd1 -param11 -param12, && cmd2 -param21 -param22, && ... && some_chain | some_pipe ... && cmd7 -param71, -param72`

    @Geraldine_VdAuwera: I am not sure what you are asking. The following is from the Scala spec (http://www.scala-lang.org/files/archive/spec/2.11/05-classes-and-objects.html). Basically, it allows me to instantiate the Commandx classes like a function call, e.g. `Command1(param1, param2)`. It is just a habit.

    If a class definition is prefixed with case, the class is said to be a case class.

    The formal parameters in the first parameter section of a case class are called elements; they are treated specially. First, the value of such a parameter can be extracted as a field of a constructor pattern. Second, a val prefix is implicitly added to such a parameter, unless the parameter carries already a val or var modifier. Hence, an accessor definition for the parameter is generated.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    My bad, I'm not familiar enough with scala. You're in much better hands with @pdexheimer :)

  • I don't understand why you'd want to run multiple programs in a single CommandLineFunction. I would much rather have cmd1 through cmd7 submitted as separate jobs to the cluster and run in parallel over (most likely) multiple nodes - this is the use case that Queue was designed for. But if you're happy, then I guess we're in good shape.

  • @pdexheimer Actually, that is not what I want; it is just the only solution I have come up with so far, given my Linux command-line knowledge and my limited understanding of the CommandLineFunction API. Since I want to run a series of steps on the same set of input files, I thought it would be more compact to chain all of the commands into one long chain, like so:

    # Alignment
    /tools/bin/novoalignCS -d /genomes/rn6/novocraft/rn6.rnaseq.n60k.cnx -f /data/L02/result/WRNA00001_FC2_B1_L02.xsq -F 'XSQ' LC4  -o SAM -r Random -k -t 20,2.5 -p 5,15 0.35,10  -c 24  2> /output/LC4.2.LC4.2.log \
    # Save bam file
    | /tools/bin/samtools view -Sb -F4 - > /output/LC4.2.LC4.2_unsorted.bam && \
    touch /output/LC4.2.LC4.2_unsorted.done && \
    # Convert RNAseq alignment to chromosome coordinates
    mkfifo /output/LC4.2.LC4.2_unsorted.fifo.sam && \
    sh -c '/tools/bin/samtools view -h /output/LC4.2.LC4.2_unsorted.bam >/output/LC4.2.LC4.2_unsorted.fifo.sam&' && \
    /tools/USeq/SamTranscriptomeParser -f /output/LC4.2.LC4.2_unsorted.fifo.sam -a 900 -n 100 -u -s /output/LC4.2.LC4.2_unsorted.tmp.bam && \
    mv -f /output/LC4.2.LC4.2_unsorted.tmp.bam /output/LC4.2.LC4.2_unsorted.bam && \
    rm -f /output/LC4.2.LC4.2_unsorted.fifo.sam && \
    touch /output/LC4.2.LC4.2_convert.done && \
    # Sort, Index, and attach ReadGroup info to BAM file
    /tools/bin/novosort -i -c  24 -m 7G  -t ./.queue/tmp --rg "@RG\tID:2.LC4.2\\tSM:LC4\\tCN:AfMD\\tLB:LC4_8\\tPL:SOLiD\\tPU:2.LC4.2" -o /output/LC4.2.LC4.2.bam /output/LC4.2.LC4.2_unsorted.bam 
    

    This looks horribly long, especially since my actual paths are not as short as /output, /data or /tools. In the Scala source code, however, it does not look that messy. To keep the command short, I use many temporary var/val variables and string interpolation instead of concatenation, like so: s"This is 'var' value: $var". Anyway, I would really love to see how to do this properly in Queue, so it would be great if you could take a look at this class and give me your suggestions for improvement:

    https://github.com/biocyberman/piper/blob/QPipe/src/main/scala/molmed/utils/AlignmentUtilsNovocraft.scala#L156
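
One way the chain might be broken up, sketched in plain Scala without Queue: each step is a small case class that builds its own command string with interpolation and names its input and output file, so consecutive steps are linked by files rather than by `&&`. The `Step` trait, the class names, and the intermediate `.sam` file are hypothetical (the original pipes the aligner straight into samtools); the flags are copied from the command shown above, with the read-group and temp-directory options left out for brevity.

```scala
// Hypothetical step abstraction: each step reads one file and writes one file.
trait Step {
  def input: String
  def output: String
  def commandLine: String
}

// Keep only mapped reads and write BAM (flags taken from the chain above).
case class SamToBam(input: String, output: String) extends Step {
  def commandLine = s"/tools/bin/samtools view -Sb -F4 $input > $output"
}

// Sort and index with novosort (flags taken from the chain above).
case class SortBam(input: String, output: String, threads: Int = 24) extends Step {
  def commandLine = s"/tools/bin/novosort -i -c $threads -m 7G -o $output $input"
}

val outDir = "/output"
val sample = "LC4.2.LC4.2"

// The output of one step is passed as the input of the next.
val toBam = SamToBam(s"$outDir/$sample.sam", s"$outDir/${sample}_unsorted.bam")
val sort = SortBam(toBam.output, s"$outDir/$sample.bam")

val steps: List[Step] = List(toBam, sort)
steps.foreach(step => println(step.commandLine))
```

Because each step names its files explicitly, the same classes could in principle be turned into separate Queue functions and added one by one, with the dependency between them expressed through the shared file.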
