Run GenomeSTRiP with Torque/PBS job scheduler

Hello, Bob,

The Genome STRiP Queue pipeline scripts work with the LSF or SGE schedulers, but our Linux cluster uses the Torque/PBS job scheduler. Is there a guide to running the GATK/SVToolkit command lines directly, without the Scala/Queue pipelining?

Best,
Guangfa

Answers

  • bhandsaker (Member, Broadie, Moderator)

    I think there are a few choices, depending on what you want to do.

    If you just want to do a single analysis, people have had good success running Queue without -run ("dry run" mode).
    This emits all of the commands Queue would run.
    I know people have scripted converting this output into a shell script (mostly automated, I think) and this strategy has been pretty practical for a single analysis.
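The conversion described above could be sketched roughly as follows. This is a minimal illustration, not a tested recipe: it assumes each command appears on a single log line after a marker string (the default `"Pending:"` is an assumption about the dry-run log format, so check your own output and adjust it), and that the commands are listed in an order that respects their dependencies. The function names are hypothetical.

```python
def extract_commands(log_lines, marker="Pending:"):
    """Collect the command text that follows `marker` on each log line.

    `marker` is an assumption about the dry-run log format; inspect
    your own Queue output and change it if the commands are tagged
    differently.
    """
    commands = []
    for line in log_lines:
        _, found, rest = line.partition(marker)
        command = rest.strip()
        if found and command:
            commands.append(command)
    return commands


def to_shell_script(commands):
    """Render the commands as a bash script that stops at the first failure."""
    lines = ["#!/bin/bash", "set -euo pipefail", ""]
    lines.extend(commands)
    return "\n".join(lines) + "\n"
```

To use it, save the dry-run output to a file, pass the file's lines to `extract_commands`, and write the result of `to_shell_script` to a `.sh` file. Review the generated script by hand before running it, since a plain top-to-bottom script runs every step serially.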

    If you want to build this into your pipeline for multiple runs, a second option is to use "dry run" (as above) and see what it does and then use this as a guide to implement the pipelines in your job scheduler.
    The downside to this approach is that as Genome STRiP changes, you'll have a support burden to keep your pipeline up to date.
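For a Torque/PBS cluster, one way to reimplement the steps is to wrap each command in its own job script and chain the jobs with `qsub -W depend=afterok:<jobid>`. The sketch below is only an illustration under stated assumptions: the resource requests (`nodes=1:ppn=1`, walltime, memory) are placeholders, the helper names are hypothetical, and the strictly serial chain is a conservative choice that gives up any parallelism between independent steps.

```python
import subprocess


def pbs_script(command, name, walltime="24:00:00", mem="4gb"):
    """Build the text of a Torque/PBS job script that runs one command.

    The walltime and memory defaults are placeholders, not recommendations.
    """
    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N {name}",
        f"#PBS -l nodes=1:ppn=1,walltime={walltime},mem={mem}",
        "#PBS -j oe",           # merge stderr into stdout
        'cd "$PBS_O_WORKDIR"',  # start in the directory qsub was run from
        command,
        "",
    ])


def qsub_args(script_path, after_job_id=None):
    """Build the qsub command line, optionally waiting on an earlier job."""
    args = ["qsub"]
    if after_job_id:
        # Start only after the earlier job has finished successfully.
        args += ["-W", f"depend=afterok:{after_job_id}"]
    args.append(script_path)
    return args


def submit_chain(commands, prefix="gs"):
    """Write one job script per command and submit them as a serial chain."""
    job_ids = []
    previous = None
    for i, command in enumerate(commands, start=1):
        name = f"{prefix}_{i:03d}"
        path = f"{name}.pbs"
        with open(path, "w") as handle:
            handle.write(pbs_script(command, name))
        result = subprocess.run(qsub_args(path, previous), check=True,
                                capture_output=True, text=True)
        previous = result.stdout.strip()  # qsub prints the new job ID
        job_ids.append(previous)
    return job_ids
```

Calling `submit_chain` with the list of commands taken from the dry run submits them in order and returns the job IDs. Steps that are independent of each other could be submitted without the dependency to run in parallel, but working out which ones are independent requires reading the dry-run output carefully.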

    A third option is to look into adding your job scheduler to Queue. I haven't looked at the Queue code, but I believe there was some attempt to make the job schedulers pluggable when Queue was extended to handle SGE in addition to LSF.
    It might not be too hard to add your own job scheduler.
    And if you do this, you could ask the GATK group if they would accept a patch so others could use it as well.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    That's right, we do accept patches for things like this.

  • xiayanok (China; Member)

    Hi Bob,
    I want to use Genome STRiP 2 to call SVs and CNVs at population scale in sorghum, but my server has no LSF, SGE, or other job scheduler installed.
    My server's specifications are as follows:
    Linux localhost.localdomain 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
    Logical CPU Number : 24
    Physical CPU Number : 2
    CPU Core Number : 6
    HT Number : 2
    MemTotal:128G
    Disk:24T

    I don't know how to set up my server so that Genome STRiP 2 runs successfully. I know that LSF is not free and is not a good fit for my server.
    Would installing SGE or another job scheduler work? I don't know much about this. Can you give me some suggestions? Thank you so much.

  • bhandsaker (Member, Broadie, Moderator)

    I would try to install SGE. I have no experience with doing this, however.

  • xiayanok (China; Member)

    Thank you Bob.
