How to launch a method on all samples in the workspace?

bhaas (Broad Institute; Member, Broadie)

When I go to launch a method on a workspace, the GUI allows me to select which samples to run it on, but the list of available entries shown is a small subset of the total available in my workspace.

Is there a way to select all the samples at once?

Best Answer

  • Geraldine_VdAuwera (Cambridge, MA; admin)
    Accepted Answer

    Hi @bhaas, we're still working on generating proper docs to that effect, but here are the key points:

    Defining your sample set

    You define your sample set in the Data tab by importing what is essentially a table of samples and the sample set they belong to. For an example of what that looks like, see this public workspace's Data tab: https://portal.firecloud.org/#workspaces/help-firecloud/FireCloud101-Basics/data

    If you click on "Download 'sample_set' metadata", you'll get a zip archive containing two files: sample_set_entity.tsv and sample_set_membership.tsv. Disregard the former; you'll see the latter describes the set of samples by listing, on each line, a sample set ID and a sample that belongs to it. It looks like this:

    membership:sample_set_id sample
    CEUTrio_wgs_20 NA12878_wgs_20
    CEUTrio_wgs_20 NA12877_wgs_20
    CEUTrio_wgs_20 NA12882_wgs_20
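
    To make the layout concrete, here is a minimal Python sketch that reads a membership file and groups sample IDs by sample set. It assumes the two columns are tab-separated (as the .tsv extension suggests); the helper name read_membership is hypothetical and not part of FireCloud.

```python
import csv
import io
from collections import defaultdict


def read_membership(handle):
    """Group sample IDs by sample set ID from a membership TSV (hypothetical helper)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header != ["membership:sample_set_id", "sample"]:
        raise ValueError(f"unexpected header: {header}")
    sets = defaultdict(list)
    for row in reader:
        if not row:  # skip blank lines
            continue
        set_id, sample_id = row
        sets[set_id].append(sample_id)
    return dict(sets)


# Same rows as the example above, with tabs between the columns (an assumption).
example = (
    "membership:sample_set_id\tsample\n"
    "CEUTrio_wgs_20\tNA12878_wgs_20\n"
    "CEUTrio_wgs_20\tNA12877_wgs_20\n"
    "CEUTrio_wgs_20\tNA12882_wgs_20\n"
)
membership = read_membership(io.StringIO(example))
print(membership)
```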

    All you need to do to define your sample set is modify this file (or generate one like it) with your sample set and sample IDs. The sample set ID can be any arbitrary name; the sample IDs must be IDs of samples you have already imported into the workspace. You can define multiple sample sets within the same file, and a sample can belong to multiple sample sets. For example:

    membership:sample_set_id sample
    CEUTrio_wgs_20 NA12878_wgs_20
    CEUTrio_wgs_20 NA12877_wgs_20
    CEUTrio_wgs_20 NA12882_wgs_20
    CEUTrio_test NA12877_wgs_20
    CEUTrio_test NA12882_wgs_20
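
    If the goal is a single set containing every sample in the workspace, the file can be generated rather than edited by hand. The sketch below is one way to do that in Python, assuming tab-separated columns; the function name write_membership and the set name all_samples are made up for illustration, and the sample list is a stand-in for your own IDs.

```python
import csv
import io


def write_membership(handle, set_id, sample_ids):
    """Write one membership row per sample, all under the same sample set ID."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["membership:sample_set_id", "sample"])
    for sample_id in sample_ids:
        writer.writerow([set_id, sample_id])


# Hypothetical inputs: replace with your own set name and sample IDs.
all_samples = ["NA12878_wgs_20", "NA12877_wgs_20", "NA12882_wgs_20"]
buffer = io.StringIO()
write_membership(buffer, "all_samples", all_samples)
tsv_text = buffer.getvalue()
print(tsv_text, end="")
```

    To produce a file for upload, pass a handle from open("sample_set_membership.tsv", "w", newline="") instead of the in-memory buffer.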

    Once you've made your TSV file describing your sample set(s), you import it by clicking the "Import Metadata..." button (still in your workspace's Data tab). This opens a dialog; follow the instructions to select the TSV file you created or modified, and assuming you don't hit any errors, once you close the dialog you'll see there is now a "sample_set" tab next to "participant" and "sample". If you click on it you can verify that your sample set has been created correctly.
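
    One error you can rule out before importing is a membership row that names a sample not yet in the workspace. The following Python sketch checks for that under the assumption that you already have the list of imported sample IDs at hand; find_unknown_samples is a hypothetical helper, and NA00000_typo is a deliberately invalid ID added for the demonstration.

```python
def find_unknown_samples(membership_rows, known_sample_ids):
    """Return the sorted sample IDs that appear in the membership rows
    but not in the list of samples already imported into the workspace."""
    known = set(known_sample_ids)
    return sorted({sample for _, sample in membership_rows if sample not in known})


# (sample set ID, sample ID) pairs, as they would appear in the TSV.
rows = [
    ("CEUTrio_test", "NA12877_wgs_20"),
    ("CEUTrio_test", "NA12882_wgs_20"),
    ("CEUTrio_test", "NA00000_typo"),  # hypothetical bad ID
]
known = ["NA12878_wgs_20", "NA12877_wgs_20", "NA12882_wgs_20"]

unknown = find_unknown_samples(rows, known)
print(unknown)
```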

    Running a method on a sample set

    To run a method on your newly created sample set, you don't need to change your method configuration. After you click "Launch Analysis...", the dialog opens on the sample list; switch to the sample_set list that should now appear, and select your sample set.

    At this point, the trick is that you can't just hit "Launch". First you need to define an expression that tells FireCloud how to handle the fact that you're giving it a list of samples instead of the single sample the method config expects. In this case, the expression is this.samples. Then you can hit "Launch".

    Note the plural in the expression; if you leave out the s, it won't work. Yes, it's annoying, and no, it's not well documented yet... this is something we're working on improving right now.
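
    As a way to picture why the plural matters, here is a toy Python model of attribute lookup on a sample set. It is only an illustration and not FireCloud's implementation: the resolve function, the dictionary layout, and the error behavior are all assumptions. The idea is that the set has a samples attribute holding its members, so this.samples yields a list of samples, while this.sample names nothing.

```python
def resolve(expression, entity):
    """Toy lookup: 'this.<attr>' returns entity[<attr>]. Not FireCloud's code."""
    prefix = "this."
    if not expression.startswith(prefix):
        raise ValueError(f"expression must start with {prefix!r}")
    attribute = expression[len(prefix):]
    if attribute not in entity:
        raise KeyError(f"entity has no attribute {attribute!r}")
    return entity[attribute]


# Hypothetical in-memory stand-in for a sample_set entity.
sample_set = {
    "name": "CEUTrio_test",
    "samples": ["NA12877_wgs_20", "NA12882_wgs_20"],
}

targets = resolve("this.samples", sample_set)
for sample_id in targets:
    print("would run the method on", sample_id)
```

    In this toy model, resolve("this.sample", sample_set) raises a KeyError, which mirrors the point about the missing s.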

    That should be all you need to do; let me know if you experience any issues.

Answers

  • KateN (Cambridge, MA; Member, Broadie, Moderator, admin)

    In order to select all the samples at once, you will need to create a sample_set which contains all the samples in your workspace.

    Selecting multiple/all of the samples in your workspace when trying to run (on-the-fly set creation) is a feature we want to implement in the future, but unfortunately it is not yet available.

  • bhaas (Broad Institute; Member, Broadie)

    Thanks, Kate! Is there documentation you can point me to that walks through running an application on a sample set? Also, does this mean that I need to reconfigure a method to use a sample set as input instead of a sample?


  • bhaas (Broad Institute; Member, Broadie)

    Great! I'll give this a shot. Thanks!
