A workspace is a computational sandbox where you can organize genomic data and tools and run analyses. Users can create, share, and clone workspaces. A workspace contains the following:
- Data: pre-loaded or user-uploaded, open access or controlled access
- Workflows: pre-loaded or user-created
- Tools: pre-loaded or user-created
- Results: from all runs, captured with provenance
Each workspace contains a data model that organizes data and metadata and simplifies analysis runs on large data sets. The data model includes predefined entity types (e.g., participant and sample set), relationships, and attributes. For your convenience, results from analyses are written directly into the data model. Currently, the data model is tailored to TCGA data, but it will be extensible to non-TCGA projects with a germline or cell-line focus.
The data model includes entities and entity attributes. An entity refers to a physical thing (e.g., a participant) or a collection of physical things (e.g., a participant set). FireCloud uses entities to give data an organization and a hierarchical structure. For example, a participant entity refers to a participant; a sample entity refers to a sample from that participant.
Entity attributes describe entities and associate data with them. An entity attribute can hold a value (e.g., a number or a string) or a file path to data (e.g., the URL of a Google bucket). For example, a participant (entity) can have an age (entity attribute). Likewise, a sample (entity) can have an associated BAM file (entity attribute) that resides in a Google bucket.
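The relationship between entities and entity attributes can be sketched in a few lines of Python. This is only an illustration of the concept, not FireCloud's actual API or storage format: the `Entity` class, the entity names, the attribute names, and the `gs://example-bucket/...` path are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union

# An attribute holds either a plain value or a file path (stored as a string).
AttributeValue = Union[int, float, str]


@dataclass
class Entity:
    """Hypothetical stand-in for a data model entity (not a FireCloud class)."""
    entity_type: str                      # e.g. "participant" or "sample"
    name: str                             # identifier within the workspace
    attributes: Dict[str, AttributeValue] = field(default_factory=dict)
    parent: Optional["Entity"] = None     # hierarchy: a sample belongs to a participant


# A participant entity with a value attribute (age).
participant = Entity("participant", "participant_01", {"age": 57})

# A sample entity from that participant, with a file-path attribute (BAM).
sample = Entity(
    "sample",
    "sample_01",
    {"bam": "gs://example-bucket/sample_01.bam"},  # hypothetical bucket path
    parent=participant,
)

print(sample.parent.name)        # the participant this sample came from
print(sample.attributes["bam"])  # where the sample's BAM file lives
```

Here the `parent` field stands in for the hierarchical structure, and the `attributes` dictionary holds both kinds of attribute described above: a value (`age`) and a file path (`bam`).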
Entity attributes can serve as inputs and outputs to methods. For example, a sample (entity) can reference a BAM file path (entity attribute) that serves as the input to a method. This method can in turn generate outputs that populate new entity attributes as results.
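As a rough sketch of this flow, the Python below reads one attribute as a method's input and writes the method's result back as a new attribute. It assumes a sample is represented as a plain dictionary; `run_method`, `fake_index_bam`, the attribute names, and the bucket path are hypothetical and only mimic the pattern, they are not FireCloud functions.

```python
from typing import Callable, Dict


def run_method(
    entity: Dict[str, str],
    input_attr: str,
    output_attr: str,
    method: Callable[[str], str],
) -> Dict[str, str]:
    """Feed one attribute to a method and store the result as a new attribute."""
    input_value = entity[input_attr]           # attribute used as the method input
    entity[output_attr] = method(input_value)  # output populates a new attribute
    return entity


def fake_index_bam(bam_path: str) -> str:
    """Placeholder for a real analysis: returns the path an index file might get."""
    return bam_path + ".bai"


# A sample entity whose "bam" attribute points to a file in a bucket.
sample = {"bam": "gs://example-bucket/sample_01.bam"}  # hypothetical path

run_method(sample, "bam", "bam_index", fake_index_bam)
print(sample["bam_index"])  # gs://example-bucket/sample_01.bam.bai
```

After the run, the sample carries both its original input attribute and the newly populated result attribute, which a later method could in turn use as its input.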