
Data Model

Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
edited June 2017 in Dictionary

The data model is a formal description of the types of data entities we work with and how they relate to each other. The FireCloud data model currently supports the following basic entity types:

  • Participant
  • Sample
  • Pair

coming soon: an illustration of the current data model

You import metadata for these entities by uploading load files in tab-separated values (TSV) format, a type of plain-text file. Every line in a load file must reference entities of the same type, so each entity type needs its own file. The first line is a header row that gives the field name for each column. See the individual entity entries for examples of load files.
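As a rough illustration of the format rules above, here is a minimal sketch of a structural check you could run on a load file before uploading it. The function name `check_load_file` is hypothetical and not part of FireCloud; it only checks what is described here (a header row, tab-separated columns) plus the assumption that every data row should have as many columns as the header.

```python
def check_load_file(text):
    """Return a list of problems found in the text of a tab-separated load file.

    Hypothetical helper, not a FireCloud API. It checks only the header row
    and the column count of each data row.
    """
    # Ignore completely blank lines, such as a trailing newline at end of file.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ["file is empty"]

    header = lines[0].split("\t")
    problems = []
    if any(not name.strip() for name in header):
        problems.append("header row has an empty field name")

    # Data rows are numbered from 1, not counting the header.
    for number, line in enumerate(lines[1:], start=1):
        fields = line.split("\t")
        if len(fields) != len(header):
            problems.append(
                f"data row {number}: expected {len(header)} columns, "
                f"found {len(fields)}"
            )
    return problems
```

An empty list means the file passed these checks; it does not mean FireCloud will accept it, since field names and references to other entities are not validated here.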

Note that for each of the basic entities, the data model also supports set entities, which are essentially lists of the basic entity type.

  • Participant Set
  • Sample Set
  • Pair Set

In set load files, each line lists the membership of a non-set entity (e.g., participant) in a set (e.g., participant set). The first column contains the identifier of the set entity and the second column contains a key referencing a member of that set. For example, a load file for a participant set looks like this:

membership:participant_set_id participant_id

Note that multiple rows in a set load file may have the same set entity id (e.g. TCGA_COAD).
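To make the one-row-per-member layout concrete, here is a minimal sketch that builds the text of a participant set load file from a mapping of set ids to member ids. The function name `participant_set_load_file` and the participant ids in the usage note are hypothetical; the header is the one shown above, with a tab between the two column names.

```python
def participant_set_load_file(memberships):
    """Build the text of a participant set membership load file.

    `memberships` maps each participant set id to a list of participant ids.
    One row is written per (set, participant) pair, so a set id repeats on
    every row that lists one of its members.
    Hypothetical helper, not a FireCloud API.
    """
    rows = [("membership:participant_set_id", "participant_id")]
    for set_id, participant_ids in memberships.items():
        for participant_id in participant_ids:
            rows.append((set_id, participant_id))
    # Join columns with tabs and end every row with a newline.
    return "".join("\t".join(row) + "\n" for row in rows)
```

Calling it with `{"TCGA_COAD": ["participant_1", "participant_2"]}` produces a header line followed by two rows that both start with `TCGA_COAD`, which is the repetition described above.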

Order for uploading Load Files

Load files must be imported in a strict order, because some entity types reference others and the referenced entities must already exist when the referencing file is uploaded.

The order is as follows ("A > B" means entity type A must precede B in order of upload):

  • participants > samples
  • samples > pairs
  • participants > participant sets
  • samples > sample sets
  • pairs > pair sets
  • set membership > set entity, e.g., participants > samples > sample set membership > sample set entity.
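The ordering rules above form a dependency graph, so one valid upload sequence can be computed with a topological sort. The following is a minimal sketch using `graphlib` from the Python standard library (Python 3.9 or later). The labels in the dictionary are illustrative names chosen for this example, not FireCloud identifiers, and splitting every set type into a "membership" file and an "entity" file applies the last rule to all three set types, which is an assumption.

```python
from graphlib import TopologicalSorter

# Each key must be uploaded after every entry in its value set.
# Labels are illustrative only; they are not FireCloud identifiers.
UPLOAD_DEPENDENCIES = {
    "participants": set(),
    "samples": {"participants"},
    "pairs": {"samples"},
    "participant set membership": {"participants"},
    "sample set membership": {"samples"},
    "pair set membership": {"pairs"},
    "participant set entity": {"participant set membership"},
    "sample set entity": {"sample set membership"},
    "pair set entity": {"pair set membership"},
}


def upload_order(dependencies=UPLOAD_DEPENDENCIES):
    """Return one upload sequence that satisfies every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, name in enumerate(upload_order(), start=1):
        print(step, name)
```

Several sequences satisfy the constraints, and this returns just one of them. The only fixed point is that participants come first, since every other file depends on them directly or indirectly.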