Data Model

The data model is a formal description of the types of data entities we work with and how they relate to each other. The FireCloud data model currently supports the following basic entity types:

  • Participant
  • Sample
  • Pair

You import metadata for these entities by uploading load files in tab-separated-value (TSV) format, a type of text file. Every line in a load file must reference entities of the same type, so each entity type needs its own file. The first line must be a header row that gives the field name for each column. See the individual entity entries for examples of load files.
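As a rough illustration of the layout described above, the following sketch writes a load file for a single entity type as tab-separated text with a header line. The helper name `write_load_file`, the `entity:participant_id` header, the `gender` field, and the participant IDs are all assumptions made for this example; check the individual entity entries for the actual field names.

```python
import csv
import io


def write_load_file(header, rows):
    """Serialize the rows of ONE entity type as TSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Every row must supply exactly one value per header column.
        if len(row) != len(header):
            raise ValueError(f"row {row!r} does not match header {header!r}")
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical header and values, for illustration only.
participant_tsv = write_load_file(
    ["entity:participant_id", "gender"],
    [["P-001", "female"], ["P-002", "male"]],
)
print(participant_tsv)
```

Keeping one call per entity type mirrors the rule that separate files must be used for each type.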

Note that for each of the basic entities, the data model also supports set entities, which are essentially lists of the basic entity type:

  • Participant Set
  • Sample Set
  • Pair Set

In set load files, each line lists the membership of a non-set entity (e.g., participant) in a set (e.g., participant set). The first column contains the identifier of the set entity and the second column contains a key referencing a member of that set. For example, a load file for a participant set looks like this:

membership:participant_set_id participant_id

Note that multiple rows in a set load file may have the same set entity ID (e.g., TCGA_COAD), one row for each member of that set.
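To make the membership layout concrete, here is a minimal sketch that reads a participant set load file and groups the member IDs by set. The participant IDs and the second set ID (`TCGA_READ`) are hypothetical values invented for this example, and `read_membership` is an illustrative helper, not part of FireCloud.

```python
import csv
import io
from collections import defaultdict

# Illustrative data only: the participant IDs and TCGA_READ are hypothetical.
membership_tsv = (
    "membership:participant_set_id\tparticipant_id\n"
    "TCGA_COAD\tP-001\n"
    "TCGA_COAD\tP-002\n"
    "TCGA_READ\tP-003\n"
)


def read_membership(text):
    """Return (header, {set_id: [member_id, ...]}) from a set load file."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)  # first line holds the column names
    sets = defaultdict(list)
    for set_id, member_id in reader:
        # The same set ID may appear on many rows, one per member.
        sets[set_id].append(member_id)
    return header, dict(sets)


header, sets = read_membership(membership_tsv)
print(sets)
```

Here the two rows that share the ID TCGA_COAD collapse into a single set with two members.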

Order for Uploading Load Files

Load files must be imported in a strict order because some entities reference others, and a referenced entity must be uploaded first.

The order is as follows ("A > B" means entity type A must precede B in order of upload):

  • participants > samples
  • samples > pairs
  • participants > participant sets
  • samples > sample sets
  • pairs > pair sets
  • set membership > set entity, e.g., participants > samples > sample set membership > sample set entity.
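The ordering rules above form a dependency graph, so one way to work out a valid upload sequence is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the file-kind labels such as `sample_set_membership` are hypothetical names chosen for this example, not FireCloud identifiers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key must be uploaded AFTER every entry in its value set.
# The file-kind labels are hypothetical names chosen for this sketch.
PREREQUISITES = {
    "participants": set(),
    "samples": {"participants"},
    "pairs": {"samples"},
    "participant_set_membership": {"participants"},
    "sample_set_membership": {"samples"},
    "pair_set_membership": {"pairs"},
    "participant_set_entity": {"participant_set_membership"},
    "sample_set_entity": {"sample_set_membership"},
    "pair_set_entity": {"pair_set_membership"},
}


def upload_order(kinds):
    """Return the given file kinds sorted into a valid upload order."""
    full_order = list(TopologicalSorter(PREREQUISITES).static_order())
    unknown = set(kinds) - set(full_order)
    if unknown:
        raise ValueError(f"unknown file kinds: {sorted(unknown)}")
    return [kind for kind in full_order if kind in kinds]


order = upload_order(
    {"sample_set_entity", "samples", "participants", "sample_set_membership"}
)
print(" > ".join(order))
```

For the four kinds passed in, the result reproduces the chain from the last rule: participants, then samples, then sample set membership, then the sample set entity.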
