How do I access TCGA Data?

jneff (Boston) · Member, Broadie, admin
edited December 2017 in Archive

TCGA Data and dbGaP Authorization

  1. Can I access TCGA data in FireCloud?
    TCGA open access data is available to all FireCloud users and can be found in pre-loaded workspaces. TCGA controlled access data is available only to users who have dbGaP authorization to use it.

  2. How do I access controlled access TCGA data in FireCloud?
    To access workspaces in FireCloud that contain controlled access data, you must have an eRA Commons or NIH account with dbGaP authorization, and you must link your FireCloud account to that eRA Commons or NIH account.

  3. How do I gain dbGaP authorization to access controlled access data?
    Information about applying for dbGaP authorization can be found on the dbGaP website or on this NCI wiki page. If you are not a principal investigator (PI), your PI may need to apply on your behalf.

  4. What data can I put in FireCloud from a regulatory point-of-view?
    If the Data Use Agreement (DUA) for your data set explicitly states that the data may be used in a public cloud computing environment, you may use it on FireCloud. FireCloud requires users to abide by all DUAs. Users are responsible for ensuring that all data is used in compliance with the associated DUA.

  5. Is FireCloud secure?
    Yes. FireCloud has been developed in accordance with security guidelines for a Federal Information Security Management Act (FISMA) Moderate System (http://csrc.nist.gov/groups/SMA/fisma/).
    Secure Sockets Layer (SSL) connections are employed for web browsers and system APIs. Data are encrypted at rest by Google Cloud Storage.
    A separate system security plan (SSP) will govern Google Cloud Infrastructure development in adherence to Federal Risk and Authorization Management Program (FedRAMP) guidelines (https://www.fedramp.gov/).



  • "How do I access controlled access TCGA data in FireCloud?"

    Can you expand on this?

    For instance, by what mechanism does a Docker image running on FireCloud access a TCGA BAM file?

  • birger · Member, Broadie, CGA-mod

    The Docker image runs under the credentials of the user who launched the workflow. Before the command line specified in the WDL file is run, JES localizes the input files: it retrieves the specified files from their respective buckets and creates local copies of them in the container's directory structure. This is done with a gsutil cp command, and the request to Google Cloud Storage to retrieve each file is both authenticated and authorized. Authentication is done by presenting an access token in the name of the FireCloud user. Authorization is done through access control lists maintained on the source bucket. Those access control lists ensure that only FireCloud users who have demonstrated to FireCloud that they are dbGaP authorized for TCGA data have read access to the files.

  • smulane (Broad Institute) · Member, Broadie

    Hello, are the TCGA BAMs coming directly from the GDC? Which genome build are they aligned to?
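
Editor's illustration for birger's reply above: the two checks it describes (authentication by access token, then authorization by the source bucket's access control list) can be sketched as a small in-memory model. This is only a toy under stated assumptions, not FireCloud's, JES's, or Google Cloud Storage's actual code or API. The names `Bucket`, `TOKENS`, `authenticate`, and `localize`, and the example users, are all hypothetical.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when authentication or authorization fails."""


@dataclass
class Bucket:
    """Toy stand-in for a storage bucket (hypothetical, not a real API)."""
    name: str
    objects: dict = field(default_factory=dict)  # object name -> contents
    readers: set = field(default_factory=set)    # read ACL: users allowed to read


# Hypothetical token table mapping an access token to the user it was issued for.
TOKENS = {
    "token-alice": "alice@example.org",
    "token-bob": "bob@example.org",
}


def authenticate(token):
    """Authentication: resolve the presented access token to a user identity."""
    try:
        return TOKENS[token]
    except KeyError:
        raise AccessDenied("unknown or expired access token") from None


def localize(bucket, object_name, token, local_dir):
    """Copy one object into local_dir if the token's user passes the bucket ACL.

    Models the localization step: the copy happens before the workflow
    command runs, under the credentials of the user who launched it.
    """
    user = authenticate(token)
    # Authorization: the ACL lives on the source bucket, not in the container.
    if user not in bucket.readers:
        raise AccessDenied(f"{user} has no read access to {bucket.name}")
    local_dir[object_name] = bucket.objects[object_name]
    return local_dir
```

In this model, a user listed in `readers` (standing in for a user who has shown dbGaP authorization) gets a local copy of the file, while any other user's request raises `AccessDenied` before anything is copied.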
