Trouble getting output from a bucket; also finding buckets for TCGA BAMs
Despite being a developer of a tool that runs on Firecloud all the time, I have never actually used Firecloud (or Firehose), so please do not underestimate my ignorance in what follows.
We're planning an enormous Mutect validation based on the MC3 workspace: nci-gsaksena-bi-org/MC3_mutation_validator. In this workspace they called over 10,000 pairs with several tools and then validated each call with RNA-seq and WGS to produce an enormous MAF.
For our validation we want to run Mutect2 on some of these pairs and compare the results to this enormous validated MAF using our own validation WDL. For that we need the BAMs for the pairs and the MAF.
QUESTION #1: When I follow the link in Firecloud to the Google bucket of Cromwell outputs to get to the enormous validation MAF, I get the error "The account for bucket "fc-e5920d62-ed56-47ed-8a61-bf1cac042c69" has been disabled." How do I download this file?
QUESTION #2: It would be easiest for us to do this analysis outside of Firecloud with our own Cromwell setup. Is it possible to get the Google bucket paths to the BAMs for these pairs and then run Cromwell pointing to these buckets, or can they only be accessed via Firecloud? That is, from the workspace I can find the sample name AC-1190-TCGA-01D or whatever, but can I translate that to gs://tcga-bucket/AC-1190-TCGA-01D.bam and then use that path, circumventing Firecloud?
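To make Question #2 concrete, here is a rough sketch of what I imagine doing if such a translation exists. Everything in it is an assumption: the gs://tcga-bucket/<sample>.bam layout is just the made-up example above, the normal sample name is invented, and the workflow name and input keys (Mutect2Validation, tumor_bam, normal_bam) are hypothetical placeholders rather than the inputs of any real WDL.

```python
import json


def bam_path(sample_name, bucket="tcga-bucket"):
    """Turn a sample name into a bucket path.

    Assumes the made-up gs://<bucket>/<sample>.bam layout from the question;
    the real TCGA buckets may be organized completely differently.
    """
    return f"gs://{bucket}/{sample_name}.bam"


def make_inputs(tumor_sample, normal_sample, workflow="Mutect2Validation"):
    """Build a Cromwell-style inputs dictionary for one pair.

    The workflow name and input keys are hypothetical placeholders.
    """
    return {
        f"{workflow}.tumor_bam": bam_path(tumor_sample),
        f"{workflow}.normal_bam": bam_path(normal_sample),
    }


if __name__ == "__main__":
    # "AC-1190-TCGA-10A" is an invented normal sample name for illustration.
    inputs = make_inputs("AC-1190-TCGA-01D", "AC-1190-TCGA-10A")
    print(json.dumps(inputs, indent=2))
```

The resulting JSON would then be handed to our own Cromwell as the inputs file for the validation WDL, assuming the account running Cromwell is allowed to read those buckets at all.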