How can I find the version of MuTect2 in broadinstitute_cga/MutationCalling_Mutect_v1-2_BETA_cfg?

shleeshlee CambridgeMember, Broadie, Moderator admin

Hi Jason,

I see for broadinstitute_cga/MutationCalling_Mutect_v1-2_BETA_cfg (described here) that MuTect1 uses the v1.1.6 jar but MuTect2 uses /usr/local/bin/GenomeAnalysisTK.jar. Is there a way I can find out what version of GATK MuTect2 this configuration uses without running the workflow? I want to confirm I'm using the version of the jar that recently incorporated the strand bias filter. Thanks.

Answers

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    Oy, yes, not recent enough. GATK is on v3.7. Thanks for that @esalinas.

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    Thanks Eddy for the follow-up.

    How can I use the latest v3.7 GATK on FireCloud? I want to realign some of the paired open access data in FireCloud to GRCh38, joint preprocess and then analyze using both MuTect1 and MuTect2. I know GDC data is aligned to GRCh38. Does FireCloud offer an equivalent dataset?

    I am a naive user so thanks for being patient. I see the broadgdac Docker repo has many different images and it's unclear to me which containers contain the versions of tools I need and whether by changing a WDL script's runtime attribute, e.g.

        runtime {
            docker: "broadgdac/report_correlate_genomic_event_clinical:39"
        }

    I can actually change the container a script uses in FireCloud.

    Alternatively, I can do a pre-analysis with data aligned to GRCh37, as well as a prior release of GATK. To reach my aims, I would then have to download data from regions I identify as interesting in the pre-analysis and redo the workflow with GRCh38 and the updated GATK on my laptop. This works for me because all that I'm doing is writing a tutorial that uses small amounts of data. But I am wondering if there is a simpler way for me to get to where I want to go. Thanks.

  • esalinasesalinas BroadMember, Broadie ✭✭✭

    The short answer: find the JAR you want (or download the GATK source and build it). Once it's built, load it into a Docker image and push the image to Docker Hub, then either (1) make it public or (2) keep it private with "firecloud" as a collaborator. Once that is done, you can write a WDL that uses that Docker image for its commands.

    It is my understanding that some of the newer GDC data is indeed aligned to GRCh38. I'm not aware of any FC workspace that has that data, though. If there is in fact no such workspace (WS), you might have to create one and import the data into FC with load files, possibly copying it to buckets as well. @birger (Chet Birger) has a "just-in-time" file creation tool whose WDL might be applicable to your use case.

    With regard to FC, you might want to take a look at the documentation for workspace creation and data importation https://docs.google.com/document/d/1X7q4zYAb16Py8raxGhP_HPzp5KRjrNfTeSR0wIRrzQU/edit . Also there is a "Tool Developers Tutorial" and slides which have some relevance for docker image creation and pushing. https://docs.google.com/document/d/1SExJeNoxKGBClPYKtGFfQ0K0bGN2nQeuh_iIkxEt3TY/edit

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    Any estimate on how long a road I have to travel to create my own Docker container?

    I'm on a tight schedule because we want this tutorial ready in less than a month for the next GATK workshop. For someone with my background, (i) the learning curve for creating my own private functional Docker image seems steep (I was hoping FireCloud would remove this bottleneck for me), (ii) the time commitment for that learning is unclear, and (iii) I need to leave plenty of time for WDL scripting. So I think a safer bet for me is to run analyses using approaches I already know.

    Do you recommend differently?

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    Update: I've plunged ahead, and it's taken me about a day to take an existing Docker container, add the latest Picard and GATK jars to it, and then commit and push the result to my own private Docker repo. I've set firecloud as my collaborator and will now turn to writing WDL scripts.

  • esalinasesalinas BroadMember, Broadie ✭✭✭

    Hi soohee,

    Good! Consider using a recent version of Cromwell to run the WDL locally before running it on FC.

    -eddie
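
For readers following the same route, here is a minimal WDL sketch of the idea discussed in this thread: point a task's runtime at your own Docker image and have it report which GATK it contains. It is an illustration under stated assumptions, not a tested FireCloud method. The task and workflow names are hypothetical, the image name is supplied as an input because the thread does not give one, the jar path is the one quoted in the original question and may differ in your image, and the sketch assumes the GATK 3 jar prints its version when called with --version.

```wdl
# Hypothetical task: report the GATK version bundled in a custom Docker image.
task gatk_version {
  # Assumption: image name of the form "<dockerhub-user>/<repo>:<tag>",
  # supplied at launch time; none is given in the thread.
  String docker_image

  # Assumption: jar location inside the image (path quoted in the question);
  # change it to wherever you copied the jar when building the image.
  String gatk_jar = "/usr/local/bin/GenomeAnalysisTK.jar"

  command {
    # Assumes the GATK 3 jar accepts --version and prints the version to stdout.
    java -jar ${gatk_jar} --version
  }

  output {
    # The captured stdout becomes the task output.
    String version = read_string(stdout())
  }

  runtime {
    # The container is chosen here, so swapping images means changing this value.
    docker: docker_image
  }
}

# Hypothetical one-step workflow wrapping the task above.
workflow check_gatk_version {
  call gatk_version
}
```

Running a small check like this locally with Cromwell first, as suggested above, is a cheap way to confirm the image and jar path before writing the longer MuTect WDL scripts.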
