task config name mangling in monitor tab

gordon123 Broad Member, Broadie

When I launch a task config, its name shows up in the monitor tab. If I later modify the task config (e.g., update the WDL snapshot or edit the fields), all previously launched task configs now show up with a string of random letters appended.

While it may make sense to be able to distinguish between versions of the task configs that have been launched, it would be better to use a more human-readable name. One suggestion would be "_Vx.y", applied to all config launches rather than only after a change, where x = the WDL snapshot and y = the sequential version number of the edit to the task config fields.
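
To make the "_Vx.y" suggestion concrete, here is a minimal Python sketch of how such a suffix could be built and parsed back. The function names and the regular expression are hypothetical illustrations, not anything FireCloud provides.

```python
import re

# Hypothetical pattern for the proposed suffix: <base>_V<snapshot>.<edit>
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)_V(?P<snapshot>\d+)\.(?P<edit>\d+)$")


def versioned_name(base, snapshot, edit):
    """Build a display name with the proposed human-readable suffix."""
    return f"{base}_V{snapshot}.{edit}"


def split_versioned_name(name):
    """Return (base, snapshot, edit); snapshot and edit are None without a suffix."""
    match = _VERSION_SUFFIX.match(name)
    if match is None:
        return name, None, None
    return match["base"], int(match["snapshot"]), int(match["edit"])


display = versioned_name("CopyNumber_Gistic2", 3, 1)
parsed = split_versioned_name(display)
```

Because the suffix is applied to every launch, the base name can always be recovered by parsing, which is what lets all launches of one config be grouped together.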

Answers

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    Do you have any screenshots of the string of random letters being appended to the task configs? I will put in a feature request for you on this matter.

  • gordon123 Broad Member, Broadie

    In the attached screenshot, these show up both with and without the added characters.

    broadgdac/preprocess_methylation__HM450
    broadgdac/CopyNumber_Gistic2

  • dheiman Member, Broadie ✭✭

    Another suggestion would be to simply add a version/snapshot column. Even better might be to show only the latest instance, with a history tab or similar available from within the workflow monitor view.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi all, FYI the product team is currently working on a pretty extensive redesign of how we organize and display method configs. One of the key features we want from the redesign is the ability to group snapshots into a single entry with an option to select a specific one (e.g., through a dropdown menu), as well as the ability to distinguish snapshots from "blessed" versions. It'll be a while until we're able to deliver those changes, but they should address the core issues with method-config-related UX.

  • mnoble Broad Institute of MIT & Harvard Member, Broadie

    I'm late to this party, but would like to emphasize that the impact here is not limited to human readability in the UI. It also impacts our ability to collect runtime pass/fail/wait stats and display them correctly in green/red/yellow dashboards that summarize our production runs (e.g. as given here).

    For example, consider running v2 of some_tool on 35 cohorts in a space, and assume it succeeds on 30 of them but fails for 5; so we then fix the cause of the failures in the code (or WDL, etc) and install v3. At this point, following the Principle of Least Surprise, we want 2 things:

    a) to NOT re-run some_tool for the 30 cohorts that succeeded, but rather only for the 5 which previously failed
    b) while having the summary dashboard show success (green) for all 35 cohorts

    Part (b) here is effectively impossible due to name-mangling: because the name of the v2 version is mangled, when we collect stats for some_tool across all 35 cohorts then 30 of them will report yellow (the "wait" state, meaning "it was not run") and only 5 will report green (the "success" state, meaning it successfully completed).

    However, I think our dashboard problem should not be hard to solve within the internals of FireCloud, because the list submissions API should in principle be returning records from some table in which the outcome of each submission is recorded. Once records are written to that outcomes table, why would they need to change after a new method is installed to a workspace? Shouldn't those run-outcome records be considered read-only after they're created, at least as long as the space exists? So, if records in this table are keyed by method_config_name and carry a separate method_config_version field, then in principle a call like

    list_submissions(method_name="some_tool")
    

    should be able to return 35 records from this internal results table, all of which (a) have the same name and (b) ran successfully to completion; but 30 of which have method_config_version=2 and 5 of which have method_config_version=3 ... and then the user (on the client end) can decide what they want to do with such results.
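
The client-side roll-up described above can be sketched in Python as follows. This assumes a hypothetical record layout (`method_config_name`, `method_config_version`, `cohort`, `status`) and hypothetical status strings; it is not the actual FireCloud response format. The idea is to keep, per cohort, the record with the highest version and colour the dashboard from that.

```python
from collections import Counter

# Dashboard colours from the thread: green = success, red = failure,
# yellow = "wait" (not run). The status strings are assumed, not FireCloud's.
COLOUR = {"Succeeded": "green", "Failed": "red"}


def latest_colour_per_cohort(records, method_config_name, cohorts):
    """Colour each cohort by its highest-version run of one method config."""
    latest = {}  # cohort -> record with the highest method_config_version
    for rec in records:
        if rec["method_config_name"] != method_config_name:
            continue
        seen = latest.get(rec["cohort"])
        if seen is None or rec["method_config_version"] > seen["method_config_version"]:
            latest[rec["cohort"]] = rec
    colours = {}
    for cohort in cohorts:
        rec = latest.get(cohort)
        # Cohorts with no run, or an unrecognised status, are shown as yellow.
        colours[cohort] = COLOUR.get(rec["status"], "yellow") if rec else "yellow"
    return colours


cohorts = [f"cohort_{i:02d}" for i in range(35)]
records = []
# v2 succeeds on the first 30 cohorts and fails on the last 5.
for i, cohort in enumerate(cohorts):
    records.append({"method_config_name": "some_tool", "method_config_version": 2,
                    "cohort": cohort, "status": "Succeeded" if i < 30 else "Failed"})
# v3 is run only on the 5 cohorts that failed, and succeeds.
for cohort in cohorts[30:]:
    records.append({"method_config_name": "some_tool", "method_config_version": 3,
                    "cohort": cohort, "status": "Succeeded"})

summary = Counter(latest_colour_per_cohort(records, "some_tool", cohorts).values())
```

This only works if every record shares the same unmangled name; with mangled v2 names the name filter would drop the 30 earlier successes, which is the failure mode described above.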

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    We did implement a big Methods overhaul in the UI back in August of last year. I don't see any name mangling of the kind originally described in this thread (some_tool with a random series of numbers and letters appended) in the UI itself, and unfortunately I don't use the API side of the platform enough to know whether it still happens underneath the UI.

    I will certainly check with a developer to see if it is still done that way. However, it seems like your primary request is for the Monitor page to display green (success) for the 30 cohorts that were previously run and succeeded. Unfortunately, in the situation you outlined, you wouldn't be able to call-cache the 30 previous results if you'd altered the command section of your Method. So you'd have to run on just the 5 cohorts that failed, meaning that 30 green and 5 red would be in the first run, and just 5 green would be in the second run, with no indication of status on the original 30.

    Through the API, we do have a "list submissions" option, but it only allows you to query by workspace, not by method tool name. Would implementing an API that allowed you to query by workspace and method tool name satisfy your request?
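
Until such a query exists, the same effect can be approximated on the client by filtering the per-workspace list. The Python sketch below assumes each submission record carries `methodConfigurationNamespace` and `methodConfigurationName` fields; those field names and the `status` value are assumptions for illustration, not confirmed by this thread.

```python
def submissions_for_method(submissions, namespace, name):
    """Keep only the submissions launched from one method config (exact match)."""
    return [
        sub for sub in submissions
        if sub.get("methodConfigurationNamespace") == namespace
        and sub.get("methodConfigurationName") == name
    ]


# Stand-in for the result of a per-workspace "list submissions" call.
workspace_submissions = [
    {"methodConfigurationNamespace": "broadgdac",
     "methodConfigurationName": "CopyNumber_Gistic2", "status": "Done"},
    {"methodConfigurationNamespace": "broadgdac",
     "methodConfigurationName": "preprocess_methylation__HM450", "status": "Done"},
]

matches = submissions_for_method(workspace_submissions, "broadgdac", "CopyNumber_Gistic2")
```

An exact match like this will miss any submission whose config name has been mangled, so it only helps if the stored name stays stable across config edits.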
