
Custom Walker That Calls Other Walkers?

bbimber (Home; Member)

Hello,

We're trying to make a tool that takes the output of several iterations of VariantEval (each stratified differently), and then uses these data to make another report. It would be very convenient if our Walker could create/call 2 instances of the VariantEval walker internally. Something like:

public class VariantQC extends RodWalker<Integer, Integer> implements TreeReducible<Integer> {
    private VariantEval sampleStratifiedWalker;
    private VariantEval locusStratifiedWalker;

    @Override
    public void initialize() {
        super.initialize();

        // Configure the child walkers. These would need a fixed set of arguments, derived from this tool.
        sampleStratifiedWalker = new VariantEval();
        sampleStratifiedWalker.initialize();

        locusStratifiedWalker = new VariantEval();
        locusStratifiedWalker.initialize();
    }

    @Override
    public Integer map(RefMetaDataTracker tracker, ReferenceContext ref, AlignmentContext context) {
        // Call the child walkers in step with this one.
        sampleStratifiedWalker.map(tracker, ref, context);
        locusStratifiedWalker.map(tracker, ref, context);

        return null;
    }

    // reduceInit(), reduce(), and treeReduce() are omitted from this sketch.
}

I would need to configure the 2 VariantEval instances with appropriate arguments/settings. For example, many of the settings for each VariantEval instance would be static; however, some settings might be a pass-through from the arguments passed to the VariantQC walker. So far as I can tell, arguments are set on walkers through CommandLineExecutable.loadArgumentsIntoObject(). It seems like with some work, perhaps I could use something in there to manually set arguments on my 2 VariantEval instances. The questions I have are:

1) Do any other tools take this kind of approach, i.e. having a Walker that calls other walkers?
2) Is there a better way to configure these child walkers?

Thanks in advance for any help,
Ben
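
To make the "manually set arguments on a child instance" idea above concrete, here is a minimal, self-contained sketch of one way it could be done with plain Java reflection. It does not use GATK classes and does not reproduce what CommandLineExecutable.loadArgumentsIntoObject() actually does. The `@Arg` annotation, `ChildTool`, and `ChildArgs.apply` are hypothetical names invented for this illustration; in a real walker the annotated fields would belong to VariantEval.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class ChildArgumentDemo {
    public static void main(String[] args) {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("stratifyBy", "Sample"); // static setting chosen by the parent tool
        settings.put("minQuality", 30);       // pass-through from the parent's own arguments

        ChildTool child = new ChildTool();
        ChildArgs.apply(child, settings);
        System.out.println(child.describe()); // prints "Sample:30"
    }
}

/** Hypothetical stand-in for an argument annotation; this is NOT GATK's @Argument. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Arg {
    String name();
}

/** Hypothetical child tool whose settings live in annotated private fields. */
class ChildTool {
    @Arg(name = "stratifyBy") private String stratifyBy = "None";
    @Arg(name = "minQuality") private int minQuality = 0;

    String describe() {
        return stratifyBy + ":" + minQuality;
    }
}

final class ChildArgs {
    private ChildArgs() {}

    /** Sets every @Arg field on target whose name is a key in values; returns how many were set. */
    static int apply(Object target, Map<String, Object> values) {
        int applied = 0;
        // Walk up the class hierarchy so inherited annotated fields are covered too.
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                Arg arg = f.getAnnotation(Arg.class);
                if (arg == null || !values.containsKey(arg.name())) {
                    continue;
                }
                try {
                    f.setAccessible(true); // fields are private in this sketch
                    f.set(target, values.get(arg.name()));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Could not set " + arg.name(), e);
                }
                applied++;
            }
        }
        return applied;
    }
}
```

Keys with no matching field are silently ignored here; a stricter version might reject them so that typos in argument names do not go unnoticed.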

Issue · Github
by Sheila

Issue Number: 2034
State: closed
Closed By: knoblett

Answers

  • KateN (Cambridge, MA; Member, Broadie, Moderator)
    edited May 2017

    Have you used/heard of WDL? WDL is a pipelining language that allows you to call multiple tools in order or at once (in parallel). I think that may be able to solve your problem, so long as you are just building wrappers around the existing VariantEval tool (and not modifying the behavior of the tool itself).

    For example, let's say you want to run VariantEval with parameters xyz, and then again with abc. Then you want to run your custom tool, which generates a report based on the output of the two VariantEval calls. If this sounds like your case, then WDL would be your best bet. If it doesn't, could you give me a bit more information on what you want to do?

  • bbimber (Home; Member)

    Yeah, I've heard of WDL, but that would actually initiate n readers that happen to run in parallel, instead of reading the VCF once, right?

    Our specific case is making a tool that gathers VariantEval-based data on a VCF (stratified several ways) and uses these results to make a combined report. The advantage of doing it as a walker is that I have direct access to the EvaluationContext objects, rather than running VariantEval and then reading the reports. So if something like this walker-calling-walkers approach worked, it would be a pretty clean solution. There may be pitfalls I'm not seeing yet, though.
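
The single-read point above can be illustrated with a small, self-contained sketch: one loop over the records feeds several differently stratified evaluators, and their results stay available in memory afterwards instead of being parsed back from report files. This is not GATK code. `Variant`, `StratifiedCounter`, and `SinglePass` are hypothetical stand-ins for a VCF record, a VariantEval-like child, and the parent walker's traversal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class SinglePassDemo {
    public static void main(String[] args) {
        List<Variant> records = Arrays.asList(
                new Variant("chr1", "sampleA"),
                new Variant("chr1", "sampleB"),
                new Variant("chr2", "sampleA"));

        StratifiedCounter bySample = new StratifiedCounter(v -> v.sample);
        StratifiedCounter byContig = new StratifiedCounter(v -> v.contig);
        int seen = SinglePass.run(records, Arrays.asList(bySample, byContig));

        System.out.println(seen + " records read once");          // 3 records read once
        System.out.println("by sample: " + bySample.counts());    // {sampleA=2, sampleB=1}
        System.out.println("by contig: " + byContig.counts());    // {chr1=2, chr2=1}
    }
}

/** Hypothetical minimal record; a real VCF record carries far more. */
final class Variant {
    final String contig;
    final String sample;

    Variant(String contig, String sample) {
        this.contig = contig;
        this.sample = sample;
    }
}

/** Hypothetical child evaluator: counts records per stratum key. */
final class StratifiedCounter {
    private final Function<Variant, String> key;
    private final Map<String, Integer> counts = new TreeMap<>();

    StratifiedCounter(Function<Variant, String> key) {
        this.key = key;
    }

    void accept(Variant v) {
        counts.merge(key.apply(v), 1, Integer::sum);
    }

    /** Results remain directly accessible to the parent after the pass. */
    Map<String, Integer> counts() {
        return counts;
    }
}

final class SinglePass {
    private SinglePass() {}

    /** Iterates the records exactly once, handing each one to every child in turn. */
    static int run(Iterable<Variant> records, List<StratifiedCounter> children) {
        int seen = 0;
        for (Variant v : records) {
            for (StratifiedCounter child : children) {
                child.accept(v);
            }
            seen++;
        }
        return seen;
    }
}
```

The trade-off against a pipeline of separate VariantEval runs is that the children share one traversal and one process, so they cannot be scheduled independently.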

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi @bbimber, I see what you're trying to do but we don't have any "walkers that call walkers" as such, if I recall correctly.

    @droazen and the GATK4 engine team might have advice on how to engineer something like that in the GATK4 framework (and what is the plan for VariantEval-like functionality there), but be aware they are extremely busy finalizing the upcoming GATK4 beta release and may not have the bandwidth to discuss this in detail.

  • bbimber (Home; Member)

    This does appear to work. Setting arguments on the child walkers is a little hacky (I'm living with it until migrating to GATK4). I'm happy to share, if anyone is interested. The reason we're not going directly to GATK4 is that VariantEval isn't migrated yet.

    I realize this isn't quite how GATK expects walkers to work, but it is quite useful for this project.
