Bug Bulletin: The recent 3.2 release fixes many issues. If you run into a problem, please try the latest version before posting a bug report, as your problem may already have been solved.



Recently I've been wanting to preform tasks for each interval in a file using the GATK. Are you guys planning to create an IntervalWalker class? (Or is there a workaround?) DiagnoseTargets in gatk-protected seem to work this way, but not in a very straightforward way - also since it's protected I'm not sure how much if any code I could borrow from there.

cheers Daniel

Best Answer


  • ebanksebanks Posts: 678GATK Developer mod

    Hi Daniel, Walkers do have the ability to "reduce by interval". DepthOfCoverage is an example of a walker that does this in certain situations. Does this meet your needs?

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • dklevebringdklevebring Posts: 57Member

    It seems that it can to what I want. I see that DoC uses onTraversalDone() which seems powerful. The comment in the source code is good:

    (pasted here for reference to other readers)

     * Return true if your walker wants to reduce each interval separately.  Default is false.
     * If you set this flag, several things will happen.
     * The system will invoke reduceInit() once for each interval being processed, starting a fresh reduce
     * Reduce will accumulate normally at each map unit in the interval
     * However, onTraversalDone(reduce) will be called after each interval is processed.
     * The system will call onTraversalDone( GenomeLoc -> reduce ), after all reductions are done,
     *   which is overloaded here to call onTraversalDone(reduce) for each location
     * @return true if your walker wants to reduce each interval separately.
    public boolean isReduceByInterval() {
        return true; //changes default behavior.

    One question: Can I get to the interval data in onTraversalDone, for example chromosome and coordinate or do I have to pass those along between the map() calls? I will be looking for info like the reference sequence of the interval too - can that be done using built-in methods or would I have to "build" them from several reduce steps myself?

    thanks a lot!

Sign In or Register to comment.