The current GATK version is 3.2-2

IntervalWalker

Posts: 64, Member

Hi,

Recently I've been wanting to perform tasks for each interval in a file using the GATK. Are you guys planning to create an IntervalWalker class? (Or is there a workaround?) DiagnoseTargets in gatk-protected seems to work this way, but not in a very straightforward way. Also, since it's protected, I'm not sure how much code, if any, I could borrow from there.

cheers Daniel

• Posts: 679, GATK Developer mod

Hi Daniel, Walkers do have the ability to "reduce by interval". DepthOfCoverage is an example of a walker that does this in certain situations. Does this meet your needs?

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

• Posts: 64, Member

It seems that it can do what I want. I see that DoC uses onTraversalDone(), which seems powerful. The comment in the source code is good:

(pasted here for the reference of other readers)

/**
 * Return true if your walker wants to reduce each interval separately.  Default is false.
 *
 * If you set this flag, several things will happen.
 *
 * The system will invoke reduceInit() once for each interval being processed, starting a fresh reduce
 * Reduce will accumulate normally at each map unit in the interval
 * However, onTraversalDone(reduce) will be called after each interval is processed.
 * The system will call onTraversalDone( GenomeLoc -> reduce ), after all reductions are done,
 *   which is overloaded here to call onTraversalDone(reduce) for each location
 *
 * @return true if your walker wants to reduce each interval separately.
 */
public boolean isReduceByInterval() {
    return true; // changes default behavior
}
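
For other readers, here is a minimal, self-contained sketch of the control flow that comment describes: a fresh reduceInit() for each interval, reduce accumulating over every map unit inside it, and the results handed back keyed by interval. It does not use any GATK classes. Interval, ToyWalker, traverse and countLoci are hypothetical names made up for illustration, so treat this as a toy model of the idea rather than the engine's actual implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of reduce-by-interval; none of these types are GATK classes. */
public class ReduceByIntervalSketch {

    /** Hypothetical stand-in for a genomic interval (inclusive coordinates). */
    public static final class Interval {
        final String contig;
        final int start;
        final int stop;

        public Interval(String contig, int start, int stop) {
            this.contig = contig;
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return contig + ":" + start + "-" + stop;
        }
    }

    /** Hypothetical, stripped-down walker contract (map/reduce only). */
    public interface ToyWalker<M, R> {
        R reduceInit();
        M map(String contig, int position);
        R reduce(M value, R sum);
    }

    /** Starts a fresh reduce for each interval and returns interval -> result. */
    public static <M, R> Map<String, R> traverse(ToyWalker<M, R> walker,
                                                 List<Interval> intervals) {
        Map<String, R> results = new LinkedHashMap<String, R>();
        for (Interval iv : intervals) {
            R sum = walker.reduceInit();              // reset per interval
            for (int pos = iv.start; pos <= iv.stop; pos++) {
                sum = walker.reduce(walker.map(iv.contig, pos), sum);
            }
            results.put(iv.toString(), sum);          // result stays paired with its interval
        }
        return results;
    }

    /** Example walker: counts the loci visited in each interval. */
    public static Map<String, Integer> countLoci(List<Interval> intervals) {
        ToyWalker<Integer, Integer> walker = new ToyWalker<Integer, Integer>() {
            public Integer reduceInit() { return 0; }
            public Integer map(String contig, int position) { return 1; }
            public Integer reduce(Integer value, Integer sum) { return sum + value; }
        };
        return traverse(walker, intervals);
    }

    public static void main(String[] args) {
        List<Interval> intervals = new ArrayList<Interval>();
        intervals.add(new Interval("chr1", 100, 104));
        intervals.add(new Interval("chr2", 10, 11));
        for (Map.Entry<String, Integer> e : countLoci(intervals).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In this toy version each result comes back paired with the interval it was computed for, which mirrors the "GenomeLoc -> reduce" pairing mentioned in the comment above.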


One question: Can I get at the interval data in onTraversalDone(), for example the chromosome and coordinates, or do I have to pass those along between the map() calls? I will also be looking for info like the reference sequence of the interval. Can that be done using built-in methods, or would I have to "build" it from several reduce steps myself?

thanks a lot!

• Posts: 64, Member

Yeah, that works! Thanks.