Writing custom tools: htsjdk vs picard vs GATK
Hi fellow htsjdk/picard/gatk developers!
I've been thinking about this for quite some time now, so I thought I should write up a quick post about it here.
I've been writing custom tools for our group using both picard and GATK for a while. It's been working nicely, but I have been missing a set of basic tutorials and examples that would let users quickly get started writing walkers. My most commonly used reference has been the 20-line life savers (http://www.slideshare.net/danbolser/20line-lifesavers-coding-simple-solutions-in-the-gatk), which are getting a bit dated.
What I would like to see is something like the following:
- What's in htsjdk? What's not in htsjdk? (from a dev's perspective - in terms of frameworks)
- What's in picard? What's not in picard? (from a dev's perspective - in terms of frameworks)
- What's in gatk? What's not in gatk? (from a dev's perspective - in terms of frameworks)
- When to use htsjdk, picard, or GATK? What are the strengths and weaknesses of the three? (possibly more that I've missed)
- Your first htsjdk walker
- Your first picard walker
- Your first gatk walker
- Traversing a BAM in htsjdk vs gatk - what are the differences
There might be more stuff that could go in here as well. The driving force behind this is that I'm a bit confused myself by the overlap between these three packages/frameworks. I do understand that picard uses htsjdk, and that GATK uses both as dependencies, but it's not super clear what extra functionality (for a developer) is added at each step from htsjdk -> picard -> gatk.
Could we assemble a small group of interested developers to contribute to this? We could set up a git repo with the examples and tutorials for easy collaboration and sharing online.
Anyone interested? I'll count myself as the first member.