Non-model organism sub-forum?

I was wondering if perhaps there should be a sub-forum or some other such area dedicated to work on non-model organisms?

Those of us working on non-model organisms (or even just non-human/mouse) face quite a few issues that folks working on humans don't. There are also a lot of (sophisticated) workflows, optimizations, and even terms that are used with human data but just don't apply to other organisms (except maybe mouse models).
It really boils down to: "Do you have a very good reference sequence and lots of annotations, or not?"

Some of the topics would include:

  • Pretty much everything when your population has a lot of variation (relative to humans at least)
  • Variant calling with different ploidy (thankfully HC can now do this)
  • Reference-free methods
  • What I'll call 'reference-lite' methods, where the reference is a guide but not really trusted
  • Detecting and dealing with reference errors (gaps, chimeras, misassembly, etc.)
  • Detecting big structural variants (eventually GenomeSTRiP should help)

Anyways, just an idea. I'm curious if there is enough interest in expanding the GATK toolset and best-practices to be more useful for non-model organisms.



  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    This is actually something we've been thinking about setting up for a while. There's clearly a lot of interest from the non-human community (so to speak). We're just not sure what would be the best way for us to catalyze this, as it doesn't completely fit our current support model and would need to be more community-driven. We're more than happy to provide a space for it to happen, but we lack the expertise to propose or even vet the topics/solutions that will be discussed. I think what we really need (to make it useful and reliable as an informational resource) is to have a few motivated individuals who can act as domain experts to help nucleate and curate the material and guide discussions per organism (or class of organism).

    Volunteers welcome :)

  • ryanabashbash (Oak Ridge National Laboratory; Member)

    While it's not a substitute for a community forum, our group recently published the processing logic we use with our non-model organism data and the GATK (http://www.g3journal.org/content/early/2015/02/13/g3.115.017012.abstract). We regularly use it for an agricultural organism with good results, and it worked well when benchmarked with some Arabidopsis data. We're hopeful that it will encourage others to take advantage of the GATK's benefits.

  • allisonfp (UCLA; Member)

    Speaking of non-model organisms, I'm trying to figure something out and can't find a clear answer. If I am working on a species with no reference genome, is GATK totally unusable?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    @allisonfp I'm afraid so -- GATK requires a reference to operate. You could consider generating one through de novo assembly, but this is not trivial.
