
Recalibration with non-model organism

Kerensa, Canberra, Member

Hi there,

I regularly use GATK for non-model organism analysis, and have been thinking a bit about recalibration (so far I have not tried this).

There seem to be two steps in the workflow that benefit from it: base quality score recalibration (BQSR) and variant quality score recalibration (VQSR). There also appear to be two methods of generating 'true' variant sets for non-model organisms: (1) an iterative approach, where you run through GATK, take the initial variant set (based on hard filters) and use this for recalibration; and (2) manually curating a small section of the genome for high-quality variants.

I expect approach (1) would have good sensitivity, while approach (2) would have good specificity. Which is more important for the recalibration, sensitivity or specificity? Is the answer the same for both types of recalibration?

Thanks in advance for your insights,


Best Answer


  • ryanabashbash, Oak Ridge National Laboratory, Member


    Our group uses the GATK with plants in a fashion similar to the first method you mention. Similar to Geraldine's statement, we stringently hard filter whole-genome or reduced representation sequence data (preferably from samples other than those you are running BQSR and VQSR on) to make a set of high-specificity variants to use for truth/training in VQSR. Given the VQSR results, we take variants from a sensitive tranche and use them for BQSR of the reads (followed by another round of variant calling and VQSR). If you're interested, the details can be found in our paper here: http://www.g3journal.org/content/5/4/655.abstract .
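    To make the "stringently hard filter" step a bit more concrete, here is a minimal Python sketch of selecting a high-specificity training set from VCF records. It is only an illustration and not the pipeline from the paper: the function names and every threshold value are hypothetical, and the only things taken from GATK are the standard annotation names QD, FS and MQ. Records missing any of those annotations are rejected, on the assumption that for a truth/training set it is better to drop a good site than to keep a doubtful one.

```python
# Hypothetical cutoffs for illustration only; tune them for your organism and data.
MIN_QUAL = 100.0  # minimum value of the VCF QUAL column
MIN_QD = 10.0     # minimum QualByDepth
MAX_FS = 10.0     # maximum FisherStrand (strand bias)
MIN_MQ = 50.0     # minimum RMS mapping quality


def parse_info(info):
    """Turn a VCF INFO string such as 'QD=25.1;FS=0.5' into a dict of floats."""
    fields = {}
    for item in info.split(";"):
        if "=" not in item:
            continue  # flag-style entries carry no value
        key, value = item.split("=", 1)
        try:
            fields[key] = float(value)
        except ValueError:
            pass  # skip non-numeric or multi-valued annotations
    return fields


def passes_stringent_filter(vcf_line):
    """Return True only if the record clears every cutoff."""
    cols = vcf_line.rstrip("\n").split("\t")
    qual, info = cols[5], parse_info(cols[7])
    if qual == "." or float(qual) < MIN_QUAL:
        return False
    # A missing annotation counts as a failure: favour specificity.
    if not all(key in info for key in ("QD", "FS", "MQ")):
        return False
    return info["QD"] >= MIN_QD and info["FS"] <= MAX_FS and info["MQ"] >= MIN_MQ


def select_training_variants(vcf_lines):
    """Keep the data lines (not '#' headers) that pass the stringent filter."""
    return [line for line in vcf_lines
            if not line.startswith("#") and passes_stringent_filter(line)]


example = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\tG\t500.0\t.\tQD=25.0;FS=1.2;MQ=60.0",
    "chr1\t200\t.\tC\tT\t40.0\t.\tQD=3.0;FS=30.0;MQ=35.0",
    "chr1\t300\t.\tG\tA\t900.0\t.\tQD=30.0;FS=0.0",
]
training = select_training_variants(example)
```

    In this toy input only the record at position 100 is kept: the one at 200 fails the cutoffs, and the one at 300 has no MQ annotation. The surviving records would then be the candidate truth/training set, with the remaining steps (VQSR, choosing a sensitive tranche, BQSR, re-calling) run in GATK itself.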
