What goes wrong if I don't provide known sites to IndelRealigner?

skblazer Member
edited August 2012 in Ask the GATK team

Hi,

I think providing known sites was optional in previous versions of RealignerTargetCreator and IndelRealigner, but I don't know how strictly the local realignment step of GATK 2.0 requires them. Will things go dramatically wrong if known sites are missing?

As you said below, "In the variant calling pipeline, the only tools that do not strictly require known sites are UnifiedGenotyper and HaplotypeCaller."
http://gatkforums.broadinstitute.org/discussion/1247/what-should-i-use-as-known-variantssites-for-running-tool-x#latest

Is there also a statistical model in the local realignment step that only works with known sites, as there is in quality recalibration?

Best,
SK

Best Answer

  • Mark_DePristo Broad Institute admin
    Accepted Answer

    My view is that, if you are working with human data and are going to use HaplotypeCaller for calling, you can simply realign your lane-level BAM files at the list of known sites and not worry about doing the full realignment with discovery of sites. If you don't have a good set of known indel sites, though, the full per-sample realignment is the only way to go. So I think your question is backward: the known sites are the primary input, and they shouldn't be left out unless you don't have a catalog of indels.
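
For readers who want to see the two routes side by side, here is a minimal sketch that only builds (does not run) the command lines, assuming GATK 2.x-style invocations of RealignerTargetCreator and IndelRealigner. The helper name `realignment_commands`, the file names, and the jar path are hypothetical, and the flags should be checked against the documentation of the GATK version you have installed.

```python
from typing import List, Sequence


def realignment_commands(
    reference: str,
    bam: str,
    out_bam: str,
    known_indels: Sequence[str] = (),
    intervals: str = "realign.intervals",
    gatk_jar: str = "GenomeAnalysisTK.jar",  # assumed jar location
) -> List[List[str]]:
    """Build the target-creation and realignment commands as argument lists."""
    base = ["java", "-jar", gatk_jar, "-R", reference, "-I", bam]

    # One -known argument per catalog of known indels (may be empty).
    known_args: List[str] = []
    for vcf in known_indels:
        known_args += ["-known", vcf]

    create_targets = (
        base + ["-T", "RealignerTargetCreator"] + known_args + ["-o", intervals]
    )
    realign = (
        base
        + ["-T", "IndelRealigner", "-targetIntervals", intervals]
        + known_args
        + ["-o", out_bam]
    )
    if known_indels:
        # Assumption: restrict realignment to the known indels only,
        # the cheaper route suggested for human data.
        realign += ["--consensusDeterminationModel", "KNOWNS_ONLY"]
    # With no known sites, no model is set and the tool's default
    # read-based discovery is left in place (full per-sample realignment).
    return [create_targets, realign]


if __name__ == "__main__":
    for cmd in realignment_commands(
        "ref.fasta", "lane1.bam", "lane1.realigned.bam", ["known_indels.vcf"]
    ):
        print(" ".join(cmd))
```

Calling the helper without `known_indels` yields the discovery-based variant, which corresponds to the case where no good indel catalog exists.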

Answers

  • skblazer Member

    Thanks, Mark.