A few questions on RealignerTargetCreator

praveenrajs Member
edited December 2012 in Ask the GATK team

I have the following questions about running the RealignerTargetCreator module in GATK 1.4.

1) Is it recommended to provide the target capture BED file to RealignerTargetCreator for targeted/exome experiments? Without the BED file, the tool takes a long time (~6-7 hours). What is the best approach here?

2) Does running Mark Duplicates before or after RealignerTargetCreator have any effect on the number of SNPs/indels called? Which order is recommended?

Look forward to your comments.
Raj

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi Raj,

    1) Yes, for targeted/exome data it is a good idea to provide an interval file, so that realignment is performed only on the targeted regions.

    2) Ideally you would run Mark Duplicates both before and after realignment, but in practice it is enough to mark duplicates once, before realignment. The effect on variant calls of not doing it a second time should be marginal and can be ignored.

    I would however recommend that you upgrade to a more recent version of GATK, preferably the latest. You will get much better results (particularly for indels) and less chance of running into bugs.
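
For illustration, here is a minimal sketch of how the interval file from point 1 could be passed to RealignerTargetCreator, written as a small Python helper that only builds the command line and does not run it. The file names (`reference.fasta`, `sample.dedup.bam`, `targets.bed`, `sample.realign.intervals`), the jar path, and the helper function itself are hypothetical. It also assumes that the GATK version in use accepts a BED file directly through `-L`; if it does not, the targets would need to be converted to a GATK-style interval list first. The input BAM is assumed to have had duplicates marked already, as suggested in point 2.

```python
import shlex


def realigner_target_creator_cmd(reference, bam, output, intervals=None,
                                 gatk_jar="GenomeAnalysisTK.jar"):
    """Build (but do not run) a RealignerTargetCreator command line.

    All paths are placeholders supplied by the caller. `gatk_jar` is an
    assumed location for the GATK jar, not a value from the thread.
    """
    cmd = [
        "java", "-jar", gatk_jar,
        "-T", "RealignerTargetCreator",  # walker to run
        "-R", reference,                 # reference FASTA
        "-I", bam,                       # input BAM (duplicates already marked)
        "-o", output,                    # realignment target intervals to write
    ]
    if intervals is not None:
        # Restrict processing to the capture targets (assumed BED support in -L).
        cmd += ["-L", intervals]
    return cmd


if __name__ == "__main__":
    cmd = realigner_target_creator_cmd(
        "reference.fasta",           # hypothetical file names
        "sample.dedup.bam",
        "sample.realign.intervals",
        intervals="targets.bed",
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Leaving `intervals` as `None` yields the whole-genome command, which corresponds to the slow 6-7 hour run described in the question.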
