A few questions on RealignerTargetCreator

praveenrajs Posts: 5 Member
December 2012 in Ask the team

I have the following questions about running the RealignerTargetCreator module in GATK 1.4.

1) Is it recommended to provide the target capture BED file to RealignerTargetCreator for targeted/exome experiments? Without the BED file, the tool takes a long time (~6-7 hrs). What is the optimal approach here?

2) Does running Mark Duplicates before or after RealignerTargetCreator have any effect on the number of SNPs/indels called? Which order is recommended?

Look forward to your comments. Raj

Answers

  • Geraldine_VdAuwera Posts: 5,263 Administrator, GSA Member admin

    Hi Raj,

    1) Yes, it is good to use an interval file to perform realignment on targeted/exome data.

    2) Ideally you would run Mark Duplicates both before and after realignment, but in practice it's enough to mark duplicates just once, before realignment. The effect on variant calls of not doing it a second time should be marginal and can be ignored.

    I would however recommend that you upgrade to a more recent version of GATK, preferably the latest. You will get much better results (particularly for indels) and less chance of running into bugs.

    Geraldine Van der Auwera, PhD
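
The order described in the answer (mark duplicates once, then create realignment targets restricted to the capture intervals, then realign) could be sketched roughly as follows. This is only an illustration that builds command lines, not an official pipeline: the helper name `realignment_commands`, all file names, and the jar locations are hypothetical, and it assumes the standalone Picard `MarkDuplicates.jar` and the GATK `-T`/`-R`/`-I`/`-L`/`-o` arguments of that era. Check the flags against the documentation for the versions you actually run.

```python
from typing import List, Optional


def realignment_commands(
    reference: str,
    bam: str,
    targets: Optional[str] = None,
    gatk_jar: str = "GenomeAnalysisTK.jar",   # assumed jar location
    picard_jar: str = "MarkDuplicates.jar",   # assumed jar location
) -> List[List[str]]:
    """Return the three commands in the order suggested in the answer."""
    # Hypothetical intermediate file names, chosen for illustration only.
    dedup_bam = "dedup.bam"
    intervals = "realign.intervals"

    # Step 1: mark duplicates once, before realignment.
    mark_dups = [
        "java", "-jar", picard_jar,
        f"INPUT={bam}", f"OUTPUT={dedup_bam}", "METRICS_FILE=dedup.metrics",
    ]

    # Step 2: find regions that need realignment.
    create_targets = [
        "java", "-jar", gatk_jar, "-T", "RealignerTargetCreator",
        "-R", reference, "-I", dedup_bam, "-o", intervals,
    ]
    if targets is not None:
        # Restrict the search to the capture regions (e.g. the exome BED file).
        create_targets += ["-L", targets]

    # Step 3: realign reads around the regions found in step 2.
    realign = [
        "java", "-jar", gatk_jar, "-T", "IndelRealigner",
        "-R", reference, "-I", dedup_bam,
        "-targetIntervals", intervals, "-o", "realigned.bam",
    ]
    return [mark_dups, create_targets, realign]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in turn. Leaving `targets` as `None` reproduces the whole-genome behaviour described in the question, which is the slow case.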