Best approach for RealignerTargetCreator and IndelRealigner

steve1980 Posts: 1 Member

Hi,

I am trying to decide between two approaches for performing realignment around indels. I have ~600 samples that have been aligned to a very fragmented draft genome assembly. Which is best?
1. Take each sample separately, create a list of targets for it, and then realign that sample.
2. Combine all samples into one large BAM file, create a single list of targets, and then realign that same large BAM file.

Also, would either approach have an advantage in terms of speed?

Cheers,

Steve

Answers

  • Geraldine_VdAuwera Posts: 2,239 Administrator, GSA Official Member

    Hi Steve,

    You don't actually need to combine the samples into a single BAM file to process them together; you can pass them all as inputs using a list file.

    Realigning them all together is best because the realignment will then be consistent across all samples. That said, it can be a lengthy process with that many samples, so if performance is an issue you can run the target creation on the full list and then realign in batches. As long as every batch uses the same target intervals file, that is completely fine. You can also look into multithreading to speed things up.

    Geraldine Van der Auwera, PhD
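
For readers who want to script the workflow described in the answer above, here is a minimal Python sketch that only builds the command lines; it does not run anything. It assumes GATK 3.x-style arguments (`-T`, `-R`, `-I`, `-o`, `-targetIntervals`, `-nWayOut`, `-nt`) as I recall them, so please check them against the documentation for your GATK version. The jar path, reference name, intervals file name, sample names, and batch size are all hypothetical placeholders, and the helper functions are illustrative, not part of GATK.

```python
from typing import List

# Hypothetical paths and names; replace with your own.
GATK = ["java", "-jar", "GenomeAnalysisTK.jar"]
REFERENCE = "draft_assembly.fasta"
INTERVALS = "all_samples.intervals"  # shared by every realignment batch


def batches(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def target_creator_cmd(bam_list_file: str, threads: int = 1) -> List[str]:
    """Build one RealignerTargetCreator command covering ALL samples.

    `bam_list_file` is a text file listing one BAM path per line
    (assumed here to be accepted by -I when it ends in .list).
    """
    cmd = GATK + ["-T", "RealignerTargetCreator",
                  "-R", REFERENCE,
                  "-I", bam_list_file,
                  "-o", INTERVALS]
    if threads > 1:
        # Assumption: -nt is supported by RealignerTargetCreator.
        cmd += ["-nt", str(threads)]
    return cmd


def realigner_cmds(bams: List[str], batch_size: int) -> List[List[str]]:
    """Build one IndelRealigner command per batch, all using INTERVALS."""
    cmds = []
    for batch in batches(bams, batch_size):
        cmd = GATK + ["-T", "IndelRealigner",
                      "-R", REFERENCE,
                      "-targetIntervals", INTERVALS,
                      # Assumption: -nWayOut writes one output per input BAM,
                      # named with the given suffix.
                      "-nWayOut", ".realigned.bam"]
        for bam in batch:
            cmd += ["-I", bam]
        cmds.append(cmd)
    return cmds


if __name__ == "__main__":
    sample_bams = [f"sample_{i:03d}.bam" for i in range(600)]
    print(" ".join(target_creator_cmd("all_samples.list", threads=4)))
    for cmd in realigner_cmds(sample_bams, batch_size=50):
        print(" ".join(cmd[:12]), "...")
```

The point of the design is that the intervals file is produced once from the full sample list and then reused unchanged by every batch, which is the condition the answer gives for batching being safe. The resulting argument lists could be handed to `subprocess.run` or written out to a job scheduler script.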