
Reduce running time for tool RealignerTargetCreator

ywy25 Member
edited June 2014 in Ask the GATK team

Dear GATK team,

I'm using GATK RealignerTargetCreator and IndelRealigner on a very small region (~100 bp) trimmed from the original whole-exome BAM file.
For example, the region I need is chr1:1-150, which contains only one realignment target.
I first used samtools to extract the BAM for this region.
I then ran into a problem when running RealignerTargetCreator with only chr1 as the reference file (with a matching chr01.dict). This is the command I used:

java -Xmx1g -jar ~/programs/GenomeAnalysisTK.jar -T RealignerTargetCreator -R chr01.fa -I Trim_test_sort.bam -o realigner.intervals

Here is the error message:
ERROR MESSAGE: Badly formed genome loc: Contig chr2 given as location, but this contig isn't present in the Fasta sequence dictionary

I found that this problem can be solved by using the whole genome as the reference (i.e. hg19.fa). However, it then takes a very long time to go through every chromosome (the ProgressMeter step), even though the BAM file only contains reads located in chr1:1-150.
I also tried deleting some @SQ lines from the trimmed BAM header, but that didn't work.

Is there any way to make RealignerTargetCreator go through only chr1:1-150 (or just chr1) to save time?

Many thanks!!

