Reduce running time for tool RealignerTargetCreator

ywy25 Member
edited June 2014 in Ask the GATK team

Dear GATK team,

I'm using GATK RealignerTargetCreator and IndelRealigner on a very small region (~100 bp) that was trimmed from the original whole-exome BAM file.
For example, the region I need is chr1:1-150 (it contains only one realignment target).
I first used samtools to extract the reads for this region into a new BAM file.
Then I ran into a problem when running RealignerTargetCreator with only chr1 as the reference (chr01.fa, with its chr01.dict). Here is the command I used:

java -Xmx1g -jar ~/programs/GenomeAnalysisTK.jar -T RealignerTargetCreator -R chr01.fa -I Trim_test_sort.bam -o realigner.intervals

Here is the error message:
ERROR MESSAGE: Badly formed genome loc: Contig chr2 given as location, but this contig isn't present in the Fasta sequence dictionary
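
The error suggests that the trimmed BAM header still lists contigs (such as chr2) that are absent from chr01.dict. Below is a minimal sketch for checking this. It assumes the BAM header has been dumped to a text file, for example with `samtools view -H Trim_test_sort.bam > header.txt`. The function name `missing_contigs` and the file name `header.txt` are hypothetical and used only for illustration.

```shell
# List contigs named in the BAM header but missing from the reference dictionary.
# Both inputs are plain-text files containing @SQ lines with SN:<name> fields.
missing_contigs() {
    header_file=$1   # assumed: text dump of the BAM header
    dict_file=$2     # assumed: the reference .dict file, e.g. chr01.dict
    awk -F'\t' '
        # Return the value of the SN: field on the current line.
        function sn(   i) {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) return substr($i, 4)
            return ""
        }
        # First file (dictionary): remember every contig name.
        $1 == "@SQ" && NR == FNR { ref[sn()] = 1; next }
        # Second file (BAM header): report contigs not seen in the dictionary.
        $1 == "@SQ" && !(sn() in ref) { print sn() }
    ' "$dict_file" "$header_file"
}
```

Called as `missing_contigs header.txt chr01.dict`, it prints one contig name per line. Any name it prints is a contig that the header declares but the reference does not.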

I found that this problem can be solved by using the whole genome as the reference (i.e. hg19.fa). However, it then takes a very long time to go through every chromosome (as shown by the ProgressMeter output), although the BAM file only contains reads located in chr1:1-150.
I also tried deleting some @SQ lines from the trimmed BAM header, but that didn't work.

Is there any way to make RealignerTargetCreator go through only chr1:1-150 (or just chr1) to save time?

Many thanks!!

Shan
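
One option that may address this, offered as a sketch and not as a confirmed answer from the GATK team, is the engine-level `-L` (intervals) argument. It restricts traversal to the given region while keeping the full hg19.fa reference, so the BAM header and the reference dictionary still agree. The helper name `realigner_target_cmd` and the `GATK_JAR` and `REFERENCE` variables are assumptions made for illustration. The function only prints the command so it can be inspected before running.

```shell
# Sketch only: prints the command instead of running it.
# GATK_JAR and REFERENCE are placeholders; adjust them to your setup.
GATK_JAR=${GATK_JAR:-$HOME/programs/GenomeAnalysisTK.jar}
REFERENCE=${REFERENCE:-hg19.fa}

realigner_target_cmd() {
    region=$1   # interval to process, e.g. chr1:1-150
    bam=$2      # input BAM file
    out=$3      # output intervals file
    printf 'java -Xmx1g -jar %s -T RealignerTargetCreator -R %s -I %s -L %s -o %s\n' \
        "$GATK_JAR" "$REFERENCE" "$bam" "$region" "$out"
}

realigner_target_cmd chr1:1-150 Trim_test_sort.bam realigner.intervals
```

If the printed command looks right, it can be run directly or through `eval`. The same `-L` value would presumably also be passed to the subsequent IndelRealigner step.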
