Reduce running time for tool RealignerTargetCreator
Dear GATK team,
I'm using GATK RealignerTargetCreator and IndelRealigner on a very small region (~100 bp) that was extracted from the original whole-exome BAM file.
For example, the region I need is chr1:1-150, which contains only one realignment target.
I first used samtools to extract the reads in this region into a new BAM file.
I then ran into a problem when running RealignerTargetCreator with only chr1 as the reference (chr01.fa, with its chr01.dict). Here is the command I used:
java -Xmx1g -jar ~/programs/GenomeAnalysisTK.jar -T RealignerTargetCreator -R chr01.fa -I Trim_test_sort.bam -o realigner.intervals
Here is the error message:
ERROR MESSAGE: Badly formed genome loc: Contig chr2 given as location, but this contig isn't present in the Fasta sequence dictionary
I found that this problem can be solved by using the whole genome as the reference (i.e. hg19.fa). However, the tool then takes a very long time to traverse every chromosome (the ProgressMeter step), even though the BAM file only contains reads located in chr1:1-150.
I also tried to delete some @SQ lines from the trimmed BAM header, but it didn't work.
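For illustration, a header edit of this kind might be sketched as below. This is a minimal sketch and not necessarily the exact procedure used here: the helper name `keep_contig_header` and the output file names are hypothetical. It keeps all non-@SQ header lines plus the single @SQ line for the requested contig.

```shell
#!/bin/sh
# Hypothetical helper: reads SAM header text on stdin and writes it to stdout,
# dropping every @SQ line except the one whose SN tag equals the given contig.
keep_contig_header() {
    contig="$1"
    awk -F '\t' -v sn="SN:$contig" '
        $1 != "@SQ" { print; next }   # @HD, @RG, @PG, @CO lines pass through
        { for (i = 2; i <= NF; i++) if ($i == sn) { print; next } }
    '
}

# Possible usage with samtools (output file names are placeholders):
#   samtools view -H Trim_test_sort.bam | keep_contig_header chr1 > header.sam
#   samtools reheader header.sam Trim_test_sort.bam > Trim_test_chr1.bam
```

Note that `samtools reheader` replaces only the header; the alignment records themselves are left unchanged.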
Is there any way to make RealignerTargetCreator traverse only chr1:1-150 (or just chr1) to save time?