# "Input files reads and reference have incompatible contigs" error while running GATK RealignerTargetCreator

Switzerland, Posts: 11, Member
edited April 2014

Hello All,

I am running RealignerTargetCreator using GATK version GenomeAnalysisTK-1.2-4-gd9ea764 and I am getting the following error:


##### ERROR reads contigs = [scaffold1_size320545, scaffold2_size291774, scaffold3_size284740..........

I have already checked that I am using the right reference FASTA file and the correct .bam file, which I used for alignment before, so I do not understand why I am getting this error.
I would appreciate your help with this problem. Any suggestions are welcome.

Thanks,
Namrata

@sarkar Hi Namrata,

First, you should try using our newest GATK version 3.1. Version 1.2 is very old!

Secondly, assuming the contig length reported for the reference is correct (judging by the contig name), you should re-index the BAM file. Sometimes the index file gets corrupted, and creating a fresh index file fixes the issue.

Let me know if this helps.

Hi Sheila,

Thanks for your feedback. I created a new index file and tried again, but I still get the same error.

Do you have any other suggestions?

Many Thanks,
Namrata

Hi Namrata @sarkar

You can try to check the length of the contig in the reference genome and the length of the contig in the bam file to see if they match.

You can do this with samtools view -h yourfile.bam. You will need to use something like grep to pull out the line of the sequence dictionary (the @SQ header line) that corresponds to the contig. If you do not know how to do this, please ask your IT department for help.
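For anyone who would rather script this check than grep by hand, here is a minimal Python sketch of the idea. It is not part of the original reply. It assumes you have already captured the header text, for example from samtools view -H yourfile.bam, and the function name parse_sq_lengths and the sample header are made up for illustration.

```python
def parse_sq_lengths(header_text):
    """Return {contig_name: length} from the @SQ lines of SAM header text."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # skip @HD, @RG, @PG lines and any alignment records
        # After the record type, header fields are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields and "LN" in fields:
            lengths[fields["SN"]] = int(fields["LN"])
    return lengths


# Hypothetical header text; the contig names and lengths are illustrative only.
example_header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:contigA\tLN:1000\n"
    "@SQ\tSN:contigB\tLN:250\n"
)
print(parse_sq_lengths(example_header))
```

Looking up a contig name in the returned dictionary gives the length the BAM file was aligned against, which you can then compare with the length in the reference.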

Which version are you using now? 3.1?

-Sheila

Hi Sheila,

The length of the contig in the reference genome is 1758. When I ran
samtools view -h myfile.bam | grep scaffold69676_size1796
I got @SQ SN:scaffold69676_size1796 LN:3149. These lengths are consistent with the error message I got previously.

I tried a later version, GATK 2.6, but it still gave the same error.
I also tried ReorderSam, but it did not help.
Thanks,
Namrata

edited April 2014

@sarkar

I see that the length of the reference contig is different from the length of the input file contig. Unfortunately, this means you probably used the wrong version of the reference for aligning your reads. The best advice I can give you is to redo everything starting at the alignment step.
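To see how widespread such a mismatch is before redoing the alignment, something like the following Python sketch could compare every contig at once. It is only an illustration, not a GATK or samtools feature. It assumes the reference lengths come from a .fai index (as written by samtools faidx, where the first two tab-separated columns are contig name and length) and that the BAM lengths were read from the @SQ header lines. The function names are hypothetical, and the offset and line-width columns in the sample .fai line are invented; only the two lengths come from this thread.

```python
def read_fai_lengths(fai_text):
    """Return {contig_name: length} from the text of a FASTA .fai index."""
    lengths = {}
    for line in fai_text.splitlines():
        cols = line.split("\t")
        if len(cols) >= 2:
            lengths[cols[0]] = int(cols[1])  # column 1: name, column 2: length
    return lengths


def find_mismatches(bam_lengths, ref_lengths):
    """List (contig, bam_length, ref_length) for contigs that disagree.

    ref_length is None when the contig is absent from the reference.
    """
    problems = []
    for name, bam_len in bam_lengths.items():
        ref_len = ref_lengths.get(name)
        if ref_len != bam_len:
            problems.append((name, bam_len, ref_len))
    return problems


# Lengths reported in this thread: LN:3149 in the BAM header, 1758 in the reference.
bam_lengths = {"scaffold69676_size1796": 3149}
# Columns 3 to 5 (offset, bases per line, bytes per line) are made-up values.
fai_text = "scaffold69676_size1796\t1758\t0\t60\t61\n"

for contig, bam_len, ref_len in find_mismatches(bam_lengths, read_fai_lengths(fai_text)):
    print(contig, "BAM:", bam_len, "reference:", ref_len)
```

With the values from this thread, the single contig is reported as a mismatch (3149 in the BAM versus 1758 in the reference), which is the situation that points to the reads having been aligned against a different reference build.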

Version 2.6 is better than 1.2, but while you are at it, you should upgrade to version 3.1.

Good luck!

Hi Sheila,

Thanks very much.

Cheers,
Namrata