The current GATK version is 3.3-0


# Unknown reference error using FastaAlternateReferenceMaker

Posts: 8Member
edited September 2013

Hello

I tried to run FastaAlternateReferenceMaker and I get the following error:

WARNING 2013-09-18 16:28:28 IntervalList    Ignoring interval for unknown reference: Chr1:3580210-3580286


I get this warning for every interval I submitted. I already looked around on the web and did not find an answer. My chromosome names use the 'Chr' format in all the files, and my interval file is tab-delimited.

My interval file looks like:

@HD     VN:1.4  SO:unsorted
@SQ     SN:Chr1 LN:158337067    UR:file:chromosome_3.1.fasta    M5:0631b350aa263a0f714de8ba9d609eb0
@SQ     SN:Chr2 LN:137060424    UR:file:Chromosome_3.1.fasta    M5:15898469d6142f8bb74f769bfe9b155f
@SQ     SN:Chr3 LN:121430405    UR:file:Chromosome_3.1.fasta    M5:c515c4da7c2cd2d24c9487db8f733cfd
...
Chr1    3580210 3580286 +       ID=MI0011294_1;accession_number=MI0011294
Chr1    3580220 3580240 +       ID=MIMAT0011792_1;accession_number=MIMAT0011792
Chr1    3607747 3607842 -       ID=MI0014499_1;accession_number=MI0014499
Chr1    3607802 3607822 -       ID=MIMAT0017395_1;accession_number=MIMAT0017395
Chr1    10227277        10227339        -       ID=MI0009752_1;accession_number=MI0009752
Chr1    10227315        10227337        -       ID=MIMAT0009241_1;accession_number=MIMAT0009241
Chr1    19881347        19881431        -       ID=MI0005457_1;accession_number=MI0005457
Chr1    19881398        19881419        -       ID=MIMAT0003539_1;accession_number=MIMAT0003539
Chr1    19930459        19930542        -       ID=MI0005454_1;accession_number=MI0005454
Chr1    19930511        19930532        -       ID=MIMAT0004332_1;accession_number=MIMAT0004332
...


The header of my interval file is a copy of Chromosome_3.1.dict.
I do not know what is misformatted or why I get this error.

Thanks

Martin
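
For readers hitting the same warning, here is a minimal sanity check for a Picard-style interval list: a SAM-style header followed by five tab-separated columns (sequence name, start, end, strand, interval name). This is only a sketch, not the parser GATK or Picard uses, and the function name `check_interval_list` is made up for illustration. It flags body rows that are not tab-delimited or whose contig is not declared in a tab-delimited `@SQ` line.

```python
def check_interval_list(lines):
    """Return (contigs declared in @SQ lines, list of (line_number, problem))."""
    declared = set()
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                # Header fields are expected to be tab-separated TAG:VALUE pairs.
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        declared.add(field[3:])
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            problems.append((number, "expected 5 tab-separated fields, got %d" % len(fields)))
        elif fields[0] not in declared:
            problems.append((number, "contig %r is not declared in any @SQ line" % fields[0]))
    return declared, problems


# Hypothetical sample: the last row uses spaces instead of tabs.
sample = [
    "@HD\tVN:1.4\tSO:unsorted",
    "@SQ\tSN:Chr1\tLN:158337067",
    "Chr1\t3580210\t3580286\t+\tID=MI0011294_1;accession_number=MI0011294",
    "Chr1 3580220 3580240 + ID=MIMAT0011792_1;accession_number=MIMAT0011792",
]
declared, problems = check_interval_list(sample)
for number, problem in problems:
    print("line %d: %s" % (number, problem))
```

If the header itself was pasted with spaces instead of tabs, no `SN:` field is found and every body row is reported as using an undeclared contig, which is one possible way to end up with a file that looks right but is not parsed as intended.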


• Posts: 8Member

Here is, more clearly, what the interval file looks like:

@HD VN:1.4 SO:unsorted

@SQ SN:Chr1 LN:158337067 UR:file:chromosome_3.1.fasta M5:0631b350aa263a0f714de8ba9d609eb0

@SQ SN:Chr2 LN:137060424 UR:file:Chromosome_3.1.fasta M5:15898469d6142f8bb74f769bfe9b155f

@SQ SN:Chr3 LN:121430405 UR:file:Chromosome_3.1.fasta M5:c515c4da7c2cd2d24c9487db8f733cfd

...

Chr1 3580210 3580286 + ID=MI0011294_1;accession_number=MI0011294

Chr1 3580220 3580240 + ID=MIMAT0011792_1;accession_number=MIMAT0011792

Chr1 3607747 3607842 - ID=MI0014499_1;accession_number=MI0014499

Chr1 3607802 3607822 - ID=MIMAT0017395_1;accession_number=MIMAT0017395

Chr1 10227277 10227339 - ID=MI0009752_1;accession_number=MI0009752

Chr1 10227315 10227337 - ID=MIMAT0009241_1;accession_number=MIMAT0009241

Chr1 19881347 19881431 - ID=MI0005457_1;accession_number=MI0005457

Chr1 19881398 19881419 - ID=MIMAT0003539_1;accession_number=MIMAT0003539

Chr1 19930459 19930542 - ID=MI0005454_1;accession_number=MI0005454

Chr1 19930511 19930532 - ID=MIMAT0004332_1;accession_number=MIMAT0004332

...

Martin

Hi Martin,

Can you please post the command line you're using?

And have you tried passing just a single interval from the command line (e.g. -L Chr1:3580210-3580286) to see if that works properly?

Geraldine Van der Auwera, PhD

• Posts: 8Member

Hello,

I just tried passing the interval directly on the command line and it worked perfectly.

The command line I used is:

java -Xmx2g -jar ~/Documents/Programms/GenomeAnalysisTK-2.1-8-g5efb575/GenomeAnalysisTK.jar -R Chromosome_3.1.fasta -T FastaAlternateReferenceMaker -o chr1_test_variant.fasta -L chr1_variant.intervals --variant indels_v1.Chr1.vcf

Thanks, Martin
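
Since a single interval passed as `-L Chr1:3580210-3580286` is accepted, one possible workaround is to write every interval in that same `contig:start-stop` form, one per line, into a `.list` or `.intervals` file. The sketch below does that conversion from the tab-separated rows shown earlier. The function name `to_gatk_intervals` is hypothetical, and this route was not tested in the thread.

```python
def to_gatk_intervals(lines):
    """Convert tab-separated interval rows into 'contig:start-stop' strings."""
    intervals = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and SAM-style header lines (@HD, @SQ, ...).
        if not line or line.startswith("@"):
            continue
        contig, start, end = line.split("\t")[:3]
        # int() also validates that the coordinates are numeric.
        intervals.append("%s:%d-%d" % (contig, int(start), int(end)))
    return intervals


rows = [
    "@HD\tVN:1.4\tSO:unsorted",
    "Chr1\t3580210\t3580286\t+\tID=MI0011294_1;accession_number=MI0011294",
    "Chr1\t3607747\t3607842\t-\tID=MI0014499_1;accession_number=MI0014499",
]
converted = to_gatk_intervals(rows)
print("\n".join(converted))
```

The strand and ID columns are dropped here, because the `contig:start-stop` form has no place for them.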

I see. Then try deleting the header from your intervals file in case that's what's messing up the parsing.

Geraldine Van der Auwera, PhD

• Posts: 8Member

Without the header I get this error:

Badly formed genome loc: Contig Chr1 3580210 3580286 + ID=MI0011294' does not match any contig in the GATK sequence dictionary derived from the reference

Eh, right, without the header it doesn't know how to parse the rest. Not sure why it's not recognizing the Picard format in the first place when it has the header. On the off chance -- can you try running with your original intervals file (with header), but rename the extension to .list?

Geraldine Van der Auwera, PhD

• Posts: 8Member

I think I already tried that. I just tried it again and I get the same error as before with the header.

I built the interval file myself from a GFF file; could that cause the problem?

• Posts: 8Member

Hello,

The last format you suggested in the inter

• Posts: 8Member

Hello again,

Just to say that I found the problem with the first tab-separated interval file: it is parsed correctly with the extension .bed.

Thanks again for all the help!

Martin
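
A note for anyone reusing the `.bed` fix: BED coordinates are 0-based with an exclusive end, while GFF coordinates are 1-based and inclusive. If the rows above carry GFF-style coordinates (an assumption, since the file was built from a GFF), a file that is only renamed to `.bed` may be read with each start shifted by one base. The sketch below converts the five-column rows to six-column BED (chrom, start, end, name, score, strand). The function name `to_bed` is hypothetical and the score is a placeholder `0`.

```python
def to_bed(lines):
    """Convert 1-based inclusive (contig, start, end, strand, name) rows to BED6."""
    bed_rows = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and SAM-style header lines, which BED does not use.
        if not line or line.startswith("@"):
            continue
        contig, start, end, strand, name = line.split("\t")[:5]
        # 1-based inclusive start -> 0-based start; the end value stays the same
        # because BED ends are exclusive.
        bed_start = int(start) - 1
        bed_rows.append("\t".join([contig, str(bed_start), str(int(end)), name, "0", strand]))
    return bed_rows


bed = to_bed([
    "@HD\tVN:1.4\tSO:unsorted",
    "Chr1\t3580210\t3580286\t+\tID=MI0011294_1;accession_number=MI0011294",
])
print("\n".join(bed))
```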