Unknown reference error using FastaAlternateReferenceMaker

MartinB, Member
edited September 2013 in Ask the GATK team

Hello

I tried to run FastaAlternateReferenceMaker and I get the following error for every interval I submitted:

WARNING 2013-09-18 16:28:28 IntervalList    Ignoring interval for unknown reference: Chr1:3580210-3580286

I have already searched the web and did not find an answer. My chromosome names use the 'Chr' format in all of my files, and my interval file is tab-delimited.

My interval file looks like this:

@HD     VN:1.4  SO:unsorted
@SQ     SN:Chr1 LN:158337067    UR:file:chromosome_3.1.fasta    M5:0631b350aa263a0f714de8ba9d609eb0
@SQ     SN:Chr2 LN:137060424    UR:file:Chromosome_3.1.fasta    M5:15898469d6142f8bb74f769bfe9b155f
@SQ     SN:Chr3 LN:121430405    UR:file:Chromosome_3.1.fasta    M5:c515c4da7c2cd2d24c9487db8f733cfd
...
Chr1    3580210 3580286 +       ID=MI0011294_1;accession_number=MI0011294
Chr1    3580220 3580240 +       ID=MIMAT0011792_1;accession_number=MIMAT0011792
Chr1    3607747 3607842 -       ID=MI0014499_1;accession_number=MI0014499
Chr1    3607802 3607822 -       ID=MIMAT0017395_1;accession_number=MIMAT0017395
Chr1    10227277        10227339        -       ID=MI0009752_1;accession_number=MI0009752
Chr1    10227315        10227337        -       ID=MIMAT0009241_1;accession_number=MIMAT0009241
Chr1    19881347        19881431        -       ID=MI0005457_1;accession_number=MI0005457
Chr1    19881398        19881419        -       ID=MIMAT0003539_1;accession_number=MIMAT0003539
Chr1    19930459        19930542        -       ID=MI0005454_1;accession_number=MI0005454
Chr1    19930511        19930532        -       ID=MIMAT0004332_1;accession_number=MIMAT0004332
...

The header of my interval file is a copy of Chromosome_3.1.dict.
I do not know what is misformatted or why I get this error.

Thanks

Martin
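A quick way to rule out formatting problems in a file like the one above is to check it mechanically. The sketch below is not part of GATK or Picard; `check_interval_list` is a hypothetical helper. It assumes the Picard interval list layout: a SAM-style header whose @SQ lines carry a tab-separated SN: field, followed by data lines with five tab-separated fields (contig, start, end, strand, name). It reports lines that use spaces instead of tabs and contigs that are missing from the header.

```python
def check_interval_list(lines):
    """Return a list of human-readable problems found in a Picard-style interval list.

    Assumed layout: '@' header lines, then data lines with five tab-separated
    fields (contig, start, end, strand, name).
    """
    contigs = set()
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("@"):
            if line.startswith("@SQ"):
                # The contig name must be in its own tab-separated SN: field.
                fields = line.split("\t")
                names = [f[3:] for f in fields[1:] if f.startswith("SN:")]
                if names:
                    contigs.add(names[0])
                else:
                    problems.append(
                        f"line {number}: @SQ line has no tab-separated SN: field"
                    )
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            problems.append(
                f"line {number}: expected 5 tab-separated fields, found {len(fields)}"
            )
            continue
        contig, start, end = fields[0], fields[1], fields[2]
        if contig not in contigs:
            problems.append(f"line {number}: contig '{contig}' is not in the header")
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {number}: start and end must be integers")
    return problems
```

It could be run as `check_interval_list(open("chr1_variant.intervals"))`; an empty list means none of these particular problems were found, not that the tool will necessarily accept the file.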

Answers

  • Here is, more clearly, what the interval file looks like:

    @HD VN:1.4 SO:unsorted
    @SQ SN:Chr1 LN:158337067 UR:file:chromosome_3.1.fasta M5:0631b350aa263a0f714de8ba9d609eb0
    @SQ SN:Chr2 LN:137060424 UR:file:Chromosome_3.1.fasta M5:15898469d6142f8bb74f769bfe9b155f
    @SQ SN:Chr3 LN:121430405 UR:file:Chromosome_3.1.fasta M5:c515c4da7c2cd2d24c9487db8f733cfd
    ...
    Chr1 3580210 3580286 + ID=MI0011294_1;accession_number=MI0011294
    Chr1 3580220 3580240 + ID=MIMAT0011792_1;accession_number=MIMAT0011792
    Chr1 3607747 3607842 - ID=MI0014499_1;accession_number=MI0014499
    Chr1 3607802 3607822 - ID=MIMAT0017395_1;accession_number=MIMAT0017395
    Chr1 10227277 10227339 - ID=MI0009752_1;accession_number=MI0009752
    Chr1 10227315 10227337 - ID=MIMAT0009241_1;accession_number=MIMAT0009241
    Chr1 19881347 19881431 - ID=MI0005457_1;accession_number=MI0005457
    Chr1 19881398 19881419 - ID=MIMAT0003539_1;accession_number=MIMAT0003539
    Chr1 19930459 19930542 - ID=MI0005454_1;accession_number=MI0005454
    Chr1 19930511 19930532 - ID=MIMAT0004332_1;accession_number=MIMAT0004332
    ...

    Martin

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi Martin,

    Can you please post the command line you're using?

    And have you tried passing just a single interval from command line (e.g. -L Chr1:3580210-3580286) to see if that works properly?
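Since a single `Chr1:3580210-3580286` style interval is easy to test on the command line, one possible workaround is to rewrite the whole tab-delimited file in that same notation, one interval per line. The GATK documentation describes plain-text interval files in this `chr:start-stop` form; whether a given GATK version accepts the result should be verified. The sketch below is only an illustration: `to_gatk_intervals` is a hypothetical helper, and it assumes the input coordinates are already 1-based and inclusive. Strand and name columns are dropped.

```python
def to_gatk_intervals(lines):
    """Convert Picard-style interval lines to 'contig:start-end' strings.

    Header lines (starting with '@') and blank lines are skipped.
    Assumes the first three whitespace-separated fields are contig, start, end.
    """
    intervals = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("@"):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"cannot parse interval line: {line!r}")
        contig, start, end = fields[0], int(fields[1]), int(fields[2])
        intervals.append(f"{contig}:{start}-{end}")
    return intervals
```

Writing the returned strings to a file, one per line, gives a list that can be passed with -L in place of the original file.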

  • Hello,

    I just tried passing the interval directly on the command line and it worked perfectly.

    The command line I used is:

    java -Xmx2g -jar ~/Documents/Programms/GenomeAnalysisTK-2.1-8-g5efb575/GenomeAnalysisTK.jar -R Chromosome_3.1.fasta -T FastaAlternateReferenceMaker -o chr1_test_variant.fasta -L chr1_variant.intervals --variant indels_v1.Chr1.vcf

    Thanks, Martin

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    I see. Then try deleting the header from your intervals file in case that's what's messing up the parsing.

  • Without the header I get this error:

    Badly formed genome loc: Contig Chr1 3580210 3580286 + ID=MI0011294' does not match any contig in the GATK sequence dictionary derived from the reference

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Eh, right, without the header it doesn't know how to parse the rest. Not sure why it's not recognizing the Picard format in the first place when it has the header. On the off chance -- can you try running with your original intervals file (with header), but rename the extension to .list?

    I think I had already tried that. I just tried it again and I get the same error as before, with the header.

    I built the interval file myself from a GFF file. Could that be causing the problem?

  • Hello,

    The last format you suggested in the inter

  • Hello again,

    Just to say that I found a fix for the problem with my first tab-separated interval file: it is parsed correctly when it has the extension .bed.

    Thanks again for all the help!

    Martin
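One caveat about the .bed workaround: the BED format uses 0-based start coordinates with an exclusive end, whereas GFF and Picard interval lists use 1-based inclusive coordinates. If the tool applies the BED convention when it sees the .bed extension, as the GATK documentation describes, then a file whose coordinates were copied unchanged from GFF would have each interval start one base late. The sketch below shows one way to write a true BED file from the tab-delimited lines. `to_bed_lines` is a hypothetical helper, the column order (chrom, start, end, name, score, strand) follows the standard BED layout, and the score of 0 is a placeholder.

```python
def to_bed_lines(lines):
    """Convert 1-based inclusive interval lines to 0-based half-open BED lines.

    Input lines are tab-separated: contig, start, end, strand, name.
    Header lines (starting with '@') and blank lines are skipped.
    """
    bed = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("@"):
            continue
        contig, start, end, strand, name = line.split("\t")[:5]
        # BED start is 0-based, so subtract 1; the exclusive end equals
        # the 1-based inclusive end, so it stays unchanged.
        bed_start = int(start) - 1
        bed.append("\t".join([contig, str(bed_start), str(int(end)), name, "0", strand]))
    return bed
```

Comparing the first few output sequences against the expected region in the reference is a simple way to confirm which convention the tool actually applied.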
