About the contig ordering in the reference (b37/hg19)

Tiancheng Posts: 1 Member

I read this in the GATK documentation:

Human sequence

If you are using human data, your reads must be aligned to one of the official b3x (e.g. b36, b37) or hg1x (e.g. hg18, hg19) references. The contig ordering in the reference you used must exactly match one of the official references' canonical orderings. These are defined by historical karyotyping of largest to smallest chromosomes, followed by X, Y, and MT for the b3x references; the order is thus 1, 2, 3, ..., 10, 11, 12, ... 20, 21, 22, X, Y, MT. The hg1x references differ in that the chromosome names are prefixed with "chr" and chrM appears first instead of last. The GATK will detect misordered contigs (for example, lexicographically sorted) and throw an error.

As I understood it, this means the reference order must be: chr1, chr2, chr3, ... chr22, chrX, chrY, chrM. But after downloading the whole bundle from GATK's FTP site, I checked the reference, and its contigs are in this order:

>chrM
>chr1
>chr2
>chr3
>chr4
>chr5
>chr6
>chr7
>chr8
>chr9
>chr10
>chr11
>chr12
>chr13
>chr14
>chr15
>chr16
>chr17
>chr18
>chr19
>chr20
>chr21
>chr22
>chrX
>chrY
...

So, is this contradictory?
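
One way to check this by hand is to read the contig names straight from the FASTA headers and compare them with the orderings described in the quoted passage. The sketch below is only an illustration: the function names (`fasta_contig_names`, `hg19_canonical_order`, `b37_canonical_order`, `matches_canonical`) are made up for this example and are not part of GATK, and it only looks at the primary chromosomes. Note that the quoted text itself says that in the hg1x references chrM appears first, which is the order found in the bundle file.

```python
def hg19_canonical_order():
    """Primary hg1x contigs as described in the quoted text: chrM first."""
    return ["chrM"] + [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def b37_canonical_order():
    """Primary b3x contigs as described in the quoted text: MT last."""
    return [str(i) for i in range(1, 23)] + ["X", "Y", "MT"]


def fasta_contig_names(lines):
    """Return contig names from FASTA header lines, in file order.

    The name is taken to be the first whitespace-delimited token after '>'.
    """
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]


def matches_canonical(contigs, canonical):
    """True if the primary contigs appear exactly in the canonical order.

    Contigs that are not in `canonical` (e.g. unplaced scaffolds) are ignored.
    """
    wanted = set(canonical)
    primary = [name for name in contigs if name in wanted]
    return primary == canonical


# Headers in the order reported in the post above (primary contigs only).
posted_headers = [">chrM"] + [f">chr{i}" for i in range(1, 23)] + [">chrX", ">chrY"]
posted_contigs = fasta_contig_names(posted_headers)
print(matches_canonical(posted_contigs, hg19_canonical_order()))
```

For a real file, the same function can be fed the open file object, e.g. `fasta_contig_names(open(path))`, where `path` is wherever the reference FASTA was saved.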

Answers

  • Hasani Germany Posts: 21 Member

    And how can I solve this issue?

  • Geraldine_VdAuwera Posts: 7,405 Administrator, GATK Developer

    @Hasani, are you having an issue that is different from the one discussed in the ReorderSam thread? If it's the same problem we will fix it there. If it's a different issue, please describe the problem in detail.

    Geraldine Van der Auwera, PhD

  • Hasani Germany Posts: 21 Member

    Yes! I am simply using the reference file I downloaded from ../bundle/2.8/hg19/, along with its dict and fai files.
    The exact error:

    ERROR MESSAGE: Input files reads and reference have incompatible contigs: Relative ordering of overlapping contigs differs, which is unsafe.
    ERROR reads contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
    ERROR reference contigs = [chrM, chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, ....]
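
The condition named in this error message, that the relative ordering of the contigs shared by the reads and the reference differs, can be sketched in a few lines. This is not GATK's actual implementation, only a rough illustration of the idea; the function name `shared_order_conflict` is made up for this example, and the contig lists are the primary contigs from the error message above.

```python
def shared_order_conflict(reads_contigs, reference_contigs):
    """True if the contigs present in both lists appear in a different
    relative order in each list."""
    ref_set = set(reference_contigs)
    reads_set = set(reads_contigs)
    # Keep only the shared contigs, preserving each list's own order.
    reads_shared = [name for name in reads_contigs if name in ref_set]
    ref_shared = [name for name in reference_contigs if name in reads_set]
    return reads_shared != ref_shared


autosomes = [f"chr{i}" for i in range(1, 23)]
reads = autosomes + ["chrX", "chrY", "chrM"]      # BAM header order: chrM last
reference = ["chrM"] + autosomes + ["chrX", "chrY"]  # bundle hg19 order: chrM first

print(shared_order_conflict(reads, reference))  # True: chrM is placed differently
```

Here the two lists hold the same primary contigs, but chrM is last in the reads and first in the reference, so the orders conflict. A reads file that merely lacks some reference contigs would not trigger the conflict in this sketch.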

  • Hasani Germany Posts: 21 Member

    Forgot to say thank you :)

  • Geraldine_VdAuwera Posts: 7,405 Administrator, GATK Developer

    @Hasani,

    It sounds like you just need to reorder your BAM file so that its contig order matches the reference. To do this you need to run ReorderSam, so let's keep talking about this in that thread, since it is more appropriate and relevant.

    Geraldine Van der Auwera, PhD
