
RealignerTargetCreator error - difference between dbSNP and reference

SebBat (Buffalo, Member)

I'm realigning some BAM files using RealignerTargetCreator.
I downloaded the hg38 dbSNP file and the hg38 reference genome. However, the dbSNP file contains only the canonical chromosomes [1,2...X,Y,MT], while the reference (and my BAM file) also contain the random chromosomes.

Here's the error I get:
MESSAGE: Input files and reference have incompatible contigs: No overlapping contigs found.
142_hg38.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT]
reference contigs = [chr1, chr1_GL383518v1_alt, chr1_GL383519v1_alt, chr1_GL383520v2_alt ... [etc etc etc.....] ... chrX_KI270881v1_alt, chrX_KI270913v1_alt, chrY, chrY_KI270740v1_random, chrM]

I would like to keep the random chromosomes in the reference/BAM file, so how can I run this without it throwing an error?
Or is this not possible, and I need to keep ONLY the chromosomes that are in the dbSNP file?

thanks
Seb
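
The "No overlapping contigs found" message means that no contig name in the VCF matches any name in the reference: the VCF uses `1 ... MT` while the reference uses `chr1 ... chrM`, so the extra random contigs are not the cause. Below is a minimal Python sketch for checking this before running GATK. It assumes a samtools-style `.fai` index for the reference and a plain-text VCF; the helper names (`fai_contigs`, `vcf_contigs`, `to_ucsc`, `compare_contigs`) are hypothetical and not part of GATK.

```python
import re

# Matches a VCF header line such as: ##contig=<ID=1,length=248956422>
CONTIG_ID = re.compile(r"^##contig=<.*?ID=([^,>]+)")


def fai_contigs(fai_lines):
    """Contig names from the lines of a .fai index (name is the first column)."""
    return [line.split("\t")[0].strip() for line in fai_lines if line.strip()]


def vcf_contigs(vcf_lines):
    """Contig names from VCF header and data lines, in order of first appearance."""
    names = {}
    for line in vcf_lines:
        match = CONTIG_ID.match(line)
        if match:
            names.setdefault(match.group(1), None)
        elif line.strip() and not line.startswith("#"):
            names.setdefault(line.split("\t", 1)[0], None)
    return list(names)


def to_ucsc(name):
    """Assumed mapping from '1'-style names to 'chr1'-style names."""
    return "chrM" if name == "MT" else "chr" + name


def compare_contigs(vcf_names, ref_names):
    """Return (shared, rescued): names that already match the reference,
    and names that would match only after adding the 'chr' prefix."""
    ref = set(ref_names)
    shared = [n for n in vcf_names if n in ref]
    rescued = [n for n in vcf_names if n not in ref and to_ucsc(n) in ref]
    return shared, rescued
```

An empty `shared` list together with a non-empty `rescued` list points to a naming-convention mismatch rather than missing chromosomes. It does not show that the two files are on the same assembly, so the coordinates still need to be checked separately.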

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Are you sure that the dbsnp file is derived from hg38? Where did you get it?

  • SebBat (Buffalo, Member)

    @Geraldine_VdAuwera said:
    Are you sure that the dbsnp file is derived from hg38? Where did you get it?

    I got it from here:
    http://hgdownload.cse.ucsc.edu/goldenpath/hg38/database/

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    I don't see the 142_hg38.vcf filename in the directory you linked.

  • SebBat (Buffalo, Member)

    They have the following files:

      snp142.sql                                                   22-Nov-2015 18:21  3.2K  
      snp142.txt.gz                                                22-Nov-2015 18:27  2.8G  
      snp142CodingDbSnp.sql                                        24-Feb-2015 19:38  1.7K  
      snp142CodingDbSnp.txt.gz                                     24-Feb-2015 19:38   69M  
      snp142Common.sql                                             22-Nov-2015 18:38  3.2K  
      snp142Common.txt.gz                                          22-Nov-2015 18:39  594M  
      snp142ExceptionDesc.sql                                      24-Feb-2015 19:41  1.4K  
      snp142ExceptionDesc.txt.gz                                   24-Feb-2015 19:41  1.0K  
      snp142Flagged.sql                                            22-Nov-2015 18:38  3.2K  
      snp142Flagged.txt.gz                                         22-Nov-2015 18:38  2.5M  
      snp142Mult.sql                                               22-Nov-2015 18:38  3.2K  
      snp142Mult.txt.gz                                            22-Nov-2015 18:38  1.3K  
      snp142OrthoPt4Pa2Rm3.sql                                     24-Feb-2015 19:41  2.2K  
      snp142OrthoPt4Pa2Rm3.txt.gz                                  24-Feb-2015 19:45  3.0G  
      snp142Seq.sql                                                24-Feb-2015 19:57  1.3K  
      snp142Seq.txt.gz                                             24-Feb-2015 19:57  673M  
    

    The one I used is snp142.txt.gz. Do you think it's better to keep only the "canonical" chromosomes and filter out the rest?

    GitHub issue 459 (by Sheila). State: closed. Closed by vdauwera.

  • SebBat (Buffalo, Member)

    thanks Geraldine, I'll definitely use that bundle then!!

  • SebBat (Buffalo, Member)
    edited January 2016

    Hmm, it seems that the page is not available. Is it up and running? I keep getting:
    Description: internal error - server connection terminated
    --- edit ----
    Using the link on the Download page, it works :)
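
On the earlier question about keeping only the canonical chromosomes: the thread settles on using the bundle instead, which avoids the problem. For cases where a VCF is known to be on the same assembly as the reference and differs only in naming, one possible workaround is to rename its contigs and drop records on contigs the reference lacks. The sketch below assumes the only differences are the `chr` prefix and `MT` versus `chrM`. The function names are hypothetical, and it does nothing to verify that the coordinates match hg38, which is the concern raised above.

```python
import re

# Captures the ID value in a header line like ##contig=<ID=1,length=...>
HEADER_ID = re.compile(r"ID=([^,>]+)")


def to_ucsc(name):
    """Assumed mapping from '1'-style names to 'chr1'-style names."""
    return "chrM" if name == "MT" else "chr" + name


def rename_vcf_lines(vcf_lines, ref_names):
    """Yield VCF lines with contigs renamed to the reference's convention.

    Header contig lines and data records whose renamed contig is not in
    ref_names are dropped; all other header lines pass through unchanged.
    """
    ref = set(ref_names)
    for line in vcf_lines:
        if line.startswith("##contig=<"):
            match = HEADER_ID.search(line)
            if match and to_ucsc(match.group(1)) in ref:
                new_id = "ID=" + to_ucsc(match.group(1))
                yield HEADER_ID.sub(new_id, line, count=1)
        elif line.startswith("#"):
            yield line
        else:
            chrom, sep, rest = line.partition("\t")
            if sep and to_ucsc(chrom) in ref:
                yield to_ucsc(chrom) + sep + rest
```

The reference keeps all of its random and alt contigs here. Only the VCF is rewritten, so the BAM and reference stay untouched.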
