RealignerTargetCreator error - difference between dbSNP and reference

SebBat (Buffalo, Member)

I'm realigning some BAM files using RealignerTargetCreator.
I downloaded the hg38 dbSNP file and the hg38 reference genome. However, the dbSNP file contains only the canonical chromosomes [1, 2 ... X, Y, MT], while the reference (and my BAM file) also contain the random chromosomes.

Here's the error I get:
MESSAGE: Input files and reference have incompatible contigs: No overlapping contigs found.
142_hg38.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT]
reference contigs = [chr1, chr1_GL383518v1_alt, chr1_GL383519v1_alt, chr1_GL383520v2_alt ... [etc etc etc.....] ... chrX_KI270881v1_alt, chrX_KI270913v1_alt, chrY, chrY_KI270740v1_random, chrM]

I would like to keep the random chromosomes in the reference/BAM file, so how can I run this without it throwing an error?
Or is this not possible, and I need to keep ONLY the chromosomes in the dbSNP file?

thanks
Seb
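
For readers hitting the same message: the two lists in the error share no names because the VCF uses bare names (1, X, MT) while the reference uses UCSC-style names (chr1, chrX, chrM). Below is a minimal Python sketch of that check and of one possible name mapping. The function names are hypothetical, the reference list is only a handful of the names shown in the error, and the MT to chrM mapping is an assumption about the two naming conventions. Renaming alone would not help if the coordinates came from a different assembly, so treat this as a diagnostic rather than a fix.

```python
def overlap(vcf_contigs, ref_contigs):
    """Return the contig names present in both lists, sorted."""
    return sorted(set(vcf_contigs) & set(ref_contigs))


def to_ucsc(name):
    """Hypothetical mapping from a bare contig name to a UCSC-style name.

    Assumes 'MT' corresponds to 'chrM' and every other name just needs
    a 'chr' prefix. Names that already start with 'chr' are left alone.
    """
    if name.startswith("chr"):
        return name
    if name == "MT":
        return "chrM"
    return "chr" + name


def rename_vcf_line(line):
    """Rename the CHROM column of one VCF data line.

    Header lines (starting with '#') are passed through unchanged;
    any '##contig' header entries would need the same renaming and
    are not handled in this sketch.
    """
    if line.startswith("#"):
        return line
    chrom, sep, rest = line.partition("\t")
    return to_ucsc(chrom) + sep + rest


# Contig names as reported for the VCF in the error message.
vcf_contigs = [str(i) for i in range(1, 23)] + ["X", "Y", "MT"]

# A few of the reference contig names from the error message (not the full list).
ref_contigs = ["chr1", "chr1_GL383518v1_alt", "chrX", "chrY", "chrM"]

print(overlap(vcf_contigs, ref_contigs))
print(overlap([to_ucsc(c) for c in vcf_contigs], ref_contigs))
```

The first call shows the empty intersection that triggers "No overlapping contigs found"; the second shows that the canonical chromosomes line up once the names follow the same convention. The extra alt and random contigs in the reference simply have no counterpart in the VCF.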

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Are you sure that the dbsnp file is derived from hg38? Where did you get it?

  • SebBat (Buffalo, Member)

    @Geraldine_VdAuwera said:
    Are you sure that the dbsnp file is derived from hg38? Where did you get it?

    I got it from here:
    http://hgdownload.cse.ucsc.edu/goldenpath/hg38/database/

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    I don't see the 142_hg38.vcf filename in the directory you linked.

  • SebBat (Buffalo, Member)

    They have the following files:

      snp142.sql                                                   22-Nov-2015 18:21  3.2K  
      snp142.txt.gz                                                22-Nov-2015 18:27  2.8G  
      snp142CodingDbSnp.sql                                        24-Feb-2015 19:38  1.7K  
      snp142CodingDbSnp.txt.gz                                     24-Feb-2015 19:38   69M  
      snp142Common.sql                                             22-Nov-2015 18:38  3.2K  
      snp142Common.txt.gz                                          22-Nov-2015 18:39  594M  
      snp142ExceptionDesc.sql                                      24-Feb-2015 19:41  1.4K  
      snp142ExceptionDesc.txt.gz                                   24-Feb-2015 19:41  1.0K  
      snp142Flagged.sql                                            22-Nov-2015 18:38  3.2K  
      snp142Flagged.txt.gz                                         22-Nov-2015 18:38  2.5M  
      snp142Mult.sql                                               22-Nov-2015 18:38  3.2K  
      snp142Mult.txt.gz                                            22-Nov-2015 18:38  1.3K  
      snp142OrthoPt4Pa2Rm3.sql                                     24-Feb-2015 19:41  2.2K  
      snp142OrthoPt4Pa2Rm3.txt.gz                                  24-Feb-2015 19:45  3.0G  
      snp142Seq.sql                                                24-Feb-2015 19:57  1.3K  
      snp142Seq.txt.gz                                             24-Feb-2015 19:57  673M  
    

    The one I used is snp142.txt.gz. Do you think it's better to use only the "canonical" chromosomes and filter out the rest?

  • SebBat (Buffalo, Member)

    Thanks Geraldine, I'll definitely use that bundle then!

  • SebBat (Buffalo, Member)
    edited January 2016

    Hmm, it seems that the page is not available. Is it up and running? I keep getting:
    Description: internal error - server connection terminated
    --- edit ----
    Using the link on the Download page, it works :)
