VCF liftover question

Dear GATK team,

My VCF was generated using GATK v3 and the hg19 reference. I'd like to compare it to the latest 1000G data, e.g. "1000G_phase3_v4_20130502.sites.vcf" from the b37 folder in the GATK ref bundle.
(Is this the right phase 3 data to use or do I need to download the original from the 1000G ftp site?)

According to this: https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_variantutils_LiftoverVariants.php
I'd like to use the LiftoverVariants tool to lift my VCF over to the b37 reference. Is this the right thing to do? If so, can you please tell me where I can find the required chain file "liftover_hg19_to_b37.txt"?

If not, could you please recommend the right tool to lift over a VCF? It looks like there are also Picard's LiftoverVcf and GATK's liftOverVCF.pl.

Many thanks! Look forward to hearing from you.



  • Sheila, Broad Institute (Member, Broadie)

    Hi Jin,

    You can use either GATK's or Picard's tool for lifting over a VCF. The chain files are available in the bundle. https://www.broadinstitute.org/gatk/guide/article.php?id=1213


  • JinSzat (Member)

    Hi Sheila,

    Thank you. I made some progress and now have additional questions. Here is what I did:

    I did not see chain files within the bundle folders but I found them here ftp://ftp.broadinstitute.org/Liftover_Chain_Files/
    I ran the LiftoverVariants tool using hg19tob37.chain and my job completed with no error.

    Program Args: -T LiftoverVariants -R gatk_bundle/2.8/hg19//ucsc.hg19.fasta -V merged.dedup.realn.recal.rsid.PASS.vcf.gz -chain Liftover_Chain_Files/hg19tob37.chain -dict gatk_bundle/2.8/b37/human_g1k_v37.dict.gz -o merged.dedup.realn.recal.rsid.PASS.b37.vcf

    I compared the hg19 and lifted-over b37 versions of the VCF and found that (1) the header of the lifted-over file still has contig names in hg19 style (is this correct?); (2) the records were lifted over fine (e.g. the contig name changed from chr1 to 1).

    chr1 98000073 rs190136297
    1 98000073 rs190136297
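
The header/record mismatch described in point (1) can be checked without GATK. The following is a minimal sketch, assuming a plain-text VCF whose header declares contigs with `##contig=<ID=...>` lines; the function name `header_and_record_contigs` and the example alleles are made up for illustration and are not part of any GATK or Picard API.

```python
import re

# Matches the ID field of a VCF "##contig=<ID=...,length=...>" header line.
CONTIG_ID = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def header_and_record_contigs(vcf_lines):
    """Return (contigs declared in the header, contigs used by records)."""
    declared, used = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = CONTIG_ID.match(line)
            if match and match.group(1) not in declared:
                declared.append(match.group(1))
        elif line and not line.startswith("#"):
            # CHROM is the first tab-separated column of a record.
            chrom = line.split("\t", 1)[0]
            if chrom not in used:
                used.append(chrom)
    return declared, used


# Tiny in-memory example: hg19-style header, b37-style record.
# The position and rsID come from the post; REF/ALT are placeholders.
example = [
    "##fileformat=VCFv4.1\n",
    "##contig=<ID=chr1,length=249250621>\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t98000073\trs190136297\tA\tG\t.\tPASS\t.\n",
]
declared, used = header_and_record_contigs(example)
print("declared in header:", declared)
print("used in records:", used)
```

For a real file, an open text handle can be passed instead of the list, for example `gzip.open(path, "rt")` for a `.vcf.gz`. If the two lists use different naming styles, the header was not rewritten during liftover.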

    When I proceeded to run FilterLiftedVariants, it failed with the error messages as shown below. Perhaps the problem is that the header in the lifted-over vcf still has hg19 contig names? How do I fix this error and complete liftover? Thanks so much. Look forward to hearing from you!

    INFO 11:36:33,060 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 11:36:33,065 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-46-gbc02625, Compiled 2015/07/09 17:38:12
    INFO 11:36:33,065 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 11:36:33,065 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 11:36:33,068 HelpFormatter - Program Args: -T FilterLiftedVariants -R gatk_bundle/2.8/hg19//ucsc.hg19.fasta -V merged.dedup.realn.recal.rsid.PASS.b37.vcf -o merged.dedup.realn.recal.rsid.PASS.b37.liftoverfiltered.vcf
    INFO 11:36:33,089 HelpFormatter - Executing as [email protected] on Linux 2.6.18-238.12.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_11-b12.
    INFO 11:36:33,089 HelpFormatter - Date/Time: 2015/09/08 11:36:33
    INFO 11:36:33,089 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 11:36:33,089 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 11:36:33,582 GenomeAnalysisEngine - Strictness is SILENT
    INFO 11:36:33,745 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 11:36:35,780 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR A USER ERROR has occurred (version 3.4-46-gbc02625):
    ERROR This means that one or more arguments or inputs in your command are incorrect.
    ERROR The error message below tells you what is the problem.
    ERROR If the problem is an invalid argument, please check the online documentation guide
    ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ERROR MESSAGE: Input files merged.dedup.realn.recal.rsid.PASS.b37.vcf and reference have incompatible contigs: No overlapping contigs found.
    ERROR merged.dedup.realn.recal.rsid.PASS.b37.vcf contigs = [1]
    ERROR reference contigs = [chrM, chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chr1_gl000191_random, chr1_gl000192_random, chr4_ctg9_hap1, chr4_gl000193_random, chr4_gl000194_random, chr6_apd_hap1, chr6_cox_hap2, chr6_dbb_hap3, chr6_mann_hap4, chr6_mcf_hap5, chr6_qbl_hap6, chr6_ssto_hap7, chr7_gl000195_random, chr8_gl000196_random, chr8_gl000197_random, chr9_gl000198_random, chr9_gl000199_random, chr9_gl000200_random, chr9_gl000201_random, chr11_gl000202_random, chr17_ctg5_hap1, chr17_gl000203_random, chr17_gl000204_random, chr17_gl000205_random, chr17_gl000206_random, chr18_gl000207_random, chr19_gl000208_random, chr19_gl000209_random, chr21_gl000210_random, chrUn_gl000211, chrUn_gl000212, chrUn_gl000213, chrUn_gl000214, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chrUn_gl000218, chrUn_gl000219, chrUn_gl000220, chrUn_gl000221, chrUn_gl000222, chrUn_gl000223, chrUn_gl000224, chrUn_gl000225, chrUn_gl000226, chrUn_gl000227, chrUn_gl000228, chrUn_gl000229, chrUn_gl000230, chrUn_gl000231, chrUn_gl000232, chrUn_gl000233, chrUn_gl000234, chrUn_gl000235, chrUn_gl000236, chrUn_gl000237, chrUn_gl000238, chrUn_gl000239, chrUn_gl000240, chrUn_gl000241, chrUn_gl000242, chrUn_gl000243, chrUn_gl000244, chrUn_gl000245, chrUn_gl000246, chrUn_gl000247, chrUn_gl000248, chrUn_gl000249]

    ERROR ------------------------------------------------------------------------------------------
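
The error message reports that the lifted-over VCF uses the contig name `1` while the reference passed with `-R` (ucsc.hg19.fasta) uses `chr`-prefixed names. The sketch below reproduces that kind of name comparison so a file can be checked against a candidate reference before running a tool. It is an illustration only, not GATK's actual validation code; the helper names `fai_contigs` and `overlapping_contigs` are hypothetical, and it assumes contig names are read from the first tab-separated column of a FASTA `.fai` index.

```python
def fai_contigs(fai_lines):
    """Contig names are the first tab-separated column of a .fai index."""
    return [line.split("\t", 1)[0].strip() for line in fai_lines if line.strip()]


def overlapping_contigs(vcf_contigs, reference_contigs):
    """Return the VCF contig names that also exist in the reference."""
    reference = set(reference_contigs)
    return [name for name in vcf_contigs if name in reference]


# Abbreviated index rows (name and length only; remaining columns omitted).
hg19_fai = ["chrM\t16571\n", "chr1\t249250621\n"]
b37_fai = ["1\t249250621\n", "MT\t16569\n"]

vcf_contigs = ["1"]  # as reported in the error message above

against_hg19 = overlapping_contigs(vcf_contigs, fai_contigs(hg19_fai))
against_b37 = overlapping_contigs(vcf_contigs, fai_contigs(b37_fai))
print("overlap with hg19 names:", against_hg19)
print("overlap with b37 names:", against_b37)
```

An empty overlap with one reference and a non-empty overlap with the other indicates which naming convention the VCF records follow.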

  • Sheila, Broad Institute (Member, Broadie)

    Hi Jin,

    Can you try with Picard's LiftoverVcf? http://broadinstitute.github.io/picard/command-line-overview.html#LiftoverVcf
    That should work. I think the error you are receiving is a known bug in GATK's tool.

