Lifting over a VCF from one reference to another

delangeldelangel Posts: 71GATK Developer mod
edited September 2013 in Methods and Workflows

liftOverVCF.pl

Contents

Introduction

This script converts a VCF file from one reference build to another. It runs 3 modules within our toolkit that are necessary for lifting over a VCF.
1. LiftoverVariants walker
2. sortByRef.pl to sort the lifted-over file
3. Filter out records whose ref field no longer matches the new reference

Obtaining the Script

The liftOverVCF.pl script is available in our public source repository under the 'perl' directory. Instructions for pulling down our source are available here.

Example

./liftOverVCF.pl -vcf calls.b36.vcf \
  -chain b36ToHg19.broad.over.chain \
  -out calls.hg19.vcf \
  -gatk /humgen/gsa-scr1/ebanks/Sting_dev
  -newRef /seq/references/Homo_sapiens_assembly19/v0/Homo_sapiens_assembly19
  -oldRef /humgen/1kg/reference/human_b36_both
  -tmp /broad/shptmp [defaults to /tmp]

Usage

Running the script with no arguments will show the usage:

Usage: liftOverVCF.pl
    -vcf        <input vcf>
    -gatk       <path to gatk trunk>
    -chain      <chain file>
    -newRef     <path to new reference prefix; we will need newRef.dict, .fasta, and .fasta.fai>
    -oldRef     <path to old reference prefix; we will need oldRef.fasta>
    -out        <output vcf>
    -tmp            <temp file location; defaults to /tmp>
  • The 'tmp' argument is optional. It specifies the location to write the temporary file from step 1 of the process.


Chain files

Chain files from b36/hg18 to hg19 are located here within the Broad:

   /humgen/gsa-hpprojects/GATK/data/Liftover_Chain_Files/

External users can get them off our ftp site:

   location: ftp.broadinstitute.org
   username: gsapubftp-anonymous
   path:     Liftover_Chain_Files
Post edited by Geraldine_VdAuwera on

Comments

Sign In or Register to comment.