# converting hg19 annotations to b37 coordinates

Posts: 1 | Member
edited January 2013

Hi,

We have some annotation files, for example a GTF file of UCSC's "Known Genes" in hg19 coordinates, and we'd like to convert them to b37 coordinates. What's the best way to go about this? Any assistance would be appreciated!

Thanks in advance,
Lao


• Posts: 678 | GATK Developer | mod

If you can convert those files to VCF, you can use our liftover script (described on this forum). Otherwise, you won't be able to do this through the GATK.

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
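
For the GTF case in the original question, one route outside the GATK is a plain contig rename. This is only a sketch and not something proposed in this thread. It relies on the primary chromosomes (chr1 to chr22, chrX, chrY) having the same sequence in hg19 and b37, so that only the names differ and coordinates carry over unchanged. The mitochondrial contig is deliberately dropped, because hg19's chrM and b37's MT are different sequences and their coordinates are not interchangeable. Unplaced and alternate contigs are dropped too, for simplicity. The function names are hypothetical.

```python
import re

# chr1..chr22, chrX, chrY: same sequence in hg19 and b37, only the name differs.
_PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y)$")


def hg19_to_b37_contig(name):
    """Return the b37 contig name, or None when no safe 1:1 rename is assumed."""
    match = _PRIMARY.match(name)
    return match.group(1) if match else None


def convert_gtf_lines(lines):
    """Yield GTF lines with b37 contig names.

    Comment and blank lines pass through unchanged. Records on contigs
    without a safe rename (chrM, chrUn_*, *_random, haplotypes) are dropped.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        contig, sep, rest = line.partition("\t")
        new_name = hg19_to_b37_contig(contig)
        if new_name is not None:
            yield new_name + sep + rest
```

To use it on a file, iterate over the input handle and write each yielded line to the output, for example `out.writelines(convert_gtf_lines(inp))`. Counting the dropped records before relying on the result would be a sensible check.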

• Posts: 2 | Member
edited November 2012

I am trying to convert a VCF file from build b36 to hg19. Here are the first few lines of my VCF file:

##fileformat=VCFv4.0
##INFO=<ID=Database,Number=1,Type=String,Description="Database identifier">
##INFO=<ID=Dbxref,Number=.,Type=String,Description="Database reference">
##INFO=<ID=dbID,Number=1,Type=String,Description="Database identifier">
##INFO=<ID=ID,Number=.,Type=String,Description="Chromosome or contig">
##INFO=<ID=Alias,Number=.,Type=String,Description="Mostly novel variant">
##INFO=<ID=Variant_seq,Number=.,Type=String,Description="Alternate Allele">
##INFO=<ID=Genotype,Number=.,Type=String,Description="Homozyguous or Heterozyguous">
##INFO=<ID=Reference_seq,Number=1,Type=String,Description="Ancestral allele">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
chr1    4793    .   A   G   25  .   ID=chr1:SoapSNP:SNV:4793;Alias=YHSNP0128643;Variant_seq=A,G;Reference_seq=A;Variant_reads=48,26;Total_reads=74;Genotype=heterozygous
chr1    6434    .   G   A   48  .   ID=chr1:SoapSNP:SNV:6434;Alias=YHSNP0128644;Variant_seq=A,G;Reference_seq=G;Variant_reads=10,11;Total_reads=21;Genotype=heterozygous
chr1    93896   rs4287120   T   C   51  .   ID=chr1:SoapSNP:SNV:93896;Dbxref=dbSNP:rs4287120;Variant_seq=C,T;Reference_seq=T;Variant_reads=5,4;Total_reads=9;Genotype=heterozygous
chr1    225707  rs6603780   C   G   43  .   ID=chr1:SoapSNP:SNV:225707;Dbxref=dbSNP:rs6603780;Variant_seq=C,G;Reference_seq=C;Variant_reads=23,12;Total_reads=35;Genotype=heterozygous
chr1    225839  rs6422503   C   A   31  .   ID=chr1:SoapSNP:SNV:225839;Dbxref=dbSNP:rs6422503;Variant_seq=A,C;Reference_seq=C;Variant_reads=13,5;Total_reads=18;Genotype=heterozygous
chr1    526849  .   G   T   76  .   ID=chr1:SoapSNP:SNV:526849;Alias=YHSNP0128645;Variant_seq=G,T;Reference_seq=G;Variant_reads=14,12;Total_reads=26;Genotype=heterozygous
chr1    554731  rs1832728   T   C   30  .   ID=chr1:SoapSNP:SNV:554731;Dbxref=dbSNP:rs1832728;Variant_seq=C,T;Reference_seq=T;Variant_reads=37,12;Total_reads=49;Genotype=heterozygous
chr1    555353  rs7349153   T   C   28  .   ID=chr1:SoapSNP:SNV:555353;Dbxref=dbSNP:rs7349153;Variant_seq=C,T;Reference_seq=T;Variant_reads=37,9;Total_reads=46;Genotype=heterozygous
chr1    555371  rs9283150   G   A   22  .


I validated the VCF file with the vcftools vcf-validator, but when I run the LiftOverVariants tool, it gives me this error: The provided VCF file is malformed at approximately line number 13: Trying to create a VariantContext with a ID key. Please use provided constructor argument ID.

Can someone please tell me how to fix this? Thanks.
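
One plausible reading of the error message, not confirmed anywhere in this thread, is that the INFO field contains a key literally named `ID`, which collides with the VCF `ID` column when the GATK builds its VariantContext. If that is the cause, renaming the key in both the header and the records should get past it. The sketch below does that for one line at a time. `SrcID` is a hypothetical replacement name, and the code assumes the real file is tab-separated, as the VCF format requires.

```python
NEW_KEY = "SrcID"  # hypothetical replacement name; any key not already in use works


def rename_info_id(line, new_key=NEW_KEY):
    """Rename the INFO key 'ID' in a single VCF line (header or record)."""
    header_prefix = "##INFO=<ID=ID,"
    if line.startswith(header_prefix):
        # Header declaration of the offending key.
        return "##INFO=<ID=%s," % new_key + line[len(header_prefix):]
    if line.startswith("#"):
        # Any other header line is left untouched.
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        # Record without an INFO column (e.g. truncated): leave untouched.
        return line
    entries = fields[7].split(";")
    # Only exact 'ID' keys are renamed; keys such as 'dbID' are unaffected.
    fields[7] = ";".join(
        new_key + entry[2:] if entry == "ID" or entry.startswith("ID=") else entry
        for entry in entries
    )
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline
```

Applied to every line of the input, for example `out.writelines(rename_info_id(l) for l in inp)`, this leaves everything except the `ID` INFO key as it was.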
