converting hg19 annotations to b37 coordinates

laosaal Member Posts: 1
edited January 2013 in Ask the GATK team

We have some annotation files, for example a GTF file of UCSC's "Known Genes" in hg19 coordinates. We'd like to convert this to b37 coordinates. What's the best way to go about doing this? Assistance would be appreciated!
Thanks in advance,
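
For this particular pair of builds, one approach outside the GATK is a plain contig rename. This is a minimal sketch under a stated assumption: hg19 and b37 are both based on GRCh37, so chr1-22, chrX and chrY should share coordinates and differ only in name ("chr1" vs "1"). The mitochondrial contig is left out on purpose because hg19's chrM and b37's MT are different sequences, and haplotype or unplaced contigs are dropped rather than guessed at. The function names here are illustrative, not part of any GATK tool.

```python
# Assumption: coordinates on chr1-22, X and Y are identical between hg19 and
# b37, so only the contig name in column 1 of the GTF needs to change.
# chrM is deliberately absent from the map (different mitochondrial sequence).
HG19_TO_B37 = {f"chr{n}": str(n) for n in list(range(1, 23)) + ["X", "Y"]}


def rename_gtf_line(line):
    """Return the line with its contig renamed, or None if there is no b37 match."""
    if line.startswith("#") or not line.strip():
        return line  # comments and blank lines pass through unchanged
    contig, sep, rest = line.partition("\t")
    new_contig = HG19_TO_B37.get(contig)
    if new_contig is None:
        return None  # chrM, haplotype and unplaced contigs are dropped
    return new_contig + sep + rest


def convert(lines):
    """Yield renamed GTF lines, skipping records on unmapped contigs."""
    for line in lines:
        renamed = rename_gtf_line(line)
        if renamed is not None:
            yield renamed
```

To run it as a filter, something like `sys.stdout.writelines(convert(sys.stdin))` (after `import sys`) would do. It is worth counting the dropped records before trusting the output.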


  • ebanks Broad Institute Member, Broadie, Dev Posts: 692 admin

    If you can convert those files to VCF then you can use our liftover script (described on this forum). Otherwise, you won't be able to do this through the GATK.

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

  • sej1985 Member Posts: 3
    edited November 2012

    I am trying to convert a VCF file from build b36 to hg19.
    The first few lines of my VCF file:

    ##INFO=<ID=Database,Number=1,Type=String,Description="Database identifier">
    ##INFO=<ID=Dbxref,Number=.,Type=String,Description="Database reference">
    ##INFO=<ID=dbID,Number=1,Type=String,Description="Database identifier">
    ##INFO=<ID=ID,Number=.,Type=String,Description="Chromosome or contig">
    ##INFO=<ID=Alias,Number=.,Type=String,Description="Mostly novel variant">
    ##INFO=<ID=Variant_seq,Number=.,Type=String,Description="Alternate Allele">
    ##INFO=<ID=Genotype,Number=.,Type=String,Description="Homozyguous or Heterozyguous">
    ##INFO=<ID=Variant_reads,Number=.,Type=Integer,Description="Number of reads where variant present">
    ##INFO=<ID=Total_reads,Number=.,Type=Integer,Description="Total number of reads">
    ##INFO=<ID=Reference_seq,Number=1,Type=String,Description="Ancestral allele">
    #CHROM POS ID  REF ALT QUAL    FILTER  INFO
    chr1    4793    .   A   G   25  .   ID=chr1:SoapSNP:SNV:4793;Alias=YHSNP0128643;Variant_seq=A,G;Reference_seq=A;Variant_reads=48,26;Total_reads=74;Genotype=heterozygous
    chr1    6434    .   G   A   48  .   ID=chr1:SoapSNP:SNV:6434;Alias=YHSNP0128644;Variant_seq=A,G;Reference_seq=G;Variant_reads=10,11;Total_reads=21;Genotype=heterozygous
    chr1    93896   rs4287120   T   C   51  .   ID=chr1:SoapSNP:SNV:93896;Dbxref=dbSNP:rs4287120;Variant_seq=C,T;Reference_seq=T;Variant_reads=5,4;Total_reads=9;Genotype=heterozygous
    chr1    225707  rs6603780   C   G   43  .   ID=chr1:SoapSNP:SNV:225707;Dbxref=dbSNP:rs6603780;Variant_seq=C,G;Reference_seq=C;Variant_reads=23,12;Total_reads=35;Genotype=heterozygous
    chr1    225839  rs6422503   C   A   31  .   ID=chr1:SoapSNP:SNV:225839;Dbxref=dbSNP:rs6422503;Variant_seq=A,C;Reference_seq=C;Variant_reads=13,5;Total_reads=18;Genotype=heterozygous
    chr1    526849  .   G   T   76  .   ID=chr1:SoapSNP:SNV:526849;Alias=YHSNP0128645;Variant_seq=G,T;Reference_seq=G;Variant_reads=14,12;Total_reads=26;Genotype=heterozygous
    chr1    554731  rs1832728   T   C   30  .   ID=chr1:SoapSNP:SNV:554731;Dbxref=dbSNP:rs1832728;Variant_seq=C,T;Reference_seq=T;Variant_reads=37,12;Total_reads=49;Genotype=heterozygous
    chr1    555353  rs7349153   T   C   28  .   ID=chr1:SoapSNP:SNV:555353;Dbxref=dbSNP:rs7349153;Variant_seq=C,T;Reference_seq=T;Variant_reads=37,9;Total_reads=46;Genotype=heterozygous
    chr1    555371  rs9283150   G   A   22  .   

    I validated the VCF file with vcftools vcf-validator, but when I run the LiftOverVariants tool, it gives me this error:
    The provided VCF file is malformed at approximately line number 13: Trying to create a VariantContext with a ID key. Please use provided constructor argument ID.

    Please can someone tell me how to fix this?

  • Mark_DePristo Broad Institute Member Posts: 153 admin

    You cannot have an ID key in the INFO field.

    Mark A. DePristo, Ph.D.
    Co-Director, Medical and Population Genetics
    Broad Institute of MIT and Harvard

  • ebanks Broad Institute Member, Broadie, Dev Posts: 692 admin

    @sej1985 - Mark is correct that in GATK 2.2 "ID" is an invalid key for the INFO field of the VCF. However, this restriction will be lifted in our 2.3 release, whenever that comes out. Thanks for posting this.

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT
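
Until a release without that restriction is available, one possible workaround is to rename the offending INFO key before running the tool. The sketch below is only an illustration: it assumes a tab-delimited VCF, and the replacement name `FeatureID` is hypothetical (any key not already declared in the header would do). It rewrites both the `##INFO` header declaration and the matching entry in each record's INFO column, leaving keys such as `dbID` alone.

```python
OLD_KEY = "ID"
NEW_KEY = "FeatureID"  # hypothetical replacement name; pick any unused key


def rename_info_key(line, old=OLD_KEY, new=NEW_KEY):
    """Rename one INFO key in a single VCF header or record line."""
    declaration = "##INFO=<ID=" + old + ","
    if line.startswith(declaration):
        # Header declaration of the key itself
        return "##INFO=<ID=" + new + "," + line[len(declaration):]
    if line.startswith("#"):
        return line  # all other header lines are untouched
    ending = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line  # no INFO column (e.g. a truncated record); leave as-is
    entries = fields[7].split(";")
    # Match the exact key only, either as a flag ("ID") or with a value ("ID=...")
    fields[7] = ";".join(
        new + e[len(old):] if e == old or e.startswith(old + "=") else e
        for e in entries
    )
    return "\t".join(fields) + ending
```

Applying `rename_info_key` to every line of the input and writing the results to a new file should give a VCF whose INFO field no longer uses `ID` as a key. Re-running a validator on the output is a sensible check.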
