convert hapmap to vcf format

blueskypy Posts: 213 Member
edited February 27 in Ask the GATK team

I’d like to convert a hapmap file to vcf. The hapmap file is from http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/latest/forward/non-redundant/genotypes_chr1_ASW_r27_nr.b36_fwd.txt.gz

I have a few questions about the following command, taken from the VariantsToVCF documentation at http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_VariantsToVCF.html#--dbsnp

java -Xmx2g -jar GenomeAnalysisTK.jar \
   -R ref.fasta \
   -T VariantsToVCF \
   -o output.vcf \
   --variant:RawHapMap input.hapmap \
   --dbsnp dbsnp.vcf

  1. Since the hapmap file is on reference build b36, should ref.fasta be b36 as well? b37 is available everywhere, but the only b36 I can find is build 36.3 at ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/NCBI/build36.3/Homo_sapiens_NCBI_build36.3.tar.gz; is that OK to use?
  2. What is the purpose of “--dbsnp” here, and should the dbSNP file be built on b36 as well?
  3. How do I use the codec at http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_hapmap_RawHapMapCodec.html? Is it already built into VariantsToVCF, or do I have to download a codec file from somewhere?
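
For question 1, one way to double-check which build the hapmap file itself reports before choosing a reference is to read its assembly column. Below is a minimal sketch, not a GATK feature. It assumes the file is whitespace-delimited with a single header line and that the sixth column holds the assembly label; the function names and the column index are my own assumptions and should be checked against the actual header.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, "rt")


def hapmap_assemblies(lines, assembly_col=5, max_rows=1000):
    """Collect the distinct values found in the assumed assembly column.

    assembly_col=5 (the sixth column) is an assumption about the layout;
    adjust it after inspecting the header line of the real file.
    """
    seen = set()
    for i, line in enumerate(lines):
        if i == 0:
            continue  # skip the header line
        if i > max_rows:
            break  # a sample of rows is enough for a sanity check
        fields = line.split()
        if len(fields) > assembly_col:  # ignore blank or short lines
            seen.add(fields[assembly_col])
    return seen


# Example use on the downloaded file:
# with open_text("genotypes_chr1_ASW_r27_nr.b36_fwd.txt.gz") as handle:
#     print(hapmap_assemblies(handle))
```

If this reports a single b36-style label, that at least confirms the coordinates in the file are on build 36, so the reference and dbSNP file would need to match it.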



