
Merge BAM to VCF. Which is the best workflow?

cardilloxcardillox Member
edited January 2013 in Ask the GATK team

Dear All,
I am very new to the analysis of NGS data.

I would like to merge the data for sample 1029 from HGDP (http://cdna.eva.mpg.de/denisova/VCF/human/HGDP01029.hg19_1000g.12.mod.vcf.gz) with the SAN sample from Schuster et al. 2010 (ftp://ftp.bx.psu.edu/data/bushman/hg18/bam/KB1illumChr12.bam).

If I understood correctly, I should call the variants from the BAM file and then merge the result with the VCF. Is that correct? Could you kindly suggest the best way to do this, in your opinion? And at what point should I convert my files to the same reference sequence?

In addition, I am looking at http://gatkforums.broadinstitute.org/discussion/1186/best-practice-variant-detection-with-the-gatk-v4-for-release-2-0,
and I am trying to run variant detection on the example file NA12878. I have some questions:
Where can I find the MarkDuplicates tool? Should I invoke it with just the -T argument, or do I need to install it separately?

I am really sorry; I am trying to understand GATK, but it is not really intuitive, so if you have any tips or recommendations, please let me know.

Post edited by Geraldine_VdAuwera on

Best Answer


  • cardilloxcardillox Member
    edited November 2012

    Thank you. I hope to solve this problem, since I have some files in VCF format and one in BAM format.

    Thank you for your help!
