
tagSNPs with GATK

biobio Europe Member

I am currently evaluating different methods to select tagSNPs for a gene. Is it possible to identify tagSNPs for a gene with GATK by scanning a given list of SNPs?
Thank you very much.


  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi there,

    If I understand correctly, you want to test samples for variation at specific sites? If so, yes, you can do that with GATK. The exact workflow to follow depends on how your data was generated. Is it whole genome data, exome data, or gene panel data?

  • biobio Europe Member

    Thank you for your reply.
    I want to use publicly available data about SNPs at one site in order to identify tagSNPs as markers for two possible haplotypes. The workflow I had in mind was to download a list of SNPs for this gene, convert the list into the formats GATK requires, and then use GATK to determine the linkage disequilibrium between the SNPs and, from that, the tagSNPs.

  • tommycarstensen United Kingdom Member ✭✭✭
    edited March 2015

    @biobio Is it perhaps an option for you to use PLINK1.9, if you are dealing with just one site? If you want to use haplotype input instead of estimating LD with the EM algorithm from unphased genotypes (and your data is in the SHAPEIT2 .haps format), then I have a Python script for calculating LD:
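For illustration only (this is not the script mentioned in the post above), here is a minimal sketch of how pairwise LD can be computed directly from phased haplotypes. It assumes each SNP is biallelic and has already been read into a list of 0/1 alleles, one entry per haplotype; the function name `ld_stats` and this input layout are hypothetical, and parsing of the .haps file is left out.

```python
from typing import Sequence, Tuple


def ld_stats(hap_a: Sequence[int], hap_b: Sequence[int]) -> Tuple[float, float]:
    """Return (D, r^2) for two biallelic SNPs.

    hap_a and hap_b hold the allele (0 or 1) carried by each phased
    haplotype at SNP A and SNP B, in the same haplotype order.
    (Hypothetical helper; the input layout is an assumption.)
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("both SNPs need the same non-zero number of haplotypes")

    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    # frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d, d * d / denom


# Two SNPs that always travel together on the same haplotypes.
d, r2 = ld_stats([0, 0, 1, 1], [0, 0, 1, 1])
print(d, r2)
```

SNP pairs with r² close to 1 carry nearly the same information, which is the property a tagSNP selection step relies on; the threshold to use is a choice left to the analyst.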

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Oh I see, thanks for clarifying. That's not something you would use GATK for, so @tommycarstensen's recommendation of using PLINK is more appropriate.
