tagSNPs with GATK

biobio Europe Member

I am currently evaluating different methods to select tagSNPs for a gene. Is it possible to identify tagSNPs for a gene with GATK by scanning a given list of SNPs?
Thank you very much.


  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi there,

    If I understand correctly, you want to test samples for variation at specific sites? If so, yes, you can do that with GATK. The exact workflow to follow depends on how your data was generated. Is it whole-genome data, exome data, or gene panel data?

  • biobio Europe Member

    Thank you for your reply.
    I want to use publicly available SNP data for one site to identify tagSNPs that can serve as markers for two possible haplotypes. The workflow I had in mind was to download a list of SNPs for this gene, convert the list into the formats GATK requires, and then use GATK to determine the linkage disequilibrium between the SNPs and, from that, the tagSNPs.

  • tommycarstensen United Kingdom Member ✭✭✭
    edited March 2015

    @biobio Is it perhaps an option for you to use PLINK 1.9, if you are dealing with just one site? If you want to use haplotype input instead of estimating LD with the EM algorithm from unphased genotypes (and your data is in the SHAPEIT2 .haps format), then I have a Python script for calculating LD:

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Oh I see, thanks for clarifying -- that's not something you would use GATK for. Then @tommycarstensen's recommendation of using PLINK is more appropriate.
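
The haplotype-based LD calculation mentioned in the thread can be sketched in Python. This is not @tommycarstensen's script, which is not reproduced here. It is a minimal illustration that assumes biallelic SNPs, phased alleles coded 0/1, and the usual SHAPEIT2 .haps layout with five leading columns (chromosome, SNP ID, position, and the two alleles). The function names and the r² threshold of 0.8 are assumptions chosen for the example, and the greedy selection is just one simple tagging heuristic, not necessarily what PLINK does.

```python
from itertools import combinations


def parse_haps_line(line):
    """Split one .haps line into (snp_id, list of 0/1 haplotype alleles)."""
    fields = line.split()
    # Assumed layout: chrom, SNP ID, position, allele 0, allele 1, then alleles.
    return fields[1], [int(a) for a in fields[5:]]


def r_squared(hap_a, hap_b):
    """Pairwise r^2 between two biallelic SNPs from phased 0/1 haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n  # 1-1 haplotype freq
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic SNP: r^2 is undefined, reported as 0 here
    d = p_ab - p_a * p_b                      # LD coefficient D
    return d * d / denom


def greedy_tag_snps(haps, threshold=0.8):
    """Greedy tagging: repeatedly pick the untagged SNP that covers the most
    remaining SNPs at r^2 >= threshold (hypothetical helper for illustration)."""
    r2 = {}
    for a, b in combinations(haps, 2):
        r2[a, b] = r2[b, a] = r_squared(haps[a], haps[b])

    untagged = set(haps)

    def covered(snp):
        return {s for s in untagged if s == snp or r2[snp, s] >= threshold}

    tags = []
    while untagged:
        # sorted() makes tie-breaking deterministic (alphabetical by SNP ID).
        best = max(sorted(untagged), key=lambda s: len(covered(s)))
        tags.append(best)
        untagged -= covered(best)
    return tags


# Toy input: three SNPs, two samples (four haplotypes). Made-up data.
lines = [
    "1 rs1 100 A G 0 0 1 1",
    "1 rs2 200 C T 0 0 1 1",
    "1 rs3 300 G A 0 1 0 1",
]
haps = dict(parse_haps_line(line) for line in lines)
print(greedy_tag_snps(haps))  # ['rs1', 'rs3']
```

In the toy data rs1 and rs2 are in perfect LD, so one of them tags the other, while rs3 is independent of both and has to be kept as its own tag. For real data, a dedicated tool such as PLINK, as recommended above, is the more appropriate choice.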
