Run GATK for multiple populations

Jautis (Duke) Member

Hi, I have sequencing data for two sister populations which I wish to call genotypes for. While I expect variants to be shared between populations, the allele frequencies may be different. Is there a way I can specify that these samples come from two source populations for GATK's joint haplotype caller?

Thanks!

Answers

  • Sheila (Broad Institute) Member, Broadie, Moderator admin

    @Jautis
    Hi,

    Can you tell us a little bit more about these populations and what your end goal is?

    Thanks,
    Sheila

  • Jautis (Duke) Member

    Hi, these are two species of baboons. The end goal is to do local ancestry calling, but the immediate problem is that I need independent allele frequency estimates for the two populations. The species likely share variable sites, so information about known variants in one population is useful for the other population, but we don't want genotype calls or allele frequency estimates in one species to be informed by the allele frequency in the other species.

    Sorry for the delayed response! I didn't realize I needed to opt in to receive e-mail notifications on posts.

  • bhanuGandham Member, Administrator, Broadie, Moderator admin

    Hi @Jautis

    One way to do this would be to run HaplotypeCaller separately on the two populations and then use MergeVcfs to combine and compare the variants. That way, the AF from one population will not interfere with the AF from the other.
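
For anyone who wants to script the per-population calling suggested above, here is a minimal sketch in Python that only builds the GATK4 command lines and prints them; it does not run anything. It assumes a GVCF-based workflow (HaplotypeCaller with `-ERC GVCF`, then CombineGVCFs and GenotypeGVCFs within each population), which is one possible reading of "separately" and not something the answer spells out. The helper name `per_population_commands`, the population labels, and all file names are hypothetical placeholders. The final combine-and-compare step is left out, since how best to do that depends on your data.

```python
import shlex
from typing import Dict, List


def per_population_commands(reference: str,
                            populations: Dict[str, List[str]]) -> List[List[str]]:
    """Build GATK4 command lines that call each population on its own.

    Assumption: a GVCF workflow. Samples from different populations never
    appear in the same CombineGVCFs/GenotypeGVCFs command, so joint
    genotyping in one population never sees the other population's data.
    """
    commands: List[List[str]] = []
    for pop, bams in populations.items():
        gvcfs = []
        for bam in bams:
            # Hypothetical naming scheme: sample.bam -> sample.g.vcf.gz
            gvcf = bam.rsplit(".", 1)[0] + ".g.vcf.gz"
            commands.append(["gatk", "HaplotypeCaller",
                             "-R", reference, "-I", bam,
                             "-O", gvcf, "-ERC", "GVCF"])
            gvcfs.append(gvcf)

        # Combine only this population's GVCFs.
        combined = f"{pop}.combined.g.vcf.gz"
        combine = ["gatk", "CombineGVCFs", "-R", reference, "-O", combined]
        for gvcf in gvcfs:
            combine += ["-V", gvcf]
        commands.append(combine)

        # Joint-genotype within the population only.
        commands.append(["gatk", "GenotypeGVCFs",
                         "-R", reference, "-V", combined,
                         "-O", f"{pop}.vcf.gz"])
    return commands


if __name__ == "__main__":
    # Placeholder inputs; substitute your own reference and BAM files.
    example = {
        "speciesA": ["a1.bam", "a2.bam"],
        "speciesB": ["b1.bam"],
    }
    for cmd in per_population_commands("reference.fasta", example):
        print(shlex.join(cmd))
```

Each printed line can be reviewed and then run in a shell. The result is one VCF per population (`speciesA.vcf.gz` and `speciesB.vcf.gz` in this placeholder setup), each genotyped without reference to the other population's samples.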
