CalculateGenotypePosteriors - do I need to run this on each trio individually?

cmorgan1, London (Member)


I have a query regarding the “CalculateGenotypePosteriors” algorithm in GATK and would really appreciate insight from one of the team members.

I have 10 trios and I want to identify de novo mutations.

When I run CalculateGenotypePosteriors, is there any difference in how the genotype posteriors are calculated if I run it on each trio individually, or can I submit them all together with one PED file defining each of the 10 unique families?

In addition, my VCF file contains empirical AF values that were calculated across a larger set of 1900 individuals; my 10 families (30 individuals) are a subset of these.

When CalculateGenotypePosteriors and the subsequent PossibleDeNovo variant annotation are run, are the AC and AF taken from what is explicitly stated in the VCF file (i.e. the AF calculated from the 1900 individuals), or are they re-estimated from the individuals supplied in the VCF file?

Finally, if I am interested in de novo mutations in 10 trios, is it fine to provide a VCF with all 1900 individuals, as long as the corresponding PED file identifies which individuals are part of a family and which are not? Would this have an impact on how CalculateGenotypePosteriors is performed?
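
To make the single-pedigree option concrete, here is a minimal Python sketch that writes one PED file covering several trios. It assumes the standard six-column, tab-separated PED layout (family ID, individual ID, father ID, mother ID, sex, phenotype), with `0` for an unknown parent and `-9` for a missing phenotype. The `Trio` type, the `build_ped` helper and all sample IDs are hypothetical and not part of GATK, and the sketch says nothing about whether joint and per-trio runs give different posteriors.

```python
from typing import Iterable, List, NamedTuple


class Trio(NamedTuple):
    """One father/mother/child family (hypothetical helper type)."""
    family_id: str
    father_id: str
    mother_id: str
    child_id: str
    child_sex: int = 0  # PED sex code: 1 = male, 2 = female, 0 = unknown


MISSING_PARENT = "0"      # founders have no parents listed
MISSING_PHENOTYPE = "-9"  # assumption: phenotype is left unset for everyone


def trio_to_ped_rows(trio: Trio) -> List[str]:
    """Return the three tab-separated PED rows for a single trio."""
    rows = [
        (trio.family_id, trio.father_id, MISSING_PARENT, MISSING_PARENT,
         "1", MISSING_PHENOTYPE),
        (trio.family_id, trio.mother_id, MISSING_PARENT, MISSING_PARENT,
         "2", MISSING_PHENOTYPE),
        (trio.family_id, trio.child_id, trio.father_id, trio.mother_id,
         str(trio.child_sex), MISSING_PHENOTYPE),
    ]
    return ["\t".join(row) for row in rows]


def build_ped(trios: Iterable[Trio]) -> str:
    """Concatenate the rows of every trio into one PED file body."""
    lines: List[str] = []
    for trio in trios:
        lines.extend(trio_to_ped_rows(trio))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample IDs; real IDs would have to match the VCF sample names.
    example = [
        Trio("FAM01", "FAM01_father", "FAM01_mother", "FAM01_child", 2),
        Trio("FAM02", "FAM02_father", "FAM02_mother", "FAM02_child", 1),
    ]
    print(build_ped(example), end="")
```

Each trio keeps its own family ID, so the 10 families stay distinct within the one file; individuals outside the trios are simply not listed in this sketch.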

Thanks in advance for your guidance. I was discussing the above with one of my colleagues and we were unclear on whether the different approaches would impact the results.



  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @cmorgan1

    We are sorry, but forum volume is high right now and we are focusing only on tool error questions. We are unable to answer experimental design questions at the moment. Please feel free to look through the GATK user guide for more information on the functionality of the tools.
