CalculateGenotypePosteriors - do I need to run this on each trio individually?
I have a query regarding the “CalculateGenotypePosteriors” algorithm in GATK and would really appreciate insight from one of the team members.
I have 10 trios and I want to identify de novo mutations.
When I run CalculateGenotypePosteriors, is there any difference in how the genotype posteriors are calculated if I run it on each trio individually, or can I submit all the trios together with one ped file defining each of the 10 unique families?
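To make the "one ped file" option concrete, here is a minimal Python sketch of what I mean by a single pedigree file covering all 10 trios. It uses the standard six-column PED layout (family ID, individual ID, paternal ID, maternal ID, sex, phenotype). The family and sample IDs, the unknown sex for the children, and the affected/unaffected codes are all placeholders I made up for illustration; real IDs would have to match the sample names in the VCF.

```python
# Sketch: build one PED file describing several trios.
# All family IDs and sample IDs below are hypothetical placeholders.

def trio_ped_lines(family_id, father, mother, child, child_sex=0, child_phenotype=2):
    """Return the three tab-separated PED rows for one trio.

    Columns: family ID, individual ID, paternal ID, maternal ID,
    sex (1 = male, 2 = female, 0 = unknown),
    phenotype (1 = unaffected, 2 = affected, 0 = missing).
    Parents are written as founders (paternal and maternal ID set to 0).
    """
    rows = [
        (family_id, father, "0", "0", 1, 1),
        (family_id, mother, "0", "0", 2, 1),
        (family_id, child, father, mother, child_sex, child_phenotype),
    ]
    return ["\t".join(str(field) for field in row) for row in rows]


def build_ped(n_trios):
    """Return PED rows for n_trios families named FAM01, FAM02, ..."""
    lines = []
    for i in range(1, n_trios + 1):
        fam = f"FAM{i:02d}"
        lines.extend(
            trio_ped_lines(fam, f"{fam}_father", f"{fam}_mother", f"{fam}_child")
        )
    return lines


ped_lines = build_ped(10)
print("\n".join(ped_lines))
```

The alternative I am asking about would be ten separate files of three rows each, one per family, with the tool run once per file.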
In addition, my VCF file contains an empirical AF that was calculated across a larger set of 1900 individuals; my 10 families (30 individuals) are a subset of these.
When CalculateGenotypePosteriors and the subsequent annotation of PossibleDeNovo variants with VariantAnnotator are run, are the AC and AF taken from what is explicitly stated in the VCF file (i.e. the AF calculated from the 1900 individuals), or are they re-estimated from the individuals supplied in the VCF file?
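The reason this matters to me is that the two numbers can differ a lot. The toy Python sketch below only illustrates the arithmetic (AF = AC / AN over called alleles) and says nothing about what GATK actually does internally; the genotypes are invented, with the 30 "family" samples placed first in the list.

```python
# Sketch: allele frequency annotated from a full cohort vs. recomputed on a subset.
# Genotypes are invented toy data for a single biallelic site.

def allele_counts(genotypes):
    """Return (AC, AN, AF) for ALT allele 1.

    genotypes: list of tuples of allele indices (0 = REF, 1 = ALT),
    with None marking a missing allele call.
    """
    ac = sum(1 for gt in genotypes for allele in gt if allele == 1)
    an = sum(1 for gt in genotypes for allele in gt if allele is not None)
    af = ac / an if an else float("nan")
    return ac, an, af


HET = (0, 1)
HOM_REF = (0, 0)

# 1900 individuals in total; the first 30 stand in for the 10 trios.
cohort = [HET] * 3 + [HOM_REF] * 27 + [HET] * 16 + [HOM_REF] * 1854
families = cohort[:30]

cohort_ac, cohort_an, cohort_af = allele_counts(cohort)
family_ac, family_an, family_af = allele_counts(families)

print(f"cohort:   AC={cohort_ac} AN={cohort_an} AF={cohort_af:.4f}")
print(f"families: AC={family_ac} AN={family_an} AF={family_af:.4f}")
```

In this made-up case the cohort-wide AF is ten times lower than the AF recomputed from the 30 family members alone, which is why I would like to know which value is used.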
Finally, if I am interested in de novo mutations in 10 trios, is it fine to provide a VCF with all 1900 individuals, as long as the corresponding ped file identifies which individuals are part of a family and which are not? Would this have an impact on how CalculateGenotypePosteriors is performed?
Thanks in advance for your guidance. I was discussing the above with one of my colleagues and we were unclear on whether the different approaches would affect the results.