ReadBacked phasing vs Trio phasing?

I think I understand the technical difference. But in terms of phasing quality, how does one compare to the other? Are there any publications, reports, or blog posts comparing the two? Is there a quantifiable metric that shows how different the estimates are?
Best Answer
Geraldine_VdAuwera Cambridge, MA admin
It's pretty apples-and-oranges. They're largely complementary. The physical phasing just tells you which alleles segregate together; whereas the pedigree-based phasing additionally tells you (to some extent) which parent provided which alleles. From a strictly technical standpoint you could argue that physical phasing is superior because it relies on fewer assumptions (mostly just the basecalls and read mappings) compared to pedigree phasing which relies on genotypes (themselves dependent on basecalls and read mappings), and therefore sits on top of a bigger pile of assumptions. But at the same time the pedigree phasing tools have access to a scope of information that the physical phasing tools don't, and can phase things at greater distances.
So in short I'm not convinced it's useful to view them as either-or. If you can do both, and the info each provides is useful to you, I would do both.
For the record, the reason we don't have tools dedicated to pedigree phasing (or even standalone physical phasing, anymore) is because we consider that those are largely downstream analyses that we don't have the bandwidth to focus on, and others have this covered.
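To make the pedigree side of this concrete, here is a minimal sketch of phasing a child's genotype at a single site from called parental genotypes. It is only an illustration of the idea, not how any particular tool does it; the function name `phase_child_by_trio` and the tuple representation of genotypes are assumptions made up for this example. Note that it works purely from genotypes, so any genotyping error feeds straight into the phase.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]  # unordered pair of alleles, e.g. ("A", "G")


def phase_child_by_trio(
    child: Genotype, mother: Genotype, father: Genotype
) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (maternal_allele, paternal_allele).

    Returns None when the site is ambiguous (for example, all three
    individuals are heterozygous for the same alleles) or when the
    genotypes are inconsistent with Mendelian inheritance.
    """
    a, b = child
    candidates = set()
    # Try both assignments of the child's alleles to the two parents.
    for maternal, paternal in ((a, b), (b, a)):
        if maternal in mother and paternal in father:
            candidates.add((maternal, paternal))
    if len(candidates) == 1:
        return candidates.pop()
    return None


# Mother is homozygous A, so the child's G must have come from the father.
informative = phase_child_by_trio(("A", "G"), ("A", "A"), ("A", "G"))

# Everyone is A/G: the genotypes alone cannot say which parent gave which allele.
ambiguous = phase_child_by_trio(("A", "G"), ("A", "G"), ("A", "G"))
```

Because the parental origin is resolved independently at each informative site, two sites can be phased relative to each other no matter how far apart they are, which is the "greater distances" advantage mentioned above.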
Answers
@rmf
Hi,
Read-backed phasing does physical phasing based on the variants observed in the reads. Trio phasing does phasing based on the known genotypes of the parents. Considering that we have only retained tools that do physical phasing (read-backed phasing), I think the team must be pretty confident that this type is good enough.
I have to check with the team and get back to you on some good publications.
-Sheila
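As a rough illustration of what "physical phasing based on the variants in the reads" means, here is a minimal sketch that phases two heterozygous sites by counting which allele pairs are seen together on the same read. This is a simplified majority vote under assumed inputs, not the algorithm used by any GATK tool (real tools weigh base and mapping qualities); the function name `phase_pair_from_reads`, the dict-per-read representation, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Read = Dict[int, str]  # position -> base observed on that read (assumed format)
Haplotypes = Tuple[Tuple[str, str], Tuple[str, str]]


def phase_pair_from_reads(
    reads: List[Read],
    pos1: int,
    pos2: int,
    alleles1: Tuple[str, str],
    alleles2: Tuple[str, str],
    min_support: int = 2,
) -> Optional[Haplotypes]:
    """Hypothetical helper: phase two het sites from reads spanning both."""
    counts: Counter = Counter()
    for read in reads:
        # Only reads covering both sites carry any phase information.
        if pos1 in read and pos2 in read:
            b1, b2 = read[pos1], read[pos2]
            if b1 in alleles1 and b2 in alleles2:
                counts[(b1, b2)] += 1

    a1, a2 = alleles1
    c1, c2 = alleles2
    cis = counts[(a1, c1)] + counts[(a2, c2)]
    trans = counts[(a1, c2)] + counts[(a2, c1)]

    if cis >= min_support and cis > trans:
        return ((a1, c1), (a2, c2))
    if trans >= min_support and trans > cis:
        return ((a1, c2), (a2, c1))
    return None  # too few spanning reads, or conflicting evidence


nearby_reads = [
    {100: "A", 150: "C"},
    {100: "A", 150: "C"},
    {100: "G", 150: "T"},
    {100: "A"},  # covers only the first site, so it is ignored
]
nearby = phase_pair_from_reads(nearby_reads, 100, 150, ("A", "G"), ("C", "T"))

# Sites too far apart for any single read to span both cannot be phased this way.
distant = phase_pair_from_reads(
    [{100: "A"}, {5000: "C"}], 100, 5000, ("A", "G"), ("C", "T")
)
```

The second call shows the main limitation: if no read (or read pair) covers both sites, there is no physical evidence to link them.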