ReadBackedPhasing --respectPhaseInput

dmyersturnbull (Stanford University; Member)

The 2013 "best practices" workshop slides recommend running PhaseByTransmission followed by ReadBackedPhasing --respectPhaseInput.

  1. The --respectPhaseInput option is not currently listed in the documentation. Does that mean that RBP now always respects phasing in the input VCF?

  2. Does (or did) --respectPhaseInput cause phased sites in the input to be assumed correct, or are they just ignored? That is, does RBP --respectPhaseInput use the partial haplotypes from the input file as evidence?


  • Sheila (Broad Institute; Member, Broadie ✭✭✭✭✭)

    Hi Douglas,

    You no longer need to use ReadBackedPhasing. HaplotypeCaller does physical phasing for you. We only use ReadBackedPhasing for merging MNPs. The updated slides from our latest workshop are here: https://www.broadinstitute.org/gatk/blog?id=5338

    It is true that the --respectPhaseInput argument is gone. We will check with the developers on what the current behavior is for dealing with phased input.


  • dmyersturnbull (Stanford University; Member)

    Hi Sheila,

    I'd forgotten that about HaplotypeCaller.

    Will PhaseByTransmission respect previous phasing or use it as evidence? I noticed that HaplotypeCaller uses the annotations PID and PGT rather than GT (I never see a | in the GT from HC). Does PBT use those?

    I'm trying to use ShapeIt2 for phasing, but it can't phase variants that aren't in the reference panel (1000G phase 3), so I'm hoping to fill those in afterward using PBT and either RBP or HC's PGT annotations. Filling in the unphased GTs using PGT annotations seems like it could result in inconsistent haplotypes, so I'm wondering whether PBT and RBP can "build off of" ShapeIt2's partial phasing. That's why I'm interested in this.
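
    To make the PGT fill-in idea concrete, here is a minimal sketch in plain Python (no VCF library). The function names and the example sample strings are hypothetical, and the rule of accepting PGT only when it carries the same alleles as GT is an assumption made for illustration, not documented GATK behavior.

```python
def parse_sample(format_field, sample_field):
    """Map FORMAT keys to one sample's values, e.g. {'GT': '0/1', 'PGT': '0|1'}."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    return dict(zip(keys, values))


def phased_gt_from_pgt(format_field, sample_field):
    """Return (genotype, phase_group): (PGT, PID) when usable, else (GT, None)."""
    call = parse_sample(format_field, sample_field)
    gt = call.get("GT", "./.")
    pgt = call.get("PGT", ".")
    pid = call.get("PID", ".")

    # No physical phasing recorded for this call: keep the unphased GT.
    if pgt == "." or pid == "." or "|" not in pgt:
        return gt, None

    # Assumption: only trust PGT when it carries the same alleles as GT.
    if sorted(pgt.split("|")) != sorted(gt.replace("|", "/").split("/")):
        return gt, None

    return pgt, pid


# Hypothetical sample column in the style of HaplotypeCaller output.
print(phased_gt_from_pgt("GT:AD:DP:GQ:PGT:PID:PL",
                         "0/1:10,12:22:99:0|1:1000_A_G:300,0,250"))
```

    The returned phase is only meaningful relative to other records that share the same PID, so each PID group would still have to be oriented against the ShapeIt2 haplotypes before the two could be merged. That is the inconsistency I'm worried about.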


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    @dmyersturnbull PhaseByTransmission doesn't know about HC's phasing tags, so it will neither take them into account nor change them in any way. The two tools store phasing information in different forms on purpose, to avoid collisions.

    I looked briefly at the PBT and RBP code, but it wasn't obvious to me how the tools treat incoming phasing info. The phasing tools are developed by external collaborators, @mfromer and @Laurent; hopefully they can have a look and answer your questions. If not, I'll try to hunt down the answer from the rest of the team next week.

  • dmyersturnbull (Stanford University; Member)

    Thanks, @Sheila, @Geraldine_VdAuwera, and @Laurent. That clears it up completely!


