How does PhaseByTransmission deal with X chromosome variants?

I have some new VCF files that are phased by transmission. Numerous calls change from 1/1 with HaplotypeCaller to 1|0 after phasing. Can you tell me why this is the case, or is it because PBT doesn't handle X or Y variants properly?

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @jhunter
    Hi,

    Can you post some examples where the genotype changed? What were the steps you ran prior to Phase By Transmission? Did you follow the Genotype Refinement Workflow? http://gatkforums.broadinstitute.org/discussion/4723/genotype-refinement-workflow

    -Sheila

  • jhunter, tgen (Member)

    Here are some examples. Each variant is listed twice: the first row is the unphased HaplotypeCaller genotype, and the second row is the same HC file after PBT. I didn't build the pipeline; I am mostly incompetent when it comes to bioinformatics. I will forward your question to our bioinformatician. Either way, an X genotype in a male child should always be hemizygous and inherited from the mother, and should never carry an X variant from the father. The examples below show calls of 1/1 child, 0/1 mother, 0/0 father that change to 1|0, 1|0, 0|0. Also see 0/0 child, 0/0 mother, 1/1 father going to 0|1, 0|0, 1|1. It seems PBT is treating X and Y as autosomes, which they are not.

    LUZP4 chrX:114524365 rs10482480 LUZP4(P14S) missense_variant Yes PASS 1129.9 1/1 0/1 0/0
    LUZP4 chrX:114524365 rs10482480 LUZP4(P14S) missense_variant Yes PASS 1129.9 1|0 1|0 0|0
    RHOXF2 chrX:119293293 rs142899626 RHOXF2(R151H) DNA-binding region (Homeobox ) missense_variant Yes DNA-binding_region:Homeobox LOW PASS 1080.13 0/0 0/0 0/1
    RHOXF2 chrX:119293293 rs142899626 RHOXF2(R151H) DNA-binding region (Homeobox ) missense_variant Yes DNA-binding_region:Homeobox LOW PASS 1080.13 0|0 0|0 0|1
    ZBTB33 chrX:119387946 rs75782705 ZBTB33(I226V) missense_variant Yes PASS 3685.14 0/0 0/0 1/1
    ZBTB33 chrX:119387946 rs75782705 ZBTB33(I226V) missense_variant Yes PASS 3685.14 0/0 0/0 1/1
    CXorf64 chrX:125954900 rs41309536 CXorf64(Y93Y) synonymous_variant None PASS 1841.9 1/1 0/1 0/0
    CXorf64 chrX:125954900 rs41309536 CXorf64(Y93Y) synonymous_variant None PASS 1841.9 1|0 1|0 0|0
    FRMD7 chrX:131212572 rs181962233 FRMD7(M491I) missense_variant Yes PASS 1410.13 0/0 0/1 0/0
    FRMD7 chrX:131212572 rs181962233 FRMD7(M491I) missense_variant Yes PASS 1410.13 0|0 0|1 0|0
    SOX3 chrX:139586617 rs45586631 SOX3(Y203Y) synonymous_variant ;not_specified DNA-binding_region:HMG_box LOW PASS 2054.14 0/0 0/0 1/1
    SOX3 chrX:139586617 rs45586631 SOX3(Y203Y) synonymous_variant ;not_specified DNA-binding_region:HMG_box LOW PASS 2054.14 0|1 0|0 1|1
    ACE2 chrX:15582209 rs35803318 ACE2(V749V) transmembrane region (Helical; ), splice variant ((in isoform 2) ) synonymous_variant None transmembrane_region:Transmembrane_region LOW PASS 3571.14 1/1 1/1 0/0
    ACE2 chrX:15582209 rs35803318 ACE2(V749V) transmembrane region (Helical; ), splice variant ((in isoform 2) ) synonymous_variant None transmembrane_region:Transmembrane_region LOW PASS 3571.14 1|0 1|1 0|0
    CD99 chrX:2609543 . CD99(R1R) signal peptide synonymous_variant None PASS 659.16 0/1 0/1 0/0
    CD99 chrX:2609543 . CD99(R1R) signal peptide synonymous_variant None PASS 659.16 1|0 1|0 0|0
    DMD chrX:32716110 rs1800265 DMD(T279T) synonymous_variant ;not_specified|not_provided PASS 1997.15 0/0 0/0 1/1
    DMD chrX:32716110 rs1800265 DMD(T279T) synonymous_variant ;not_specified PASS 1997.15 0|1 0|0 1|1
    CHDC2 chrX:36162664 rs5973599 CHDC2(Intron Variant) intron_variant None PASS 657.15 0/0 0/0 1/1
    CHDC2 chrX:36162664 rs5973599 CHDC2(Intron Variant) intron_variant None PASS 657.15 0|1 0|0 1|1
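
To make the claim above concrete, here is a minimal Python sketch (not PBT's actual code) that tests a trio against two inheritance models: the autosomal one, and the one expected for a son at an X site outside the pseudoautosomal regions. The function names are hypothetical, the genotype order (child, mother, father) follows the post, and the child is assumed to be male. The X model would not apply to pseudoautosomal sites; CD99, for instance, is generally annotated as lying in PAR1, so a heterozygous call there is not necessarily an error.

```python
from itertools import product


def alleles(gt):
    """Split a diploid GT string such as '0/1' or '1|0' into its two alleles."""
    return gt.replace("|", "/").split("/")


def consistent_autosomal(child, mother, father):
    """True if the child could have received one allele from each parent."""
    target = sorted(alleles(child))
    return any(
        sorted([m, f]) == target
        for m, f in product(alleles(mother), alleles(father))
    )


def consistent_x_male(child, mother):
    """True if a son's non-PAR X call is homozygous (i.e. hemizygous)
    for an allele that the mother carries. The father is irrelevant here."""
    child_alleles = set(alleles(child))
    return len(child_alleles) == 1 and child_alleles <= set(alleles(mother))


# Trios copied from the examples above: (label, child, mother, father).
trios = [
    ("LUZP4 before PBT", "1/1", "0/1", "0/0"),
    ("LUZP4 after PBT", "1|0", "1|0", "0|0"),
    ("SOX3 before PBT", "0/0", "0/0", "1/1"),
    ("SOX3 after PBT", "0|1", "0|0", "1|1"),
]
for label, child, mother, father in trios:
    print(label,
          "autosomal:", consistent_autosomal(child, mother, father),
          "X male:", consistent_x_male(child, mother))
```

Under these assumptions, the original HaplotypeCaller trios fit the X model but violate the autosomal one, while the PBT outputs do the opposite. That pattern is what one would expect if PBT applied autosomal rules to these sites.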

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    FYI, the authors of PhaseByTransmission (who are not part of our team) have written a new version that improves how the X chromosome is handled. We're working with them to get this new version integrated into the official GATK release.

  • andrewo (Member)

    Hi, just following up on this thread. Has this been integrated into any of the official GATK releases? I recently noticed the same issue: a homozygous genotype was converted to a 1|0 genotype, even though the allelic depths were 44,0 (ref, alt).

    Before phasing:

    0/0:44,0:44:21:0,21,315 0/1:46,56:102:99:1727,0,1297    1/1:0,40:40:99:1396,120,0
    

    After phasing:

    1|0:44,0:44:0:0,21,315:19   1|0:46,56:102:99:1727,0,1297:19 1|1:0,40:40:99:1396,120,0:19
    
  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @andrewo
    Hi,

    Are these from sex chromosomes? It is indeed possible for PhaseByTransmission to convert genotypes after using the PLs. Notice the genotype that was converted has a low GQ.

    -Sheila
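
One way to find such conversions is to compare the sample columns before and after phasing while ignoring the phase separator. The sketch below does that for the records quoted in this thread. It assumes the FORMAT order is GT:AD:DP:GQ:PL (the header was not posted), that samples are listed as dad, mom, son, and that the extra trailing value in the phased output is a PBT annotation that can be ignored. The helper names are hypothetical.

```python
def parse_sample(field, fmt="GT:AD:DP:GQ:PL"):
    """Map FORMAT keys to values for one sample column.
    zip() stops at the shorter input, so extra trailing values are dropped."""
    return dict(zip(fmt.split(":"), field.split(":")))


def unphased(gt):
    """Order-independent view of a genotype: '1|0' and '0/1' compare equal."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def genotype_changes(before, after, names):
    """Return (name, old GT, new GT, original GQ) for every sample whose
    alleles changed, not merely whose phasing changed."""
    changed = []
    for name, old_field, new_field in zip(names, before, after):
        old, new = parse_sample(old_field), parse_sample(new_field)
        if unphased(old["GT"]) != unphased(new["GT"]):
            changed.append((name, old["GT"], new["GT"], int(old["GQ"])))
    return changed


before = ("0/0:44,0:44:21:0,21,315 0/1:46,56:102:99:1727,0,1297 "
          "1/1:0,40:40:99:1396,120,0").split()
after = ("1|0:44,0:44:0:0,21,315:19 1|0:46,56:102:99:1727,0,1297:19 "
         "1|1:0,40:40:99:1396,120,0:19").split()

print(genotype_changes(before, after, ["dad", "mom", "son"]))
# [('dad', '0/0', '1|0', 21)]
```

Only the first sample's alleles changed, and its original GQ of 21 is much lower than the 99 reported for the other two samples.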

  • andrewo (Member)

    @Sheila @Geraldine_VdAuwera

    Yes, sorry I didn't make that clear, this variant is on the X chromosome. This is a trio: dad, mom, son. I think it converts the dad's genotype (first listed) from homozygous reference to heterozygous because the son is homozygous for the alternate allele, so it thinks that both parents must have at least one alternate allele. This is true for autosomes but not for X. It would be nice if PBT didn't convert homozygous to heterozygous on the X chromosome unless there was good evidence for this (e.g. high alternate allele count and individual is female). It sounds like there is a newer version of PBT that has better handling of sex chromosomes based on Geraldine's comment from Sep 2015. I was just wondering if that version was integrated into GATK at some point or if GATK still uses the older version of PBT that doesn't treat sex chromosomes differently from autosomes.
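
The safeguard suggested above could be sketched as a post-hoc check like the one below. This is purely illustrative and is not part of PBT or GATK: the function name and both thresholds are hypothetical, and it assumes a biallelic, non-pseudoautosomal X site with AD given as (ref, alt) read counts.

```python
def allow_hom_to_het(sex, ad, min_minor_reads=5, min_minor_fraction=0.2):
    """Hypothetical guard for non-PAR X sites.

    Accept a homozygous -> heterozygous change only when the individual is
    female and the less-supported allele still has real read support.
    Both thresholds are illustrative defaults, not recommended values.
    """
    depth = sum(ad)
    if sex != "female" or depth == 0:
        return False
    minor_reads = min(ad)
    return (minor_reads >= min_minor_reads
            and minor_reads / depth >= min_minor_fraction)


# Dad from the example above: male, AD = 44,0, so the 0/0 -> 1|0 change is rejected.
print(allow_hom_to_het("male", (44, 0)))     # False
# A female with AD = 46,56 would pass the same check.
print(allow_hom_to_het("female", (46, 56)))  # True
```

A check like this would be applied only to calls that the phasing step changed, leaving all other genotypes untouched.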

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @andrewo
    Hi,

    No, it looks like PhaseByTransmission does not treat the sex chromosomes differently. I suspect the new version was never integrated into GATK.

    -Sheila

    P.S. You may be interested in this thread and this thread.
