PhaseByTransmission with more than just trio

mlinderm Posts: 22 Member
edited December 2012 in Ask the GATK team

Is it possible to use PhaseByTransmission with families that are larger than a single trio? I have a family with four siblings. If I include all of the siblings in the PED I get:

PhaseByTransmission - Caution: Family BMD has 6 members; At the moment Phase By Transmission only supports trios and parent/child pairs. Family skipped.
ERROR MESSAGE: Bad input: No PED file passed or no trios found in PED file. Aborted.

And if I just include the one key trio with the proband, I get the following:

ERROR MESSAGE: Sample BMD006_R found in data sources but not in pedigree files with STRICT pedigree validation

There does not seem to be an accessible argument for relaxing the pedigree validation. Is there a way to use PhaseByTransmission with my larger family?

Best Answer

  • Laurent Posts: 35 Member, GSA Collaborator
    edited December 2012 Answer ✓

    Hi Mlinderm,

    Currently PhaseByTransmission only supports trios, so you won't be able to use the information about all 4 siblings jointly.

    Including just the trio of interest is the correct way to go for the moment. However, if you leave the other siblings in the VCF file, you should either:

    • Add the children in the PED file but code them as unrelated individuals (they will simply be ignored)
    • Specify the flag --pedigreeValidationType SILENT. This flag lets the GATK run even if not all individuals are found in both the PED and VCF files.
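
For the first option, a minimal Python sketch of how the extra siblings could be recoded is shown here. It assumes the standard six-column PED layout (family ID, individual ID, father ID, mother ID, sex, phenotype) and it assumes that "unrelated" means giving each extra sibling its own family ID and setting both parent IDs to 0. The function name and the sample IDs are hypothetical and only serve as illustration.

```python
# Sketch: keep one trio intact and recode every other sample as unrelated.
# PED columns: family ID, individual ID, father ID, mother ID, sex, phenotype.

def keep_one_trio(ped_lines, proband_id):
    """Return PED lines in which only the proband's trio keeps its pedigree links."""
    rows = [line.split()[:6] for line in ped_lines if line.strip()]
    by_id = {row[1]: row for row in rows}
    proband = by_id[proband_id]
    trio_ids = {proband_id, proband[2], proband[3]}

    out = []
    for fam, ind, father, mother, sex, pheno in rows:
        if ind in trio_ids:
            out.append([fam, ind, father, mother, sex, pheno])
        else:
            # Assumption: "unrelated" = separate family ID and no parents ("0").
            out.append([f"{fam}_{ind}", ind, "0", "0", sex, pheno])
    return ["\t".join(row) for row in out]


# Hypothetical family with two parents and two children.
example = [
    "FAM1 father 0 0 1 1",
    "FAM1 mother 0 0 2 1",
    "FAM1 child1 father mother 1 2",
    "FAM1 child2 father mother 2 1",
]

for line in keep_one_trio(example, "child1"):
    print(line)
```

The original family then contains exactly three members, and the remaining samples are still listed in the PED, so strict validation has nothing to complain about.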

    Cheers, Laurent

Answers

  • trgall Posts: 13 Member

    I would like to use PhaseByTransmission to identify Mendelian errors and phase the children in a family with two parents and four children. Would this be possible by creating 4 "trios", with 4 family IDs, but with the parents the same in each trio?

  • Laurent Posts: 35 Member, GSA Collaborator

    Hi trgall,

    At the moment PhaseByTransmission only takes trios. To identify the Mendelian errors in your case, you will unfortunately need to run it once per child, so four times in total. You cannot pass the same individuals under different family IDs, as this would not respect the PED format. Note that if you use --pedigreeValidationType SILENT, you can leave all children in the VCF file and simply pass a different PED file for each child.
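
The once-per-child approach can be scripted. The sketch below is an illustration, not an official GATK utility: it splits a family PED into one trio per child and builds a matching command line for each run. The helper names, file names and sample IDs are hypothetical. The -T, -R, -V, -ped and -o arguments follow the usual GATK command-line conventions of that era and should be checked against the installed version.

```python
# Sketch: build one trio PED per child, plus a PhaseByTransmission command for each.
# PED columns: family ID, individual ID, father ID, mother ID, sex, phenotype.

def split_into_trios(ped_lines):
    """Return {child_id: [PED lines for father, mother, child]}."""
    rows = [line.split()[:6] for line in ped_lines if line.strip()]
    by_id = {row[1]: row for row in rows}
    trios = {}
    for row in rows:
        child, father, mother = row[1], row[2], row[3]
        # A sample counts as a child only if both parents are listed in the PED.
        if father in by_id and mother in by_id:
            trios[child] = ["\t".join(by_id[i]) for i in (father, mother, child)]
    return trios


def phase_command(vcf, ped_path, out_vcf,
                  reference="reference.fasta", jar="GenomeAnalysisTK.jar"):
    """Build the argument list for one run (all paths are placeholders)."""
    return [
        "java", "-jar", jar,
        "-T", "PhaseByTransmission",
        "-R", reference,
        "-V", vcf,
        "-ped", ped_path,
        "-o", out_vcf,
        # SILENT lets the run proceed although the VCF holds samples missing from the PED.
        "--pedigreeValidationType", "SILENT",
    ]


# Hypothetical family with two parents and two children.
family = [
    "FAM1 father 0 0 1 1",
    "FAM1 mother 0 0 2 1",
    "FAM1 child1 father mother 1 2",
    "FAM1 child2 father mother 2 1",
]

for child, trio_lines in split_into_trios(family).items():
    ped_path = f"{child}.ped"  # write trio_lines to this file before running
    print(" ".join(phase_command("family.vcf", ped_path, f"{child}.phased.vcf")))
```

Each trio's lines would be written to its own PED file, and each printed command run separately against the same multi-sample VCF.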

    Cheers, Laurent

  • mjcgenetics Posts: 1 Member

    Maybe this is more of a feature request than a question, but it would be nice if this tool could intelligently break a pedigree into trios, phase each trio, and then package the results back into the original VCF for us. That would be much better than requiring the user to split the VCF, split the PED, run the tool once for each trio, and then stitch the results back together themselves. For example, if we give it two parents and three children, it would phase each child and put everything back into the original multisample VCF format that GATK itself produces during genotyping.
