CalculateGenotypePosteriors and pedigree file

I am trying to run the CalculateGenotypePosteriors tool using only family priors. I understand that the pedigree file should be in PLINK format. I have two large half-sib families with RAD-Seq data, so my pedigree file (for just the first family) looks like this:
1 1500 0 0 0 0
1 1591 0 0 0 0
1 30029 1500 1591 0 0
1 30030 1500 1591 0 0
1 30031 1500 1591 0 0
and so on. However, when I use that pedigree file, I just get an output .vcf without the PP field etc.

Having read more on the forum, it is still not clear to me whether the pedigree file for this tool can contain more than a trio. I have tried giving multiple ped files (each with one trio) to the same run of the analysis, e.g.
Ped1.ped
1 1500 0 0 0 0
1 1591 0 0 0 0
1 30029 1500 1591 0 0
and Ped2.ped
1 1500 0 0 0 0
1 1591 0 0 0 0
1 30030 1500 1591 0 0
and again I got an output .vcf without the PP field etc.

I also tried coding the pedigree in one file like this:
1 1500 0 0 0 0
1 1591 0 0 0 0
1 30029 1500 1591 0 0
2 1500 0 0 0 0
2 1591 0 0 0 0
2 30030 1500 1591 0 0
etc., where I got the error “Inconsistent values detected for 1500 for field Family_ID value1 1 value2 2”.

I also tried with just one trio in the pedigree, which worked. But that analysis takes ~2 hours, so I cannot do this for all ~900 progeny.

Can you please tell me what I am doing wrong, or how to speed up the analysis? Which of the above options should work?

Thanks

Answers

  • IzS (UK, Member)

    Sorry, I meant full-sib, not half-sib.

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @IzS
    Hi,

    Yes, unfortunately only trios are accepted. To speed things up, you may want to look into using Queue: https://www.broadinstitute.org/gatk/guide/article?id=1306

    -Sheila
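
Since a pedigree containing a single trio did work for the original poster, one workaround is to write one PED file per child and run the tool once per file, in parallel if resources allow. The Python sketch below shows one way to do the split. It is not part of GATK: the function name `split_into_trios` is made up for illustration, and it assumes a whitespace-separated six-column PED file in which both parents of every child appear in the same file.

```python
def split_into_trios(ped_text):
    """Map each child ID to the text of a three-line PED file (father, mother, child).

    Assumes six whitespace-separated columns per row:
    family, individual, father, mother, sex, phenotype.
    """
    rows = [line.split() for line in ped_text.splitlines() if line.strip()]
    for row in rows:
        if len(row) != 6:
            raise ValueError(f"expected 6 columns, got {len(row)}: {row}")
    by_id = {row[1]: row for row in rows}

    trios = {}
    for row in rows:
        child, father, mother = row[1], row[2], row[3]
        if father == "0" or mother == "0":
            continue  # founder (or single-parent) row, not the child of a trio
        if father not in by_id or mother not in by_id:
            raise ValueError(f"parents of {child} are not listed in the file")
        trio_rows = (by_id[father], by_id[mother], row)
        trios[child] = "\n".join(" ".join(r) for r in trio_rows) + "\n"
    return trios


# The first family from the question above.
example = """\
1 1500 0 0 0 0
1 1591 0 0 0 0
1 30029 1500 1591 0 0
1 30030 1500 1591 0 0
1 30031 1500 1591 0 0
"""

trios = split_into_trios(example)
for child_id, text in trios.items():
    print(f"--- trio for child {child_id} ---")
    print(text, end="")
```

Each value in `trios` can then be written to its own file (for example `trio_30029.ped`, a hypothetical name) and passed to a separate run.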

  • GuyReeves (MPI, Member)

    @Sheila
    I have exactly the same problem as IzS.

    Is it correct to say that the ped file can only have one trio in it?

    Shouldn't it be possible to have a single ped file where trios are broken up into multiple families?

    1 S340 0 0 1 0
    1 S329 0 0 2 0
    1 161 S340 S329 0 0
    This ped file works

    1 S340 0 0 1 0
    1 S329 0 0 2 0
    1 161 S340 S329 0 0
    2 S340 0 0 1 0
    2 S329 0 0 2 0
    2 162 S340 S329 0 0
    This ped file gives ##### ERROR MESSAGE: Inconsistent values detected for S340 for field Family_ID value1 1 value2 2
    There are no extra lines or spaces in this file.

    I am using
    -R dm6_iso_all.fasta -T SelectVariants -V mini3.vcf -o violations.vcf -ped PED1.txt -mv --pedigreeValidationType SILENT
    or
    -R dm6_iso_all.fasta -T SelectVariants -V mini3.vcf -o violations.vcf -ped PED1.txt -mv
    with The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b.
    Thanks

    Guy

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @GuyReeves
    Hi Guy,

    Unfortunately, for now, we can only accept trios. I will talk to the team and see what they can do to make things easier.

    -Sheila

    P.S. You should upgrade to the latest version of GATK!

  • steps372 (South Africa, Member)

    Is there any update on whether you are working on accepting more than one trio at a time when calculating genotype posteriors? This would make my life much easier.

    Issue · Github (by Sheila): Issue Number 1212, State: closed, Closed By: chandrans
  • steps372 (South Africa, Member)

    Also, in a pedigree with two parents and multiple progeny, will successive runs of CalculateGenotypePosteriors (to determine the GPs of each progeny, one by one) alter the GPs for the parents?

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @steps372
    Hi,

    The tool can accept a pedigree file with multiple trios. What it doesn't accept is a pedigree group of more than three people per family unit.

    The tool can alter the parents' genotypes, but this should happen at low-quality sites.

    -Sheila
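
The two constraints that come up in this thread (no family unit larger than a trio, and no individual listed under more than one family ID) can be checked before starting a long run. The Python sketch below is a rough pre-flight check based only on the error message and the description above. It is not GATK's own validation logic, and the function name `check_ped` and the wording of the messages are invented for illustration.

```python
from collections import defaultdict


def check_ped(ped_text):
    """Return a list of human-readable problems found in a PED listing.

    Only the first two columns (family ID, individual ID) are inspected.
    """
    problems = []
    family_of = {}               # individual ID -> first family ID seen
    members = defaultdict(set)   # family ID -> set of individual IDs

    for line in ped_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        family, individual = fields[0], fields[1]
        first = family_of.setdefault(individual, family)
        if first != family:
            problems.append(f"{individual} appears in families {first} and {family}")
        members[family].add(individual)

    for family, ids in members.items():
        if len(ids) > 3:
            problems.append(f"family {family} has {len(ids)} members (more than a trio)")
    return problems


# The two-family file from GuyReeves' post, which triggered the Family_ID error.
two_families = """\
1 S340 0 0 1 0
1 S329 0 0 2 0
1 161 S340 S329 0 0
2 S340 0 0 1 0
2 S329 0 0 2 0
2 162 S340 S329 0 0
"""

for problem in check_ped(two_families):
    print(problem)
```

An empty list means neither of these two conditions was found; it does not guarantee that the tool will accept the file.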
