GT, PL and GQ inconsistent

hangd, Baylor College of Medicine, Member

According to the GATK documentation, the PL of the most likely genotype (the one assigned in the GT field) is 0. However, in a VCF file produced by GATK, I found that this is not always the case. Below are a few examples:

GT:AD:DP:GQ:PL
0/1:0,232:444:99:0,382,4800
1/1:1,0:1:2:1,0,0
1/1:1,0:1:99:0,3,35
1/1:1,0:1:87:0,292,247

Also, GQ should equal the second-smallest PL, capped at 99 when that PL is greater than 99. However, I also found cases like the following, where the GQ is 1.76:

GT:AD:DP:GQ:PL
0/1:9,14,23:1.76:30,3,0

I am wondering whether these values have a specific meaning, or whether they result from a calling error or from low call quality. Thank you very much.

Answers

  • Sheila, Broad Institute, Member, Broadie admin

    @hangd
    Hi,

    Can you tell us the exact command you ran to generate the VCF and the version of GATK you are using?

    Thanks,
    Sheila

  • hangd, Baylor College of Medicine, Member

    Hi, thank you very much. I did not call these variants myself and I know very little about GATK, so I have attached the meta-information from the VCF files.

    The following calls were made with GATK 2.0. The meta-information is in vcfheader1.txt.
    GT:AD:DP:GQ:PL
    0/1:0,232:444:99:0,382,4800
    1/1:1,0:1:2:1,0,0
    1/1:1,0:1:99:0,3,35
    1/1:1,0:1:87:0,292,247

    For the following call, I am not sure which version of GATK was used. The meta-information is in vcfheader2.txt.
    GT:AD:DP:GQ:PL
    0/1:9,14,23:1.76:30,3,0

    Thank you again.

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie admin

    These all look like errors. The calling was done with an old tool (UnifiedGenotyper) in a very old version, so I'm not surprised that there are some problematic sites. If you have the opportunity to go back to the read data and re-call all of these, that is what I would recommend.
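
For anyone who wants to see how many records in an existing VCF are affected before re-calling, the two rules cited in the question can be checked mechanically. The following is only a rough sketch in plain Python, not a GATK tool: `check_genotype` is a hypothetical helper name, the genotype index formula is the diploid PL ordering from the VCF specification, and the GQ rule is taken as stated in the question (second-smallest PL, capped at 99). The last record in the list is an invented consistent example for comparison; the others are copied from the thread.

```python
FORMAT = "GT:AD:DP:GQ:PL"


def check_genotype(format_str, sample_str):
    """Return a list of problems found in one sample column (hypothetical helper)."""
    keys = format_str.split(":")
    values = sample_str.split(":")
    if len(keys) != len(values):
        return ["field count does not match FORMAT"]
    fields = dict(zip(keys, values))

    gt = fields["GT"].replace("|", "/")
    if "." in gt:
        return ["GT is missing"]

    pl = [int(x) for x in fields["PL"].split(",")]
    # Diploid PL ordering from the VCF spec: genotype j/k (j <= k)
    # sits at index k*(k+1)/2 + j, e.g. 0/0, 0/1, 1/1 for one ALT allele.
    j, k = sorted(int(a) for a in gt.split("/"))
    gt_index = k * (k + 1) // 2 + j

    problems = []
    if gt_index >= len(pl):
        return ["PL has too few entries for this GT"]
    if min(pl) != 0:
        problems.append("no PL equals 0")
    if pl[gt_index] != min(pl):
        problems.append("GT is not the genotype with the smallest PL")

    # Rule as stated in the question: GQ = second-smallest PL, capped at 99.
    expected_gq = min(sorted(pl)[1], 99)
    if float(fields["GQ"]) != expected_gq:
        problems.append(f"GQ is {fields['GQ']} but expected {expected_gq}")
    return problems


records = [
    "0/1:0,232:444:99:0,382,4800",  # from the thread
    "1/1:1,0:1:2:1,0,0",            # from the thread
    "1/1:1,0:1:99:0,3,35",          # from the thread
    "1/1:1,0:1:87:0,292,247",       # from the thread
    "0/1:9,14,23:1.76:30,3,0",      # from the thread (4 fields, FORMAT has 5)
    "0/1:10,12:22:99:255,0,300",    # invented consistent record
]

for record in records:
    issues = check_genotype(FORMAT, record)
    print(record, "->", "; ".join(issues) if issues else "consistent")
```

Note that the sketch only checks GT, GQ and PL against each other; it does not look at AD or DP, and it assumes diploid calls.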
