GT, PL and GQ inconsistent

hangd, Baylor College of Medicine (Member)

According to the GATK documentation, the PL value of the most likely genotype (the one assigned in the GT field) should be 0. However, in a VCF file called by GATK, I found that this is not always the case. Below are a few examples:

GT:AD:DP:GQ:PL
0/1:0,232:444:99:0,382,4800
1/1:1,0:1:2:1,0,0
1/1:1,0:1:99:0,3,35
1/1:1,0:1:87:0,292,247

Also, GQ should equal the second-smallest PL, capped at 99 when that PL is greater than 99. However, I also found cases like the following, where the GQ is 1.76:

GT:AD:DP:GQ:PL
0/1:9,14,23:1.76:30,3,0

I am wondering whether these values have a specific meaning, or whether they are due to calling errors or low-quality calls. Thank you very much.
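
The two rules quoted above (the genotype in GT carries the smallest PL, and GQ is the second-smallest PL capped at 99) can be checked mechanically. Below is a minimal Python sketch, not GATK code: the helper names `pl_index` and `check_sample` are made up for illustration, and it assumes called diploid genotypes with PL listed in the standard VCF genotype order (position k(k+1)/2 + j for alleles j ≤ k).

```python
from typing import List

FORMAT = "GT:AD:DP:GQ:PL"


def pl_index(gt: str) -> int:
    """Position in PL of a diploid genotype such as '0/1' or '1|1'."""
    j, k = sorted(int(a) for a in gt.replace("|", "/").split("/"))
    return k * (k + 1) // 2 + j


def check_sample(fmt: str, sample: str) -> List[str]:
    """Return a list of inconsistencies found in one sample column."""
    keys, values = fmt.split(":"), sample.split(":")
    if len(keys) != len(values):
        # Fields cannot be matched to their FORMAT keys, so stop here.
        return [f"{len(values)} values for {len(keys)} FORMAT keys"]
    rec = dict(zip(keys, values))
    pl = [int(x) for x in rec["PL"].split(",")]
    problems = []

    # Rule 1: the genotype in GT should carry the smallest PL (normally 0).
    idx = pl_index(rec["GT"])
    if idx >= len(pl) or pl[idx] != min(pl):
        problems.append(f"GT {rec['GT']} does not have the smallest PL in {pl}")

    # Rule 2: GQ should be the second-smallest PL, capped at 99.
    expected_gq = min(sorted(pl)[1], 99)
    if float(rec["GQ"]) != expected_gq:
        problems.append(f"GQ {rec['GQ']} but expected {expected_gq}")
    return problems


if __name__ == "__main__":
    samples = [
        "0/1:0,232:444:99:0,382,4800",
        "1/1:1,0:1:2:1,0,0",
        "1/1:1,0:1:99:0,3,35",
        "1/1:1,0:1:87:0,292,247",
        "0/1:9,14,23:1.76:30,3,0",
    ]
    for s in samples:
        print(s, check_sample(FORMAT, s) or "consistent")
```

Run on the records above, this sketch flags a GT/PL mismatch in the first, third and fourth records and a GQ mismatch in the second, third and fourth; the last record is reported as having four values for five FORMAT keys.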

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @hangd
    Hi,

    Can you tell us the exact command you ran to generate the VCF and the version of GATK you are using?

    Thanks,
    Sheila

  • hangd, Baylor College of Medicine (Member)

    Hi, thank you very much. I did not call these variants myself, and I know very little about GATK, so I have attached the meta-information from the VCF files.

    The following calls were made with GATK 2.0. The meta-information is in vcfheader1.txt.
    GT:AD:DP:GQ:PL
    0/1:0,232:444:99:0,382,4800
    1/1:1,0:1:2:1,0,0
    1/1:1,0:1:99:0,3,35
    1/1:1,0:1:87:0,292,247

    For the following call, I am not sure which version of GATK was used. The meta-information is in vcfheader2.txt.
    GT:AD:DP:GQ:PL
    0/1:9,14,23:1.76:30,3,0

    Thank you again.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    These all look like errors. The calling was done with an old tool (UnifiedGenotyper) in a very old version, so I'm not surprised there would be some problematic sites. If you have the opportunity to go back to the read data and re-call all of these, that's what I would recommend.
