What does the AF value stand for in a combined VCF file?

Hello all,
I used HaplotypeCaller to generate GVCF files for a trio, combined the family's GVCF files with CombineGVCFs, and then ran GenotypeGVCFs to genotype the combined GVCF into a VCF. In the final VCF file there is only one AF value per record, but three columns of 'GT:AD:DP:GQ:JL:JP:PL:PP' values.
For example:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 63641 63652 63663
chr1 10492 . C T 22.60 PASS AC=2;AF=0.333;AN=6;BaseQRankSum=-1.834e+00;ClippingRankSum=0.00;DP=29;ExcessHet=3.9794;FS=0.000;MLEAC=2;MLEAF=0.500;MQ=36.46;MQRankSum=-4.310e-01;PG=0,0,0;QD=3.77;RAW_MQ=26585.00;ReadPosRankSum=0.524;SOR=1.179 GT:AD:DP:GQ:JL:JP:PL:PP 0/1:4,2:6:49:-1:-1:48,0,128:49,0,132 0/0:0,0:0:0:.:.:0,0,0:0,0,0 0/1:9,0:9:2:-1:-1:0,0,259:2,0,261
chr1 10616 . CCGCCGTTGCAAAGGCGCGCCG C 145.13 PASS AC=2;AF=0.333;AN=6;DP=7;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=43.08;PG=0,0,0;QD=25.36;RAW_MQ=12990.00;SOR=3.258 GT:AD:DP:GQ:PL:PP 0/0:0,0:0:0:0,0,0:0,0,0 0/0:1,0:1:0:0,0,0:0,0,0 1/1:0,4:4:12:180,12,0:180,12,0
chr1 12807 . C T 631.92 PASS AC=3;AF=0.500;AN=6;BaseQRankSum=1.63;ClippingRankSum=0.00;DP=66;ExcessHet=6.9897;FS=0.000;MLEAC=3;MLEAF=0.500;MQ=19.28;MQRankSum=-3.598e+00;PG=0,0,0;QD=9.57;RAW_MQ=24540.00;ReadPosRankSum=1.47;SOR=0.324 GT:AD:DP:GQ:JL:JP:PL:PP 0/1:10,9:19:99:127:127:232,0,363:232,0,363 0/1:27,10:37:99:127:127:217,0,998:217,0,998 0/1:4,6:10:99:127:127:213,0,128:213,0,128

My question is: where does the AF value come from? Is it calculated by HaplotypeCaller and simply carried over from one of the samples, or is it recalculated in CombineGVCFs or GenotypeGVCFs?

I am trying to call de novo mutations following the 'Genotype Refinement workflow for germline short variants', so I want to know the exact meaning of "AF<0.1%": is that the allele frequency in the child or in the family?

Answers

  • AdelaideR (Member, Broadie, Moderator)

    Thanks @skyWarrior for pitching in.

    @JinboWuGlasgow, it would be helpful to see the exact commands that you ran, with the parameter settings.

  • Thanks for your reply. I thought AF was the number of mapped reads at that locus; it seems I was mistaken.
    But I still have a question. In my variant file, AF is only shown to three decimal places. Is that automatic, or can the number of digits be set? I am trying to run the 'Genotype Refinement workflow for germline short variants' and need to pick out variants with AF<0.1%, but I found no variants with such an AF value; only a few variants show AF=0.00 (is that because the true value is so low that it is displayed as 0.00?). Can the precision of AF be increased?
  • bhanuGandham (Member, Administrator, Broadie, Moderator)

    Hi @JinboWuGlasgow

    It is difficult to tell why this is without looking at the data. Would you please post example records of the variants? Thank you.

  • manba (Member)

    AF seems to be wrong in versions before GATK 4.0.10.0.

  • manba (Member)

    @JinboWuGlasgow said:
    Thanks for your reply. I thought AF was the number of mapped reads at that locus; it seems I was mistaken.

    But I still have a question. In my variant file, AF is only shown to three decimal places. Is that automatic, or can the number of digits be set? I am trying to run the 'Genotype Refinement workflow for germline short variants' and need to pick out variants with AF<0.1%, but I found no variants with such an AF value; only a few variants show AF=0.00 (is that because the true value is so low that it is displayed as 0.00?). Can the precision of AF be increased?

    There are a lot of uninformative reads, so the AF and AD values are not what you would expect from the reads you see.

  • > @bhanuGandham said:
    > Hi @JinboWuGlasgow
    >
    > It is difficult to tell why this is without looking at the data. Would you please post example records of the variants. Thank you.

    chr1 14451931 rs6701731 C *,T 5292.16 PASS AC=1,1;AF=0.167,0.167;AN=6;BaseQRankSum=0.067;ClippingRankSum=0.00;DB;DP=414;ExcessHet=3.9794;FS=9.296;MLEAC=1,1;MLEAF=0.167,0.167;MQ=43.99;MQRankSum=0.00;PG=0,0,0,0,0,0;QD=19.04;RAW_MQ=568800.00;ReadPosRankSum=1.09;SOR=1.087 GT:AD:DP:GQ:PGT:PID:PL:PP 0/1:65,65,0:130:99:0|1:14451910_T_C:2556,0,7021,2755,7220,9975:2556,0,7021,2755,7220,9975 0/2:74,0,74:148:99:.:.:2768,2994,6139,0,3145,2922:2768,2994,6139,0,3145,2922 0/0:120,0,0:120:99:.:.:0,120,1800,120,1800,1800:0,120,1800,120,1800,1800
    chr1 14722163 rs6673796 G *,A 5501.16 PASS AC=3,1;AF=0.500,0.167;AN=6;BaseQRankSum=-8.530e-01;ClippingRankSum=0.00;DB;DP=341;ExcessHet=3.9794;FS=0.426;MLEAC=3,1;MLEAF=0.500,0.167;MQ=36.18;MQRankSum=0.00;PG=0,0,0,0,0,0;QD=18.52;RAW_MQ=446400.00;ReadPosRankSum=0.664;SOR=0.663 GT:AD:DP:GQ:PL:PP 1/1:1,103,0:104:99:4141,281,0,4144,313,4176:4141,281,0,4144,313,4176 0/1:57,29,0:86:99:501,0,2141,672,2229,2900:501,0,2141,672,2229,2900 0/2:64,0,43:107:99:891,1087,3728,0,2642,2513:891,1087,3728,0,2642,2513
    chr1 14736256 rs4661301 T *,G 8360.16 PASS AC=3,1;AF=0.500,0.167;AN=6;BaseQRankSum=0.739;ClippingRankSum=0.00;DB;DP=387;ExcessHet=3.9794;FS=0.000;MLEAC=3,1;MLEAF=0.500,0.167;MQ=32.85;MQRankSum=0.00;PG=0,0,0,0,0,0;QD=25.18;RAW_MQ=417600.00;ReadPosRankSum=1.62;SOR=0.719 GT:AD:DP:GQ:PL:PP 1/1:1,135,0:136:99:5738,381,0,5741,407,5766:5738,381,0,5741,407,5766 0/1:69,59,0:128:99:2124,0,2487,2334,2665,4999:2124,0,2487,2334,2665,4999 0/2:39,0,29:68:99:530,652,2151,0,1499,1412:530,652,2151,0,1499,1412
    chr1 14791744 rs1721827 A C,G 16850.90 PASS AC=4,2;AF=0.667,0.333;AN=6;DB;DP=497;ExcessHet=3.0103;FS=0.000;MLEAC=4,2;MLEAF=0.667,0.333;MQ=35.81;PG=0,0,0,0,0,0;QD=34.46;RAW_MQ=637200.00;SOR=0.865 GT:AD:DP:GQ:PL:PP 1/2:0,65,59:124:99:4100,1862,1667,2235,0,2057:4100,1862,1667,2235,0,2057 1/1:0,190,0:190:99:6908,570,0,6908,570,6908:6908,570,0,6908,570,6908 1/2:0,86,89:175:99:5869,2926,2668,2941,0,2672:5869,2926,2668,2941,0,2672
    chr1 15050020 rs10927662 G A,* 7015.92 PASS AC=1,2;AF=0.167,0.333;AN=6;BaseQRankSum=0.245;ClippingRankSum=0.00;DB;DP=427;ExcessHet=6.9897;FS=0.806;MLEAC=1,2;MLEAF=0.167,0.333;MQ=32.42;MQRankSum=4.43;PG=0,0,0,0,0,0;QD=19.93;RAW_MQ=448801.00;ReadPosRankSum=0.525;SOR=0.755 GT:AD:DP:GQ:PGT:PID:PL:PP 0/2:63,0,48:111:99:0|1:15050019_AGT_A:1809,2000,4629,0,2629,2484:1809,2000,4629,0,2629,2484 0/1:43,48,0:91:99:0|1:15050020_G_A:1888,0,1653,2018,1800,3817:1888,0,1653,2018,1800,3817 0/2:66,0,84:150:99:0|1:15050019_AGT_A:3349,3554,6365,0,2811,2552:3349,3554,6365,0,2811,2552
    chr1 15050093 rs74870928 C G,* 6888.92 PASS AC=2,1;AF=0.333,0.167;AN=6;BaseQRankSum=0.259;ClippingRankSum=0.00;DB;DP=449;ExcessHet=6.9897;FS=5.251;MLEAC=2,1;MLEAF=0.333,0.167;MQ=32.49;MQRankSum=-4.422e+00;PG=0,0,0,0,0,0;QD=18.37;RAW_MQ=473902.00;ReadPosRankSum=-7.390e-01;SOR=0.491 GT:AD:DP:GQ:PGT:PID:PL:PP 0/1:65,51,0:116:99:0|1:15050019_AGT_A:1902,0,2683,2101,2837,4938:1902,0,2683,2101,2837,4938 0/1:48,43,0:91:99:0|1:15050019_AGT_A:1653,0,1888,1800,2018,3817:1653,0,1888,1800,2018,3817 0/2:82,0,86:168:99:0|1:15050019_AGT_A:3364,3611,7055,0,3444,3185:3364,3611,7055,0,3444,3185
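
In the records posted in this thread, each AF value appears consistent with AC/AN counted over the called genotypes of the three samples (for example AC=2, AN=6 gives AF=0.333). The following is a minimal Python sketch of that arithmetic, not GATK's own code; `allele_counts` is a hypothetical helper name, and diploid `GT` strings are assumed.

```python
from collections import Counter

def allele_counts(genotypes, n_alt):
    """Return (AC per ALT allele, AN, AF per ALT allele) from GT strings."""
    counts = Counter()
    for gt in genotypes:
        # Treat phased (|) and unphased (/) genotypes the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":  # skip no-calls
                counts[int(allele)] += 1
    an = sum(counts.values())                      # all called alleles, REF included
    ac = [counts[i] for i in range(1, n_alt + 1)]  # one count per ALT allele
    af = [c / an if an else float("nan") for c in ac]
    return ac, an, af

# GT fields of the chr1 14791744 record above (ALT = C,G)
ac, an, af = allele_counts(["1/2", "1/1", "1/2"], n_alt=2)
print(ac, an, [round(x, 3) for x in af])  # [4, 2] 6 [0.667, 0.333]
```

Under this reading, a trio with all three genotypes called has AN=6, so the smallest non-zero AF is 1/6 ≈ 0.167. An AF computed within the trio therefore cannot fall below 0.1% unless it is exactly 0, whatever the printed precision.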
  • AdelaideR (Member, Broadie, Moderator)

    Hi @JinboWuGlasgow, Happy New Year!

    Have you taken a look at the JEXL tool to extract the variants that meet your threshold? More information can be found at this link

    I think this tool might be able to extract your de novo mutations that fall below a set frequency.
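
For illustration, the kind of threshold selection described here can also be sketched in plain Python. This is not the JEXL tool itself: `parse_info` and `passes_af` are hypothetical helpers, the record is abbreviated from the first example in the question, and fields are split on whitespace as they appear in this thread (real VCF files are tab-separated). AF is recomputed from AC and AN so that the three-decimal rounding of the printed AF does not affect the comparison.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag keys (e.g. DB) map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out

def passes_af(line, max_af=0.001):
    """True if AC/AN is below max_af for every ALT allele of the record."""
    fields = line.rstrip("\n").split()
    info = parse_info(fields[7])  # INFO is the 8th VCF column
    an = int(info["AN"])
    if an == 0:                   # no called alleles, nothing to compare
        return False
    return all(int(ac) / an < max_af for ac in info["AC"].split(","))

# Abbreviated version of the chr1 10492 record from the question
record = "chr1 10492 . C T 22.60 PASS AC=2;AF=0.333;AN=6 GT 0/1 0/0 0/1"
print(passes_af(record))              # False: 2/6 is not below 0.001
print(passes_af(record, max_af=0.5))  # True: 2/6 is below 0.5
```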
