Reference genotype quality in presence of conflicting reads
Given the following line from a VCF file:
Supercontig_1.1 308 . T A . PASS AC=0;AF=0;AN=2;BaseQRankSum=-1.932;ClippingRankSum=0;DP=99;ExcessHet=3.01;MQ=43.26;MQRankSum=-2.382;ReadPosRankSum=1.77;VariantType=SNP GT:AD:DP:RGQ 0/0:52,4:56:1
I note that despite there being 52 reads supporting the reference allele, the reference genotype quality (RGQ) is only 1. Is RGQ affected by the presence of reads indicating a possible variant (4 in this case)? In other words, does the low RGQ here reflect uncertainty over whether this position really is a reference call (T/T), or whether it might be a variant (A/T or A/A)?
If I were being very strict about including only highly certain positions in my analysis, would you recommend that I assign this position a missing genotype, since I can't really be sure what it is?
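To make the question concrete, here is a minimal Python sketch of the masking I have in mind. The function name `mask_low_rgq` and the cutoff of 20 are my own placeholders, not GATK recommendations, and the sketch assumes a single-sample record with the sample in the tenth column.

```python
def mask_low_rgq(vcf_line, min_rgq=20, sample_col=9):
    """Return the record with GT set to ./. when RGQ is below min_rgq.

    min_rgq=20 is an arbitrary, hypothetical cutoff chosen for illustration.
    """
    # Split on any whitespace so the pasted line above also parses;
    # real VCF files are tab-delimited.
    fields = vcf_line.split()
    keys = fields[8].split(":")             # FORMAT column, e.g. GT:AD:DP:RGQ
    values = fields[sample_col].split(":")  # matching per-sample values
    sample = dict(zip(keys, values))

    rgq = sample.get("RGQ", ".")
    if rgq != "." and int(rgq) < min_rgq:
        sample["GT"] = "./."                # treat the genotype as missing

    fields[sample_col] = ":".join(sample[k] for k in keys if k in sample)
    return "\t".join(fields)


EXAMPLE = (
    "Supercontig_1.1 308 . T A . PASS "
    "AC=0;AF=0;AN=2;BaseQRankSum=-1.932;ClippingRankSum=0;DP=99;"
    "ExcessHet=3.01;MQ=43.26;MQRankSum=-2.382;ReadPosRankSum=1.77;"
    "VariantType=SNP GT:AD:DP:RGQ 0/0:52,4:56:1"
)

if __name__ == "__main__":
    # Print only the sample column of the masked record.
    print(mask_low_rgq(EXAMPLE).split("\t")[9])
```

With the record above, RGQ=1 falls below the placeholder cutoff, so the 0/0 call would be replaced by ./. while AD, DP and RGQ are left untouched.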