Reference genotype quality in presence of conflicting reads
Given the following line from a VCF file:
Supercontig_1.1 308 . T A . PASS AC=0;AF=0;AN=2;BaseQRankSum=-1.932;ClippingRankSum=0;DP=99;ExcessHet=3.01;MQ=43.26;MQRankSum=-2.382;ReadPosRankSum=1.77;VariantType=SNP GT:AD:DP:RGQ 0/0:52,4:56:1
I note that although 52 reads passing filters support the reference allele, the reference genotype quality (RGQ) is only 1. Is RGQ affected by the presence of reads indicating a possible variant (4 in this case)? If so, does the low RGQ here reflect uncertainty over whether this position really is a reference call (T/T) or might instead be a variant (A/T or A/A)?
If I were being very strict about including only highly certain positions in my analysis, would you recommend assigning this position a missing genotype, since I can't really be sure what it is?
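To make the idea concrete, here is a rough sketch of the kind of masking I have in mind. It parses the record above and reports the genotype as missing (./.) when RGQ falls below a cutoff. The function name mask_low_rgq and the cutoff of 20 are my own placeholders, not GATK recommendations, and splitting on whitespace is an assumption that only works because no field in this record contains a space (real VCF files are tab-separated).

```python
# Hypothetical cutoff chosen for illustration; not a GATK-recommended value.
MIN_RGQ = 20


def mask_low_rgq(vcf_line, min_rgq=MIN_RGQ):
    """Return the sample genotype, or './.' if its RGQ is below min_rgq."""
    # Assumption: whitespace splitting is safe because no field has spaces.
    fields = vcf_line.split()
    format_keys = fields[8].split(":")    # e.g. GT, AD, DP, RGQ
    sample_values = fields[9].split(":")  # e.g. 0/0, 52,4, 56, 1
    sample = dict(zip(format_keys, sample_values))

    rgq = sample.get("RGQ")
    # Only mask when RGQ is present, not '.', and below the cutoff.
    if rgq is not None and rgq != "." and int(rgq) < min_rgq:
        return "./."
    return sample["GT"]


line = (
    "Supercontig_1.1 308 . T A . PASS "
    "AC=0;AF=0;AN=2;BaseQRankSum=-1.932;ClippingRankSum=0;DP=99;"
    "ExcessHet=3.01;MQ=43.26;MQRankSum=-2.382;ReadPosRankSum=1.77;"
    "VariantType=SNP GT:AD:DP:RGQ 0/0:52,4:56:1"
)

print(mask_low_rgq(line))  # ./.
```

With the cutoff of 20, this site (RGQ=1) would be treated as missing rather than as a confident T/T call.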