VQSR - ./. genotypes retained after VQSR filtering

We generated a set of variant calls based on the GRCh38 pipeline described in GATK, using GATK version 3.3. We observed that many calls made on the ALT contigs had the genotype call "./.". After running VQSR and then filtering, we observed that most of the ./. variant calls in the ALT contigs had been assigned a PASS filter value. For the known sites we used the files provided in the gatk38 bundle, except for 1000G_phase1.snps.high_confidence.hg38.vcf.

chr7_KI270803v1_alt 524239 . C T 234.71 PASS AC=2;AF=1.00;AN=2;DP=6;FS=0.000;GQ_MEAN=12.00;GQ_STDDEV=8.49;MQ=46.55;MQ0=0;NCC=1;NDA=2;QD=30.97;VQSLOD=3.83;culprit=FS GT:AD:DP:GQ:PL 1/1:0,6:6:18:266,18,0
chr7_KI270803v1_alt 775700 . T G 109.42 PASS AC=0;AF=0.00;AN=0;DP=0;FS=0.000;GQ_MEAN=9.00;MQ=39.86;MQ0=0;NCC=2;NDA=2;QD=27.98;VQSLOD=4.17;culprit=FS GT:AD:DP ./.:0,0:0
chr7_KI270803v1_alt 775721 . C A 99.42 PASS AC=0;AF=0.00;AN=0;DP=0;FS=0.000;GQ_MEAN=9.00;MQ=42.75;MQ0=0;NCC=2;NDA=2;QD=33.14;VQSLOD=3.69;culprit=FS GT:AD:DP ./.:0,0:0
chr7_KI270803v1_alt 775753 . A G 101.42 PASS AC=0;AF=0.00;AN=0;DP=0;FS=0.000;GQ_MEAN=9.00;MQ=53.41;MQ0=0;NCC=2;NDA=2;QD=33.81;VQSLOD=4.41;culprit=FS GT:AD:DP ./.:0,0:0
chr7_KI270803v1_alt 775980 . T C 452.9 PASS AC=0;AF=0.00;AN=0;DP=0;FS=0.000;GQ_MEAN=33.00;MQ=57.78;MQ0=0;NCC=2;NDA=2;QD=35.59;VQSLOD=1.47;culprit=FS GT:AD:DP ./.:0,0:0
chr21_GL383580v2_alt 64196 . T C 64.88 PASS AC=0;AF=0.00;AN=0;DP=0;FS=0.000;GQ_MEAN=6.00;MQ=57.03;MQ0=0;NCC=2;NDA=2;QD=32.44;VQSLOD=7.32;culprit=FS GT:AD:DP ./.:0,0:0
chr22_KI270878v1_alt 163711 . G GT 98.17 PASS AC=2;AF=1.00;AN=2;DP=4;FS=0.000;GQ_MEAN=12.00;MQ=60.00;MQ0=0;NCC=2;NDA=2;QD=24.54;VQSLOD=6.84;culprit=FS GT:AD:DP:GQ:PL 1/1:0,4:4:12:124,12,0

Can you please explain why variants with ./. are designated as PASS by VQSR?



  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
    VQSR filters variants at the site level, not the genotype level. You can have a site where we're reasonably confident that there is variation but we aren't able to assign a genotype because the data is inconclusive; e.g. it could equally be a heterozygous or homozygous variant.
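
To make the site-level versus genotype-level distinction concrete, here is a minimal Python sketch that lists records whose FILTER is PASS while the sample genotype is a no-call. It is not part of GATK. The function names are hypothetical, and it assumes whitespace-separated VCF body lines with a single sample column, as in the records posted above (INFO fields are shortened here for readability).

```python
# Sketch only: FILTER (column 7) is one value per site, while GT lives in the
# per-sample column, so a PASS site can still carry a "./." genotype.

def is_no_call(gt):
    """Return True if every allele in a GT string is missing ('./.' or '.|.')."""
    alleles = gt.replace("|", "/").split("/")
    return all(allele == "." for allele in alleles)


def pass_sites_with_no_call(vcf_lines):
    """Yield (chrom, pos, gt) for PASS records whose sample genotype is a no-call.

    Assumption: single-sample VCF, fields separated by whitespace.
    """
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        chrom, pos, site_filter = fields[0], int(fields[1]), fields[6]
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        gt = dict(zip(format_keys, sample_values)).get("GT", "./.")
        if site_filter == "PASS" and is_no_call(gt):
            yield chrom, pos, gt


# Two of the records from the question, with INFO shortened.
records = [
    "chr7_KI270803v1_alt 524239 . C T 234.71 PASS AC=2;AF=1.00;AN=2;DP=6 "
    "GT:AD:DP:GQ:PL 1/1:0,6:6:18:266,18,0",
    "chr7_KI270803v1_alt 775700 . T G 109.42 PASS AC=0;AF=0.00;AN=0;DP=0 "
    "GT:AD:DP ./.:0,0:0",
]

for chrom, pos, gt in pass_sites_with_no_call(records):
    print(chrom, pos, gt)
```

Both records are PASS at the site level, but only the second is reported, because its genotype is ./. while the first is 1/1. If no-call genotypes need to be excluded downstream, that would be a separate genotype-level step after VQSR.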
