Dear Sheila and Geraldine,

Could you also please advise on how to interpret the data in the VCF file for the NORMAL and TUMOR samples for the variant at this position (called by MuTect2): chr21:46689620, especially the FS, SOR, and AD fields?

chr21 46689620 . G A . PASS BaseCounts=7,0,27,0;ECNT=1;FS=0.000;HCNT=1;HRun=4;MAX_ED=.;MIN_ED=.;MQ=33.87;NLOD=6.16;SOR=2.636;TLOD=10.90;VariantType=SNP GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/1:20,6:0.240:5:1:0.833:408,145:8:12 0/0:21,0:0.00:0:0:.:435,0:11:10
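To show how I am reading the record above, here is a minimal Python sketch that splits the sample columns by the FORMAT keys. It assumes the whitespace-separated form pasted here (a real VCF is tab-separated) and that the first sample column is TUMOR and the second is NORMAL, which is how the AD values line up with my images. The variable names are my own.

```python
# The record exactly as pasted above (whitespace-separated here).
record = (
    "chr21 46689620 . G A . PASS "
    "BaseCounts=7,0,27,0;ECNT=1;FS=0.000;HCNT=1;HRun=4;MAX_ED=.;MIN_ED=.;"
    "MQ=33.87;NLOD=6.16;SOR=2.636;TLOD=10.90;VariantType=SNP "
    "GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 "
    "0/1:20,6:0.240:5:1:0.833:408,145:8:12 "
    "0/0:21,0:0.00:0:0:.:435,0:11:10"
)

fields = record.split()
info = dict(kv.split("=", 1) for kv in fields[7].split(";"))
format_keys = fields[8].split(":")

def parse_sample(column):
    """Pair each FORMAT key with the matching value of one sample column."""
    return dict(zip(format_keys, column.split(":")))

# Assumption: sample order is TUMOR, then NORMAL.
tumor = parse_sample(fields[9])
normal = parse_sample(fields[10])

tumor_ref, tumor_alt = (int(x) for x in tumor["AD"].split(","))
# Raw alt fraction from AD; this need not equal the reported AF exactly.
tumor_alt_fraction = tumor_alt / (tumor_ref + tumor_alt)

print("TUMOR  AD:", tumor["AD"], "AF:", tumor["AF"],
      "AD-based fraction: %.3f" % tumor_alt_fraction)
print("NORMAL AD:", normal["AD"], "AF:", normal["AF"])
print("FS:", info["FS"], "SOR:", info["SOR"])
```

Read this way, the TUMOR column is the one with AD=20,6 and the NORMAL column the one with AD=21,0.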

I have also attached the images from samtools tview for NORMAL and TUMOR; I describe my questions and calculations below ;)


    Now, about the counts for the G to A substitution :smile:

    In NORMAL (1st image): AD=21,0, which looks correct.
    In TUMOR (2nd image): AD=20,6, although by visual count it should be 30,7.

    Why does MuTect2 report 20,6 instead of 30,7? Another question: the A nucleotide at position chr21:46689620 appears to be highly strand-biased, yet FS is 0. Why? Also, the SOR value is 2.636, which does not match what I get from the formula described at :smile:

    Am I misunderstanding the fields, or is anything else missing? Thanks again!

    Regarding AD and DP, I understand that AD includes the unfiltered counts and DP the filtered counts.
    I am just surprised that FS is 0 despite an obvious strand bias. Also, how is SOR computed to be 2.6? Thanks!
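To make the FS/SOR question concrete, here is a small Python sketch of how I understand the two annotations. It rests on assumptions that may be wrong: FS as the Phred-scaled two-sided Fisher's exact test p-value on the 2x2 strand table, and SOR as the log of the symmetric odds ratio with a pseudocount of 1 added to each cell, adjusted by the ref and alt strand ratios. The strand counts in the example are hypothetical, since the record does not report per-strand counts (as far as I understand, F1R2/F2R1 describe read-pair orientation, not strand). The function names are my own.

```python
import math

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_observed * (1 + 1e-7))
    return min(1.0, p)

def fs(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Assumed FS: -10 * log10(two-sided Fisher p-value)."""
    p = fisher_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
    return max(0.0, -10.0 * math.log10(p))

def sor(ref_fwd, ref_rev, alt_fwd, alt_rev, pseudocount=1.0):
    """Assumed SOR: ln(R + 1/R) + ln(ref_ratio) - ln(alt_ratio)."""
    t00, t01, t10, t11 = (x + pseudocount
                          for x in (ref_fwd, ref_rev, alt_fwd, alt_rev))
    symmetric = (t00 / t01) * (t11 / t10) + (t01 / t00) * (t10 / t11)
    ref_ratio = min(t00, t01) / max(t00, t01)
    alt_ratio = min(t10, t11) / max(t10, t11)
    return math.log(symmetric) + math.log(ref_ratio) - math.log(alt_ratio)

# Hypothetical strand counts: ref balanced, alt on one strand only.
example = (10, 10, 6, 0)
print("FS  = %.3f" % fs(*example))
print("SOR = %.3f" % sor(*example))
```

With alt reads on one strand only, this sketch gives a clearly non-zero FS, which is why the reported FS=0.000 confuses me. Knowing the actual strand table MuTect2 used at this site would tell me whether my formulas or my counts are off.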

