How to tell whether a variant is a sequencing error or real?

I want to know whether there are any references that explain how GATK decides whether a variant is an error or real, in both germline calling and Mutect2 calling.


  2904359495 Member

    Can anyone help?

  2904359495 Member

    Thanks a lot. Here I am not talking about the doc; I mean that you once said we can use MBQ and AD to estimate the number of error reads.

    For example, a site like the following:

    chr20 57484460 . TG T . PASS DP=236;ECNT=2;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=-5.988e+01;RPA=2,1;RU=G;STR;TLOD=8.76 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:OBAM:OBAMRC:ORIGINAL_CONTIG_MISMATCH:SA_MAP_AF:SA_POST_PROB 0/1:230,6:0.029:236:230,6:0,0:28,30:171,171:60:51:false:false:0:0.020,0.010,0.025:3.382e-03,5.781e-03,0.991

    So MBQ for ref and alt is 28 and 30, which is about a 0.001 error rate. 236*0.001 = 0.236, much lower than the 6 alt reads, so it passed. You once answered a question like this.
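
    The estimate above can be written out as a minimal sketch. It only reproduces the back-of-the-envelope arithmetic and is not what Mutect2 computes; the function names are hypothetical, and it assumes every read independently errs at the rate implied by the alt MBQ.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def expected_error_reads(depth, base_quality):
    """Expected number of erroneous reads if every read errs at the same rate."""
    return depth * phred_to_error_prob(base_quality)


# Values taken from the chr20:57484460 record quoted above.
depth = 236       # DP
alt_reads = 6     # second value of AD
alt_mbq = 30      # second value of MBQ

error_rate = phred_to_error_prob(alt_mbq)
expected = expected_error_reads(depth, alt_mbq)

print(f"per-base error rate: {error_rate:.4f}")
print(f"expected error reads: {expected:.3f}, observed alt reads: {alt_reads}")
```

    The observed alt count is then compared by eye with the expected number of error reads.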

  davidben Boston Member, Broadie, Dev

    To be clear, the algorithm of Mutect2 is much more sophisticated than this. However, back-of-the-envelope calculations such as the above can still give a rough sense of things. To understand what Mutect2 actually does, one has to read the documentation.

  2904359495 Member
    The doc is really too difficult to understand. If there were some concrete examples, such as a real variant site, to explain how Mutect2 calculates it, it would be much more friendly to us common people.
    Thanks a lot.

    For example, a site like the following:
    chr18 48593530 . TA T . clustered_events;t_lod DP=234;ECNT=5;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=-5.263e+01;RPA=2,1;RU=A;STR;TLOD=3.15 GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:OBAM:OBAMRC:ORIGINAL_CONTIG_MISMATCH:SA_MAP_AF:SA_POST_PROB 0/1:191,3:0.020:194:191,3:0,0:22,26:171,171:60:16:false:false:0:0.020,0.00,0.015:0.013,3.253e-03,0.984

    I know the tumor LOD (default threshold 5.3) is 3.15 here, which gives the t_lod filter, but what I am concerned about is how this site distinguishes a sequencing error from a real somatic variant.
    Here the alt base quality is 26, so the error rate is 0.0025. 194*0.0025 = 0.48, which also seems OK.
    So can you analyse this site using this method?
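
    As a rough sketch, the same numbers can be pulled out of the record itself. This assumes the whitespace-separated line as pasted above (the tabs were lost), uses the 5.3 threshold quoted in this post, and only repeats the simple estimate; it is not the Mutect2 model.

```python
# The chr18:48593530 record quoted above, as pasted (whitespace-separated).
record = (
    "chr18 48593530 . TA T . clustered_events;t_lod "
    "DP=234;ECNT=5;POP_AF=5.000e-08;P_CONTAM=0.00;P_GERMLINE=-5.263e+01;"
    "RPA=2,1;RU=A;STR;TLOD=3.15 "
    "GT:AD:AF:DP:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:OBAM:OBAMRC:"
    "ORIGINAL_CONTIG_MISMATCH:SA_MAP_AF:SA_POST_PROB "
    "0/1:191,3:0.020:194:191,3:0,0:22,26:171,171:60:16:false:false:0:"
    "0.020,0.00,0.015:0.013,3.253e-03,0.984"
)

fields = record.split()
info_text, format_keys, sample_values = fields[7], fields[8], fields[9]

# INFO is KEY=VALUE pairs separated by ';' (flags such as STR have no value).
info = dict(item.partition("=")[::2] for item in info_text.split(";"))
# FORMAT keys and the sample column line up one-to-one, separated by ':'.
sample = dict(zip(format_keys.split(":"), sample_values.split(":")))

ref_reads, alt_reads = (int(x) for x in sample["AD"].split(","))
ref_mbq, alt_mbq = (int(x) for x in sample["MBQ"].split(","))
depth = int(sample["DP"])
tlod = float(info["TLOD"])

TLOD_THRESHOLD = 5.3  # default threshold as quoted in this post

error_rate = 10 ** (-alt_mbq / 10)   # Phred Q -> error probability
expected_errors = depth * error_rate

print(f"AD={ref_reads},{alt_reads}  MBQ={ref_mbq},{alt_mbq}  DP={depth}")
print(f"expected error reads: {expected_errors:.2f}, observed alt reads: {alt_reads}")
print(f"TLOD {tlod} below threshold {TLOD_THRESHOLD}: {tlod < TLOD_THRESHOLD}")
```

    With only 3 alt reads, the observed count is much closer to the expected error count than in the chr20 example, although still above it.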

