Can someone please explain what each of the 50 columns in MuTect's variant output file represents? For example:

chr1 49530 TCGxTTG C T cancer nromal 0 NOVEL COVERED 0.997898 0.997898 1 1 1 145 1 5.400145 15.759951 4.864089 11.01621 0.095238 0.02 2.047348 73 57 6 2105 215 60 60 0 0 CC 16.992822 0.014085 71 70 1 2549 39 0.894338 0.871416 (28,29,2,4) 31 11.5 35.5 16.5 0 KEEP


  • It looks to me like "context" is also not explained on that page. Is the Context column (TCGxTTG) documented anywhere? I initially thought it was giving the bases before and after the SNV base but for many of my calls, that's not consistent with the ref_allele (or alt_allele) field.

  Geraldine_VdAuwera

    Hi @Clare,

    The Context column should give you exactly that, the bases before and after your SNP (which is represented by x in the output). Have you checked the reference sequence around the position in question? The allele fields would not be informative to evaluate whether it is correct or not.
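
    To check a call by hand, you can rebuild the expected context string from the reference and compare it with the Context column. The snippet below is only a sketch: the function name, the toy sequence, and the assumption that the position is 1-based with three flanking bases on each side (as in TCGxTTG) are mine, not taken from the MuTect documentation.

```python
def context_string(reference, pos, flank=3):
    """Return the bases flanking a 1-based position, with the site itself shown as 'x'."""
    i = pos - 1  # convert the 1-based position to a 0-based string index
    if i - flank < 0 or i + flank >= len(reference):
        raise ValueError("position too close to the sequence end for this flank size")
    before = reference[i - flank:i]
    after = reference[i + 1:i + 1 + flank]
    return f"{before}x{after}"


# Toy sequence made up for illustration; position 5 is the reference base 'C'.
toy_reference = "ATCGCTTGA"
print(context_string(toy_reference, 5))
```

    If the rebuilt string differs from the Context column, the usual suspects are an off-by-one in the position or a different reference build than the one used for calling.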

  perrye

    Is it possible to generate an output that lists the reasons a particular mutation failed to pass the filters, so that we can exclude some filters in the analysis? For example, the strand bias filter is not appropriate for my dataset.

  Geraldine_VdAuwera

    Hi @perrye,

    If you upgrade to MuTect 1.1.7 (available on the downloads page), you will get a "failure_reasons" column in the callstats output file.
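
    Once you have that column, a quick tally shows which filters reject the most calls in your dataset. This is a rough sketch that assumes the file is tab-delimited, that any lines before the header row start with '#', and that multiple reasons are comma-separated; the function name and the toy rows (including the "other_filter" tag) are hypothetical.

```python
import csv
import io
from collections import Counter


def tally_failure_reasons(handle):
    """Count each tag in the failure_reasons column among calls not judged KEEP."""
    # Drop comment lines (assumed to start with '#') so the header row comes first.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    counts = Counter()
    for record in reader:
        if record["judgement"] == "KEEP":
            continue
        # Assumption: several reasons on one call are separated by commas.
        for tag in record["failure_reasons"].split(","):
            if tag:
                counts[tag] += 1
    return counts


# Toy input with only a few columns; real files have many more.
sample = (
    "## toy header line\n"
    "contig\tposition\tfailure_reasons\tjudgement\n"
    "chr1\t100\t\tKEEP\n"
    "chr1\t200\tstrand_artifact\tREJECT\n"
    "chr1\t300\tstrand_artifact,other_filter\tREJECT\n"
)
counts = tally_failure_reasons(io.StringIO(sample))
print(counts.most_common())
```

    For a real file, pass an open file handle instead of the io.StringIO object.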

    In the "failure_reasons" column (second to last, just before judgement) of the call_stats file, there's an identifier for each relevant filter. "strand_artifact" is the tag for the strand bias filter. As in the Nature Biotech paper: "Candidates are rejected when the strand-specific LOD [t_lod_fstar_forward or t_lod_fstar_reverse columns] is < 2.0 in directions where the sensitivity to have passed that threshold [power_to_detect_positive_strand_artifact and power_to_detect_negative_strand_artifact columns] is ≥ 90%."
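
    The quoted rule can be written out as a small check if you want to see which calls the strand filter alone would reject. This is a sketch of the rule as worded above, not MuTect's actual code. Which of the two power columns goes with which direction is not spelled out here, so the function simply takes one power value per direction and leaves that pairing to you (check the paper); the function and parameter names are made up.

```python
def fails_strand_filter(lod_forward, lod_reverse, power_forward, power_reverse,
                        lod_threshold=2.0, power_threshold=0.90):
    """Apply the quoted rule: reject if the LOD is low in a direction with enough power."""
    # A direction only counts against the call when there was enough power
    # (>= 90%) to have passed the LOD threshold in that direction.
    forward_fail = power_forward >= power_threshold and lod_forward < lod_threshold
    reverse_fail = power_reverse >= power_threshold and lod_reverse < lod_threshold
    return forward_fail or reverse_fail


# Made-up values for illustration only.
print(fails_strand_filter(5.4, 1.2, 0.95, 0.97))  # low reverse LOD, high power
print(fails_strand_filter(5.4, 1.2, 0.95, 0.50))  # low reverse LOD, but low power
```

    A low LOD in a direction with little power does not trigger the filter, which is why the two example calls above come out differently.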

    See the paper for more details on other filters or the existing forum synopsis:

