I run Mutect2 in conjunction with a different variant caller and I try to keep the cutoffs as similar as possible. I don't expect the results to be identical. One caller might return twice as many variants, but those will generally include most of the ones from the other caller.
However, occasionally the differences are substantial: one caller might return 5x as many variants as the other, with poor overlap between the two call sets. Clearly, something is wrong with my samples in that case, but I am not sure how to identify the underlying cause. Is there a systematic way to diagnose such quality issues?
For example, depth and evenness of coverage are important, but they are not necessarily sufficient. Are there other metrics I should be tracking?
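To make "overlap" concrete, the comparison between the two call sets could be quantified roughly as follows. This is only a minimal sketch, not my exact pipeline: it assumes each call set has already been reduced to (chrom, pos, ref, alt) tuples with consistent normalization, and the function name `overlap_stats` and the toy variants are made up for illustration.

```python
from typing import Dict, Iterable, Tuple

# Assumption: a variant is identified by (chrom, pos, ref, alt), and both
# callers' outputs were normalized the same way before comparison.
Variant = Tuple[str, int, str, str]


def overlap_stats(calls_a: Iterable[Variant],
                  calls_b: Iterable[Variant]) -> Dict[str, float]:
    """Summarize how two call sets for the same sample relate (hypothetical helper)."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    union = a | b
    return {
        "n_a": len(a),
        "n_b": len(b),
        "n_shared": len(shared),
        # How many times larger the bigger call set is (e.g. 2x vs 5x).
        "size_ratio": max(len(a), len(b)) / min(len(a), len(b)) if a and b else float("nan"),
        # Fraction of each caller's variants that the other caller also reports.
        "frac_a_in_b": len(shared) / len(a) if a else 0.0,
        "frac_b_in_a": len(shared) / len(b) if b else 0.0,
        # Jaccard index: shared variants over all variants seen by either caller.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy data, invented for illustration only.
caller_a = [("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
            ("chr2", 300, "C", "T"), ("chr3", 400, "T", "G")]
caller_b = [("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")]

stats = overlap_stats(caller_a, caller_b)
print(stats)
```

In the "expected" case, one of the two containment fractions stays high even when the size ratio is around 2. The problematic samples are the ones where the size ratio is large and both fractions are low.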