Meanings of some columns of the CountVariants table in VariantEval output

zuoxy China Member

VariantEval provides many useful statistics for evaluating callsets. However, the documentation I could find on the GATK website is somewhat out of date, which makes some columns hard to understand or guess. I have two questions:
1. What are the meanings of the nMNPs, nComplex, nSymbolic, nMixed, and nHomDerived columns in the CountVariants table of a VariantEval report?
2. Where can I find the formulas or explanations for all the calculations in VariantEval?

Thanks a lot!



  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin
    1. Those are the counts of MNPs (multi-nucleotide polymorphisms), complex variants, symbolic alleles, and mixed records (e.g. both SNP and indel alleles at the same site). As for HomDerived, I actually don't know what it is. I will have to look that one up.

    2. Ultimately, in the GATK code. We don't currently have formula details available for all VariantEval calculations in the documentation, and producing them is not a priority. If you have questions about the meaning of specific metrics or calculations, we can answer those (and add the answers to the docs), but we can't systematically add formula details at this time.
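
To make the categories in answer 1 more concrete, here is a rough Python sketch of how a record's REF and ALT alleles might be sorted into SNP, MNP, indel, complex, symbolic, and mixed. This is an illustration only, not GATK's code: the function names `allele_type` and `record_type` are hypothetical, the rules are simplified assumptions, and GATK's actual rules (especially for multi-allelic sites) may differ.

```python
from typing import List


def allele_type(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair. Simplified assumption, not GATK's implementation."""
    # Symbolic alleles are written in angle brackets (e.g. <DEL>) or as breakends.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "SYMBOLIC"
    # Same length: one base is a SNP, several bases are an MNP.
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    # Different lengths: treat it as a simple indel when the shorter allele
    # is a single base that matches the first base of the longer allele.
    shorter, longer = sorted((ref, alt), key=len)
    if len(shorter) == 1 and longer[0] == shorter[0]:
        return "INDEL"
    # Anything else (e.g. ACG -> AT) is called complex in this sketch.
    return "COMPLEX"


def record_type(ref: str, alts: List[str]) -> str:
    """Classify a whole record; ALT alleles of different types make it MIXED."""
    types = {allele_type(ref, alt) for alt in alts}
    return types.pop() if len(types) == 1 else "MIXED"
```

For example, under these assumptions `record_type("AC", ["GT"])` gives `"MNP"`, and `record_type("A", ["G", "AT"])` gives `"MIXED"` because the site carries both a SNP allele and an insertion allele.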

