
Meaning of terms in output of GenotypeConcordance

blueskypy Posts: 229 Member
edited October 2013 in Ask the GATK team

Just to make sure my understanding is correct:

HET: heterozygous
HOM_REF: homozygous reference
HOM_VAR: homozygous variant
MIXED: something like `./1`
Mismatching_Alleles: ??
UNAVAILABLE: for internal use
ALLELES_MATCH: ??
ALLELES_DO_NOT_MATCH: ??
EVAL_ONLY: ??
TRUTH_ONLY: does it actually mean the variants present in comp but not in eval, like COMP_ONLY?

How are the following computed?

Non-Reference_Discrepancy
Non-Reference_Sensitivity  
Overall_Genotype_Concordance

Thanks a lot!


Answers

  • blueskypy Posts: 229 Member

    Could anyone please help with this? I'd appreciate it!

  • Geraldine_VdAuwera Posts: 7,528 Administrator, GATK Developer admin

    @blueskypy, I've asked the tool's author to answer you, but he is very busy, so you'll need to be a little patient.

    Geraldine Van der Auwera, PhD

  • chartl Posts: 11 GATK Developer mod

    Hi blueskypy,

    Thanks for your patience. Your intuition on all counts has been correct.

    ALLELES_MATCH counts calls at the same site where the eval and comp alleles match.

    ALLELES_DO_NOT_MATCH counts calls at the same site with different alleles, such as the eval set calling a 'G' alternate allele and the comp set calling a 'T' alternate allele.

    EVAL_ONLY counts sites present only in the eval VCF and not in the comp.
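As a rough illustration of the site-level categories described above, here is a minimal Python sketch. It is not the GenotypeConcordance implementation: the function name `classify_site` is hypothetical, `None` is used here to mean "site absent from that VCF", and treating "alleles match" as exact equality of the alternate-allele sets is an assumption.

```python
from typing import Optional, Set


def classify_site(eval_alts: Optional[Set[str]],
                  comp_alts: Optional[Set[str]]) -> str:
    """Assign a site to one of the site-level categories.

    eval_alts / comp_alts are the alternate alleles called at the site in
    the eval and comp VCFs; None means the site is absent from that VCF.
    Exact set equality as the test for a match is an assumption.
    """
    if eval_alts is None and comp_alts is None:
        raise ValueError("site must be present in at least one call set")
    if comp_alts is None:
        return "EVAL_ONLY"    # present only in the eval VCF
    if eval_alts is None:
        return "TRUTH_ONLY"   # present only in the comp (truth) VCF
    if eval_alts == comp_alts:
        return "ALLELES_MATCH"
    return "ALLELES_DO_NOT_MATCH"
```

For the example above, `classify_site({"G"}, {"T"})` falls into ALLELES_DO_NOT_MATCH, while `classify_site({"G"}, None)` falls into EVAL_ONLY.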

    Non-reference sensitivity is the sensitivity of the eval calls to polymorphic calls in the comp set, that is, (# true positive)/(# true polymorphic).

    Overall genotype concordance is just (# concordant genotypes)/(# genotypes).

    This tends to be high just because reference calls predominate, so we also use the non-reference discrepancy, which, loosely, is the genotype concordance excluding concordant reference sites. See attached.

    nrd.pdf (101K)
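To make the three summary metrics concrete, here is a hedged Python sketch that computes them from a table of genotype-pair counts. Several things here are assumptions rather than the tool's documented behavior: the function name `concordance_metrics` and the `(eval genotype, comp genotype)` dictionary layout are hypothetical, only HOM_REF, HET and HOM_VAR pairs are counted, a "true positive" is taken to be any non-reference eval call at a non-reference comp genotype, and the non-reference discrepancy is taken as one minus the concordance computed after dropping concordant HOM_REF pairs. The attached PDF is the authoritative definition.

```python
from typing import Dict, Tuple

CALLED = ("HOM_REF", "HET", "HOM_VAR")
VARIANT = ("HET", "HOM_VAR")

# (eval genotype, comp genotype) -> number of genotype pairs
Counts = Dict[Tuple[str, str], int]


def _ratio(num: int, den: int) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def concordance_metrics(counts: Counts) -> Dict[str, float]:
    def n(e: str, c: str) -> int:
        return counts.get((e, c), 0)

    total = sum(n(e, c) for e in CALLED for c in CALLED)
    concordant = sum(n(g, g) for g in CALLED)
    concordant_variant = sum(n(g, g) for g in VARIANT)

    # Non-reference sensitivity: (# true positive) / (# true polymorphic).
    true_polymorphic = sum(n(e, c) for e in CALLED for c in VARIANT)
    true_positive = sum(n(e, c) for e in VARIANT for c in VARIANT)

    # Non-reference discrepancy (assumed form): drop concordant HOM_REF
    # pairs, then take 1 - concordance over what remains.
    non_ref_total = total - n("HOM_REF", "HOM_REF")

    return {
        "Non-Reference_Sensitivity": _ratio(true_positive, true_polymorphic),
        "Overall_Genotype_Concordance": _ratio(concordant, total),
        "Non-Reference_Discrepancy":
            1.0 - _ratio(concordant_variant, non_ref_total),
    }


example = {
    ("HOM_REF", "HOM_REF"): 90,
    ("HET", "HET"): 5,
    ("HOM_VAR", "HOM_VAR"): 3,
    ("HOM_REF", "HET"): 1,   # eval missed a comp variant
    ("HET", "HOM_VAR"): 1,   # variant found, genotype differs
}
metrics = concordance_metrics(example)
```

With these made-up counts, overall concordance is 98/100 because the 90 reference pairs dominate, while the non-reference discrepancy is about 0.2 (2 of the 10 pairs that are not concordant reference disagree), which shows why the second number is the more informative one.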
  • Geraldine_VdAuwera Posts: 7,528 Administrator, GATK Developer admin

    For future reference, I'm adding this information to the GenotypeConcordance tool documentation, which will be updated with the next release.


  • blueskypy Posts: 229 Member

    Thanks, Geraldine! Happy New Year!

  • Geraldine_VdAuwera Posts: 7,528 Administrator, GATK Developer admin

    Happy New Year to you too, and many successful GATK runs :)

