Variant types confusion

santiagorevale (Argentina, Member)

Dear GATK team,

I'm a bit confused about the term MIXED (and maybe SYMBOLIC), because I believe it is used differently across software tools.
If I understand correctly from the FAQ "What types of variants can GATK tools handle?" we have:

  • MIXED (combination of SNPs and indels at a single position)
    E.g. Reference = 'T', Sample = 'A,TCC'
    Here, we say it's MIXED because it combines 2 variant types (SNP, INS) for this position; we are talking about two possible alleles.

  • SYMBOLIC (generally, a very large allele or one that's fuzzy and not fully modeled; i.e. there's some event going on here but we don't know what exactly)
    E.g. Reference = 'GC', Sample = 'TTA'
    Is this example correctly classified for what SYMBOLIC stands for?

On the other hand, I've been using SnpSift (from the SnpEff package) to filter variants, but when I tried to select what I understood MIXED variants to be, I got a different result than with GATK. While checking its manual, I found what seems to be a different definition of MIXED:

  • MIXED: Multiple-nucleotide and an InDel.
    E.g. Reference = 'ATA', Sample = 'GTCAGT'

I believe SnpEff's definition of the MIXED variant type is equivalent to GATK's definition of SYMBOLIC. Am I right?

I've been told that a) a "MIXED variant" is one thing and b) a "MIXED variant call record" is another. GATK uses MIXED in sense b), while SnpEff uses it in sense a).

Is there an official definition for these terms? Is either of these tools wrong?

Thank you very much for your help.



  • Sheila (Broad Institute; Member, Broadie)

    Hi Santiago,

    I tried it out, and it looks like GATK follows the VCF spec format. Have a look at the spec for more information: http://www.1000genomes.org/wiki/analysis/variant%20call%20format/vcf-variant-call-format-version-41


  • tommycarstensen (United Kingdom, Member)

    @santiagorevale just a small comment: UG (UnifiedGenotyper) emits SNPs and indels at the same position as separate records, whereas HC (HaplotypeCaller) emits them as one VCF record. There are many tools to split your multiallelic sites into biallelic sites.

  • santiagorevale (Argentina, Member)

    Hi Sheila,

    I'm still confused.

    When you talk about alleles in the above-mentioned FAQ, what's the difference between MIXED and SYMBOLIC? Could you give me an example?

    Are these two examples correct?

    • MIXED: e.g. Reference = 'T', Sample = 'A,TCC'
    • SYMBOLIC: e.g. Reference = 'GC', Sample = 'TTA'

    Thanks in advance.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    edited July 2015

    The first is correct. The second is not -- it's an MNP or complex substitution. A symbolic allele would be something like * or <NON_REF>, where the allele is not an actual representation of nucleotides but a symbol that represents an allele that is only partly determined, if at all.

  • santiagorevale (Argentina, Member)

    Thanks, Geraldine.

    So how would GATK classify this second example? It isn't an MNP (an MNP would have the same number of nucleotides in both alleles); it's more like a complex substitution.

    Is there a name for this type of complex substitution? Is it documented in any specification? SnpEff calls this type of variant a MIXED variant, but I couldn't find any source that controls this vocabulary. Am I missing it?

    Thanks again.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Hmm, it's like a mix of an MNP and an insertion (so complex substitution is an appropriate catch-all name). I think GATK would probably consider it MIXED, but I'm not sure; you'd have to test it, e.g. by running SelectVariants on it with the variant type argument.

    If anyone controls this vocabulary it's GA4GH and the hts-specs group: https://github.com/samtools/hts-specs
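
The distinction discussed in this thread, between the type of a single allele (sense a) and the type of a whole call record (sense b), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `allele_type`, `record_type` and `split_biallelic`, the labels INS, DEL and COMPLEX, and the length/prefix rules are all hypothetical choices made for this example. It is not GATK's or SnpEff's actual classification logic, so results for edge cases such as REF = 'GC', ALT = 'TTA' may differ from what either tool reports.

```python
from typing import List, Tuple


def allele_type(ref: str, alt: str) -> str:
    """Classify ONE alternate allele against the reference allele.

    Labels and rules are assumptions made for this sketch, not any tool's API.
    """
    # Symbolic alleles are symbols rather than literal bases, e.g. "*" or "<NON_REF>".
    if alt == "*" or alt.startswith("<"):
        return "SYMBOLIC"
    if len(ref) == len(alt):
        # Same length: single-base change (SNP) or multi-base change (MNP).
        return "SNP" if len(ref) == 1 else "MNP"
    # Different lengths: a plain insertion or deletion keeps the shorter
    # allele as a prefix of the longer one (VCF-style anchor bases).
    if alt.startswith(ref):
        return "INS"
    if ref.startswith(alt):
        return "DEL"
    # Different lengths AND changed bases: the "complex substitution" case.
    return "COMPLEX"


def record_type(ref: str, alts: List[str]) -> str:
    """Classify a whole record: MIXED when its alleles are of different types."""
    # Assumption: insertions and deletions both count as INDEL at record level.
    kinds = {"INDEL" if k in ("INS", "DEL") else k
             for k in (allele_type(ref, a) for a in alts)}
    if not kinds:
        return "NO_VARIATION"
    return kinds.pop() if len(kinds) == 1 else "MIXED"


def split_biallelic(ref: str, alts: List[str]) -> List[Tuple[str, str]]:
    """Naively split a multiallelic site into one (ref, alt) pair per allele.

    Real tools also rewrite genotypes and annotations and may trim alleles.
    """
    return [(ref, alt) for alt in alts]


if __name__ == "__main__":
    print(record_type("T", ["A", "TCC"]))    # SNP + insertion in one record
    print(allele_type("GC", "TTA"))          # the second example from the thread
    print(allele_type("ATA", "GTCAGT"))      # the SnpEff manual example
    print(split_biallelic("T", ["A", "TCC"]))
```

Under these rules, the record with REF = 'T' and ALT = 'A,TCC' is MIXED in sense b) because its two alleles have different types, while 'GC' to 'TTA' and 'ATA' to 'GTCAGT' are each a single complex allele (what SnpEff's manual calls MIXED, sense a). Splitting the first record into biallelic records leaves one SNP record and one indel record, which is closer to what UG emits. How GATK itself labels the complex case still has to be checked with SelectVariants, as suggested above.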
