Haplotype Caller: too many alternative alleles found?

Hello GATK team,
I am running HaplotypeCaller on 5 genomic alignment files at once.
Although I ran indel realignment (GATK IndelRealigner) before running these files through HaplotypeCaller, I get many warnings during the run that say "too many alternative alleles found", sometimes with 10, 12 or 13 alternative alleles found.
Is that normal, or is there a step that I could have done improperly?
Thank you for your help :)
Marvin

Best Answer

  • Sheila (Broad Institute) Member, Broadie, Moderator
    Accepted Answer

    @mac
    Hi Marvin,

    It is possible to have many alternate alleles at a site. They usually occur in very messy regions or at sites where the samples are very divergent. The default setting is --max_alternate_alleles 6. If you are running on 5 samples at once or on non-diploid samples, you may consider setting it to a number higher than 6.
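
As a rough sketch of how the limit could be raised, the snippet below assembles a GATK 3-style HaplotypeCaller command line with a higher --max_alternate_alleles value. The jar path, the file names, and the value 10 are placeholders chosen for illustration, not recommendations; nothing is executed, the command is only printed.

```python
import shlex


def build_haplotypecaller_cmd(reference, bams, output, max_alt_alleles=10):
    """Return a GATK 3-style HaplotypeCaller argument list (nothing is run)."""
    # "GenomeAnalysisTK.jar" is a placeholder path to the GATK 3 jar.
    cmd = ["java", "-jar", "GenomeAnalysisTK.jar",
           "-T", "HaplotypeCaller", "-R", reference]
    # One -I argument per input BAM, so all samples are called together.
    for bam in bams:
        cmd += ["-I", bam]
    # Raise the cap on alternate alleles considered per site.
    cmd += ["--max_alternate_alleles", str(max_alt_alleles), "-o", output]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    bams = ["sample1.bam", "sample2.bam", "sample3.bam",
            "sample4.bam", "sample5.bam"]
    cmd = build_haplotypecaller_cmd("reference.fasta", bams, "calls.vcf")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Raising the limit increases the work done at each site, so it may be worth checking first whether the affected sites matter for your analysis.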

    You can also look at the sites named in the WARN statements and check in IGV whether they are messy. Such sites may not be worth raising --max_alternate_alleles for, as the tool will not be able to produce confident calls there.
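
To collect the warned sites for inspection, a minimal sketch like the one below could scan the HaplotypeCaller log and write the positions as BED intervals that IGV can load. It assumes each warning contains the phrase "too many alternative alleles found" preceded by "location CONTIG:POSITION"; that layout is an assumption about the log, so adjust the pattern to match your own WARN lines. The function names and the padding value are hypothetical.

```python
import re
from collections import Counter

# Assumed log layout: "... location 20:10000694: too many alternative alleles found ..."
MESSAGE = "too many alternative alleles found"
SITE_RE = re.compile(r"location\s+(\S+):(\d+)")


def warned_sites(log_lines):
    """Count warnings per (contig, 1-based position) found in the log lines."""
    sites = Counter()
    for line in log_lines:
        if MESSAGE not in line:
            continue
        match = SITE_RE.search(line)
        if match:
            sites[(match.group(1), int(match.group(2)))] += 1
    return sites


def to_bed(sites, pad=50):
    """Turn sites into BED lines (0-based start, exclusive end), padded for viewing."""
    lines = []
    for contig, pos in sorted(sites):
        start = max(0, pos - 1 - pad)
        end = pos + pad
        lines.append(f"{contig}\t{start}\t{end}")
    return lines


if __name__ == "__main__":
    import sys
    # Usage (file name is a placeholder): python warned_sites.py haplotypecaller.log > sites.bed
    with open(sys.argv[1]) as handle:
        print("\n".join(to_bed(warned_sites(handle))))
```

The resulting BED file can be loaded as a track in IGV alongside the BAM files to step through the flagged regions.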

    -Sheila
