The current GATK version is 3.6-0

Should I use UnifiedGenotyper or HaplotypeCaller to call variants on my data?

Geraldine_VdAuwera Administrator, Dev Posts: 10,690 admin
edited October 2014 in FAQs

Use HaplotypeCaller!

The HaplotypeCaller is a more recent and sophisticated tool than the UnifiedGenotyper. Its ability to call SNPs is equivalent to that of the UnifiedGenotyper, its ability to call indels is far superior, and it is now capable of calling non-diploid samples. It also comprises several unique functionalities such as the reference confidence model (which enables efficient and incremental variant discovery on ridiculously large cohorts) and special settings for RNAseq data.

As of GATK version 3.3, we recommend using HaplotypeCaller in all cases, with no exceptions.
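
For readers who script their runs, here is a minimal sketch of how a GATK 3.x HaplotypeCaller command line might be assembled in Python. The helper name `haplotypecaller_args` and the file names are hypothetical and only there for illustration. The jar name and flags should match the 3.x releases, but check them against the tool documentation for your exact version (some early 3.x releases may also need extra indexing arguments in GVCF mode).

```python
from typing import List, Optional


def haplotypecaller_args(
    reference: str,
    bam: str,
    output: str,
    gvcf: bool = False,
    ploidy: Optional[int] = None,
    rnaseq: bool = False,
    jar: str = "GenomeAnalysisTK.jar",  # assumed jar location, adjust as needed
) -> List[str]:
    """Build an argument list for a GATK 3.x HaplotypeCaller run (illustrative only)."""
    args = [
        "java", "-jar", jar,
        "-T", "HaplotypeCaller",
        "-R", reference,
        "-I", bam,
        "-o", output,
    ]
    if gvcf:
        # Reference confidence model: emit a per-sample GVCF for later joint genotyping.
        args += ["--emitRefConfidence", "GVCF"]
    if ploidy is not None:
        # Non-diploid or pooled samples: set the sample ploidy explicitly.
        args += ["-ploidy", str(ploidy)]
    if rnaseq:
        # Settings commonly suggested for RNAseq data in GATK 3.x; verify for your version.
        args += ["-dontUseSoftClippedBases", "-stand_call_conf", "20.0"]
    return args


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(" ".join(haplotypecaller_args("ref.fasta", "sample.bam", "sample.g.vcf", gvcf=True)))
```

The function only builds the argument list, so you can inspect it before handing it to something like `subprocess.run`.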

Caveats for older versions

If you are limited to older versions for project continuity, you may opt to use UnifiedGenotyper in the following cases:

  • If you are working with non-diploid organisms (UG can handle different levels of ploidy while older versions of HC cannot)

  • If you are working with pooled samples (also due to the HC’s limitation regarding ploidy)

  • If you want to analyze more than 100 samples at a time (for performance reasons; this applies to versions 2.x only)
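
The recommendation and the caveats above can be summed up as a small decision helper. This is only a sketch: the function name `recommended_caller` and its parameters are hypothetical, and for older versions UnifiedGenotyper is an option you may choose, not a requirement.

```python
def recommended_caller(gatk_version: str, non_diploid: bool = False,
                       pooled: bool = False, n_samples: int = 1) -> str:
    """Return the caller suggested by this FAQ for a given setup (illustrative only)."""
    # Parse "3.6-0" or "2.8" into (major, minor), padding a missing minor with 0.
    parts = [int(p) for p in gatk_version.split("-")[0].split(".")[:2]]
    major, minor = (parts + [0, 0])[:2]

    # From 3.3 onward, HaplotypeCaller is recommended in all cases.
    if (major, minor) >= (3, 3):
        return "HaplotypeCaller"
    # Older HaplotypeCaller versions cannot handle non-diploid or pooled samples.
    if non_diploid or pooled:
        return "UnifiedGenotyper"
    # In versions 2.x, very large cohorts were a performance problem for HaplotypeCaller.
    if major == 2 and n_samples > 100:
        return "UnifiedGenotyper"
    return "HaplotypeCaller"


if __name__ == "__main__":
    print(recommended_caller("3.6-0", non_diploid=True))
    print(recommended_caller("2.8", n_samples=150))
```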

Geraldine Van der Auwera, PhD

Comments

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,690 admin

    This document has been updated to reflect status as of GATK version 3.3. Older comments and questions have been moved to this archival thread: http://gatkforums.broadinstitute.org/discussion/4744/questions-about-using-ug-vs-hc-out-of-date

    Geraldine Van der Auwera, PhD

  • yd44@duke.edu Member Posts: 1

    Hi, since GATK is now at version 3.5, which one would you suggest for this version? Thanks!

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,690 admin

    Definitely HaplotypeCaller; this will apply to all future versions unless otherwise stated.

    Geraldine Van der Auwera, PhD

  • thedam Barcelona Member Posts: 2

    Hi,
    If I use HaplotypeCaller, is it still necessary to run these steps: RealignerTargetCreator, IndelRealigner, BaseRecalibrator?
    I've read somewhere that HaplotypeCaller can now be applied right after MarkDuplicates. Is that true?
    P.S. Which should be applied: MarkDuplicatesWithMateCigar or MarkDuplicates?
    thx

  • Sheila Broad Institute Member, Broadie, Moderator, Dev Posts: 4,284 admin

    @thedam
    Hi,

    I'm not sure where you read that you can skip all those steps! We still recommend those, as they are indeed important. Have a look at the Best Practices for more information.

    This article should help with marking duplicates.

    -Sheila

  • thedam Barcelona Member Posts: 2

    Hi, thanks for the answer!
    Well, I've read it here:
    https://www.broadinstitute.org/gatk/events/slides/1506/GATKwr8-B-2-Indel_realignment.pdf
    but maybe I didn't understand it correctly. Slide 25:

    "Is realignment still necessary with latest software?

    Latest tools being implemented for variant discovery (HaplotypeCaller, MuTect 2, Platypus) all include some sort of assembly step (for which upstream realignment is not really helpful)."

    As I understand it, HaplotypeCaller does its own 'realignment' (by assembling the reads around the region), so IndelRealigner is not needed.

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,690 admin

    @thedam you're right that indel realignment is not as important anymore, but we still recommend running it (and we do so in our production pipeline) because we think it may still improve results of BaseRecalibrator by removing mismatches that would otherwise constitute noise.

    Geraldine Van der Auwera, PhD
