wrong candidate haplotype chosen by HaplotypeCaller

cooketho Member Posts: 11
edited January 2013 in Ask the GATK team

I've been seeing some apparent errors with HaplotypeCaller that I think could be related to how it chooses candidate haplotypes when performing multi-sample calling. Please see the example files I've uploaded to the server (cooketho_20130103.tar.gz). For instance, if you look at position 3511 in sample 2, there are 14 non-reference reads and 0 reference reads. When HaplotypeCaller is run on this sample alone, it calls the locus homozygous non-reference, which seems to me to be the correct behavior. But when it is run on all 14 samples together, it doesn't call a SNP at this locus. Repeating the run in debug mode shows that the immediate cause is that 11 candidate haplotypes were found, and not a single one of them carries the non-reference allele at position 3511. Why?
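
The check described above (does any candidate haplotype carry a non-reference base at the position of interest?) can be sketched roughly as follows. This is only an illustration: it assumes the candidate haplotypes have been copied out of the debug output as plain strings covering the same window as the reference, with no indels. The sequences, the window start, and the function name `haplotypes_with_alt` are all made up for the example; none of this is GATK output or GATK API.

```python
def haplotypes_with_alt(reference, haplotypes, window_start, position):
    """Return indices of haplotypes that differ from the reference at `position`.

    Assumptions (hypothetical, for illustration only):
      - `window_start` is the 1-based coordinate of reference[0]
      - haplotypes are the same length as the reference (no indels);
        any haplotype of a different length is skipped
    """
    offset = position - window_start
    if not 0 <= offset < len(reference):
        raise ValueError("position falls outside the window")
    ref_base = reference[offset]
    return [
        i
        for i, hap in enumerate(haplotypes)
        if len(hap) == len(reference) and hap[offset] != ref_base
    ]


# Made-up 10-bp window starting at coordinate 3506, so 3511 is offset 5.
reference = "ACGTACGTAC"
candidates = [
    "ACGTACGTAC",  # identical to the reference
    "ACGAACGTAC",  # differs at 3509 only
]
carriers = haplotypes_with_alt(reference, candidates, 3506, 3511)
print(carriers)  # prints [] because no candidate differs at 3511
```

An empty list here corresponds to the situation in the debug run: if no candidate haplotype carries the non-reference allele, the site cannot be called, however many reads support it.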

I came across an earlier post suggesting that increasing the --minPruning value can help in some cases, but I tried this to no avail.


My organism is a plant, and is considerably more heterozygous than human, but changing the --heterozygosity value did not appear to help either. Double check me on this if you like.

Can you please suggest a fix, or perhaps release some documentation on how HaplotypeCaller selects candidate haplotypes?

P.S. Any idea of when the source will be released to the public, or when a more comprehensive manual will be released? Would be very helpful for figuring out what is going on in cases like this.



Best Answers


  • cooketho Member Posts: 11

    Thanks Ryan!
    When will a version of GATK be available that has this new option? Would it be possible to send me a copy of the internal development version in the interim, just for testing purposes?
