Bug Bulletin: The recent 3.2 release fixes many issues. If you run into a problem, please try the latest version before posting a bug report, as your problem may already have been solved.

wrong candidate haplotype chosen by HaplotypeCaller

cookethocooketho Posts: 11Member
edited January 2013 in Ask the GATK team

I've been experiencing some apparent errors with HaplotypeCaller that I think could be related to how it chooses candidate haplotypes when performing multi-sample calling. Please see the example files I've uploaded to the server (cooketho_20130103.tar.gz). For instance if you look at position 3511 in sample 2, there are 14 non-reference reads and 0 reference reads. When HaplotypeCaller is run with just this sample, it calls this locus homozygous non-reference, which seems to me to be the correct behavior. But when run with all 14 samples, it doesn't call a SNP at this locus. Repeating the run in debug mode shows that the (immediate) cause is that there were 11 candidate haplotypes found, and not a single one of them had the non-reference allele at position 3511. Why?

I came across an earlier post that suggested in some cases increasing the --minPruning value can be of use, but I tried this to no avail.

http://gatkforums.broadinstitute.org/discussion/1764/haplotypecaller-in-cohorts

My organism is a plant, and is is considerably more heterozygous than human, but changing the --heterozygosity value did not appear to help either. Double check me on this if you like.

Can you please suggest a fix, or perhaps release some documentation on how HaplotypeCaller selects candidate haplotypes?

P.S. Any idea of when the source will be released to the public, or when a more comprehensive manual will be released? Would be very helpful for figuring out what is going on in cases like this.

Thanks! Tom

Post edited by Geraldine_VdAuwera on

Best Answers

Answers

Sign In or Register to comment.