The current GATK version is 3.7-0
Examples: Monday, today, last week, Mar 26, 3/26/04

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Get notifications!


You can opt in to receive email notifications, for example when your questions get answered or when there are new announcements, by following the instructions given here.

Got a problem?


1. Search using the upper-right search box, e.g. using the error message.
2. Try the latest version of tools.
3. Include tool and Java versions.
4. Tell us whether you are following GATK Best Practices.
5. Include relevant details, e.g. platform, DNA- or RNA-Seq, WES (+capture kit) or WGS (PCR-free or PCR+), paired- or single-end, read length, expected average coverage, somatic data, etc.
6. For tool errors, include the error stacktrace as well as the exact command.
7. For format issues, include the result of running ValidateSamFile for BAMs or ValidateVariants for VCFs.
8. For weird results, include an illustrative example, e.g. attach IGV screenshots according to Article#5484.
9. For a seeming variant that is uncalled, include results of following Article#1235.

Did we ask for a bug report?


Then follow instructions in Article#1894.

Formatting tip!


Wrap blocks of code, error messages and BAM/VCF snippets--especially content with hashes (#)--with lines with three backticks ( ``` ) each to make a code block as demonstrated here.

Jump to another community
Picard 2.10.2 is now available at https://github.com/broadinstitute/picard/releases.
GATK version 4.beta.2 (i.e. the second beta release) is out. See the GATK4 BETA page for download and details.

wrong candidate haplotype chosen by HaplotypeCaller

cookethocooketho Member
edited January 2013 in Ask the GATK team

I've been experiencing some apparent errors with HaplotypeCaller that I think could be related to how it chooses candidate haplotypes when performing multi-sample calling. Please see the example files I've uploaded to the server (cooketho_20130103.tar.gz). For instance if you look at position 3511 in sample 2, there are 14 non-reference reads and 0 reference reads. When HaplotypeCaller is run with just this sample, it calls this locus homozygous non-reference, which seems to me to be the correct behavior. But when run with all 14 samples, it doesn't call a SNP at this locus. Repeating the run in debug mode shows that the (immediate) cause is that there were 11 candidate haplotypes found, and not a single one of them had the non-reference allele at position 3511. Why?

I came across an earlier post that suggested in some cases increasing the --minPruning value can be of use, but I tried this to no avail.

http://gatkforums.broadinstitute.org/discussion/1764/haplotypecaller-in-cohorts

My organism is a plant, and is is considerably more heterozygous than human, but changing the --heterozygosity value did not appear to help either. Double check me on this if you like.

Can you please suggest a fix, or perhaps release some documentation on how HaplotypeCaller selects candidate haplotypes?

P.S. Any idea of when the source will be released to the public, or when a more comprehensive manual will be released? Would be very helpful for figuring out what is going on in cases like this.

Thanks!
Tom

Post edited by Geraldine_VdAuwera on

Best Answers

Answers

  • Thanks Ryan!
    When will a version of GATK be available that has this new option? Would it be possible to send me a copy of the internal development version in the interim--just for testing purposes?
    Tom

Sign In or Register to comment.