
Expected file size - Haplotype Caller

bvecchio Member Posts: 7
edited October 2012 in Ask the GATK team

Hi All,
I've been trying to run HaplotypeCaller on my 50x coverage exome data. The BAM file being processed is about 12 GB. Each time, the caller runs for many hours and then the output contains only the VCF header, with no errors reported. I'm wondering whether this is due to limited space on my drives, or whether the expected output file is much larger than I am anticipating.


GenomeAnalysisTK.jar -T HaplotypeCaller -R Homo_sapiens_assembly19.fasta -I input.bam --dbsnp dbsnp_132.b37.nochr.vcf -stand_call_conf 30 -stand_emit_conf 10 -o output.Haplotypes.vcf


  • Geraldine_VdAuwera Administrator, Dev Posts: 11,124

    Hi there,

    Can you clarify -- is the run completing "successfully" (the program finishes without error message) or are you saying you're not seeing anything when you look in the file after some hours but before the run is complete?

    Geraldine Van der Auwera, PhD

  • bvecchio Member Posts: 7

    The run completes successfully, without errors. I view the file after completion, and there is no data other than the header.

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,124

    I see. Have you tried running on a subset of your file? Since you're working with an exome, I assume you have target intervals? You can pass those with -L; if your problem is due to file size that may work around it.

    Geraldine Van der Auwera, PhD
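
The suggestion above (rerun on a subset with -L, then see whether anything beyond the header is written) could be sketched roughly as follows. This is only an illustration under assumptions: `targets.interval_list` and `subset.Haplotypes.vcf` are hypothetical file names, the helper function names are made up for this sketch, and the remaining arguments are copied from the command in the original post.

```shell
#!/bin/sh
# Count variant records in a VCF, i.e. lines that do not start with '#'.
# A header-only file yields 0. 'grep -c' exits non-zero when the count
# is 0, so '|| true' keeps the helper safe under 'set -e'.
count_vcf_records() {
    grep -vc '^#' "$1" || true
}

# Rerun HaplotypeCaller restricted to the exome targets.
# ASSUMPTION: targets.interval_list and subset.Haplotypes.vcf are
# placeholder names; the other arguments come from the original post.
run_subset() {
    java -jar GenomeAnalysisTK.jar -T HaplotypeCaller \
        -R Homo_sapiens_assembly19.fasta \
        -I input.bam \
        --dbsnp dbsnp_132.b37.nochr.vcf \
        -stand_call_conf 30 -stand_emit_conf 10 \
        -L targets.interval_list \
        -o subset.Haplotypes.vcf
}

# Typical use (not executed here):
#   df -h .                                  # free space on the output drive
#   run_subset
#   count_vcf_records subset.Haplotypes.vcf  # 0 means header only
```

If the record count is still 0 on a small interval set, file size or disk space is unlikely to be the whole explanation, and the input BAM and reference would be the next things to check.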
