HaplotypeCaller taking too long

Hi,
When I run the pipeline according to the Best Practices, HaplotypeCaller (HC) takes about 10 minutes on a 60MB fastq from a targeted panel, but 6 hours on a 150MB fastq for the same pipeline and target region. Any idea what would explain such a stark jump in runtime? Also, is there any way to reduce it? Can I use the -Xmx and -Xms arguments to speed it up, given that memory is not an issue?
Any help will be appreciated.
Thanks!

Answers

  • nitinCelmatix, NYC, Member

    Just to add to it, 60MB and 150MB are the sizes of the gzipped fastq files (.fastq.gz).
    Any way I can diagnose the problem and resolve it?
    Thanks!

  • nitinCelmatix, NYC, Member

    Some more information to help you diagnose:
    The bam file is about 180MB. Here is the command that I use to run HC:

    java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -drf DuplicateRead -R hg19.fa -I sample.bam -o sample.g.vcf -L intervals.bed -ERC GVCF

    Here is the progress (with a smaller bam file that only contains chr5):
    INFO 16:19:56,032 ProgressMeter - chr5:309908 3.45296324E8 30.0 s 0.0 s 26.4% 113.0 s 83.0 s
    INFO 16:20:26,034 ProgressMeter - chr5:422907 3.45438404E8 60.0 s 0.0 s 26.6% 3.8 m 2.8 m
    INFO 16:20:56,036 ProgressMeter - chr5:433905 3.4550422E8 90.0 s 0.0 s 26.6% 5.6 m 4.1 m
    INFO 16:21:26,039 ProgressMeter - chr5:434663 3.4550422E8 120.0 s 0.0 s 26.6% 7.5 m 5.5 m
    INFO 16:21:56,042 ProgressMeter - chr5:435731 3.45523647E8 2.5 m 0.0 s 26.7% 9.4 m 6.9 m
    INFO 16:22:26,045 ProgressMeter - chr5:437668 3.45523647E8 3.0 m 0.0 s 26.7% 11.2 m 8.2 m
    INFO 16:22:56,048 ProgressMeter - chr5:5223924 3.45770196E8 3.5 m 0.0 s 26.8% 13.1 m 9.6 m
    INFO 16:23:26,051 ProgressMeter - chr5:5263102 3.45914532E8 4.0 m 0.0 s 26.9% 14.9 m 10.9 m
    INFO 16:23:56,054 ProgressMeter - chr5:6632886 3.46107621E8 4.5 m 0.0 s 26.9% 16.7 m 12.2 m
    INFO 16:24:26,061 ProgressMeter - chr5:6666877 3.46438901E8 5.0 m 0.0 s 27.0% 18.5 m 13.5 m
    INFO 16:24:56,069 ProgressMeter - chr5:6845386 3.46518376E8 5.5 m 0.0 s 27.1% 20.3 m 14.8 m
    INFO 16:25:26,071 ProgressMeter - chr5:7886097 3.4700375E8 6.0 m 1.0 s 27.2% 22.1 m 16.1 m
    INFO 16:25:56,074 ProgressMeter - chr5:7899978 3.47402902E8 6.5 m 1.0 s 27.3% 23.8 m 17.3 m
    INFO 16:26:26,076 ProgressMeter - chr5:31410796 3.4779354E8 7.0 m 1.0 s 27.4% 25.6 m 18.6 m
    INFO 16:26:56,079 ProgressMeter - chr5:31515040 3.49199777E8 7.5 m 1.0 s 27.6% 27.2 m 19.7 m
    INFO 16:27:26,083 ProgressMeter - chr5:31638977 3.49700888E8 8.0 m 1.0 s 27.7% 28.9 m 20.9 m
    INFO 16:27:56,086 ProgressMeter - chr5:31983210 3.50309791E8 8.5 m 1.0 s 27.8% 30.6 m 22.1 m
    INFO 16:28:26,088 ProgressMeter - chr5:32073889 3.51286139E8 9.0 m 1.0 s 27.9% 32.3 m 23.3 m
    INFO 16:28:56,090 ProgressMeter - chr5:32089045 3.51370877E8 9.5 m 1.0 s 27.9% 34.0 m 24.5 m
    INFO 16:29:26,093 ProgressMeter - chr5:32090409 3.51370877E8 10.0 m 1.0 s 28.0% 35.8 m 25.8 m
    INFO 16:29:56,097 ProgressMeter - chr5:32109485 3.51913846E8 10.5 m 1.0 s 28.1% 37.4 m 26.9 m

    As you can see, it reached the first 26% in 30 seconds, but then got stuck on the next 2% for the next 15 minutes or so. So, clearly, something in that region is throwing HC off.
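    One way to narrow down where HC stalls is to scan the ProgressMeter output for consecutive updates that barely advance along the contig. The following is only a rough sketch, assuming the log lines look like the ones pasted above; the function names and the 100,000-base threshold are made up for illustration and are not part of GATK.

```python
import re
from datetime import datetime

# Matches ProgressMeter lines shaped like the ones in the log above, e.g.
#   INFO 16:19:56,032 ProgressMeter - chr5:309908 3.45296324E8 30.0 s ...
# Header and summary lines have no "contig:position" field and are skipped.
LINE_RE = re.compile(
    r"INFO\s+(\d{2}:\d{2}:\d{2}),\d+\s+ProgressMeter\s+-\s+(\S+):(\d+)\s"
)


def parse_progress(lines):
    """Yield (timestamp, contig, position) for each matching log line."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            ts = datetime.strptime(m.group(1), "%H:%M:%S")
            yield ts, m.group(2), int(m.group(3))


def slow_intervals(lines, min_bases=100_000):
    """Return (contig, start, end, seconds) for pairs of consecutive
    updates on the same contig that advanced fewer than min_bases.

    min_bases is an arbitrary threshold chosen for illustration.
    """
    records = list(parse_progress(lines))
    slow = []
    for (t0, c0, p0), (t1, c1, p1) in zip(records, records[1:]):
        if c0 != c1:
            continue  # skip the jump from one contig to the next
        secs = (t1 - t0).total_seconds()
        if secs < 0:
            secs += 24 * 3600  # the log has no date, so handle midnight
        if p1 - p0 < min_bases:
            slow.append((c0, p0, p1, secs))
    return slow


if __name__ == "__main__":
    import sys

    # Usage: python find_slow.py < hc.log
    for contig, start, end, secs in slow_intervals(sys.stdin):
        print(f"{contig}:{start}-{end}\t{secs:.0f} s")
```

    With a targeted panel the reported position also jumps between intervals, so this is just a heuristic: a window where the position crawls for many updates in a row (for example chr5:433905 to chr5:437668 above) is a candidate for a closer look, such as checking the depth there.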

    Any insight/suggestions would be very helpful.

    Thanks!

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie admin

    Ah, that makes a lot of sense. Thanks for reporting your solution, I was just starting to look into this one and am relieved to hear I can stop ;)
