Significantly different runtimes for HaplotypeCaller on same BAMs & region

trgall Posts: 13 Member

I have a region that used to run fine but now seems to hang. Here is part of the log from the run with version 2.5:
17:66743059-81195210.it1hc.stdout-INFO 21:59:30,131 ProgressMeter - 17:73500213
17:66743059-81195210.it1hc.stdout-INFO 22:00:30,949 ProgressMeter - 17:73500549
17:66743059-81195210.it1hc.stdout-INFO 22:01:31,026 ProgressMeter - 17:73500549
17:66743059-81195210.it1hc.stdout-INFO 22:02:31,045 ProgressMeter - 17:73500549
17:66743059-81195210.it1hc.stdout:INFO 22:03:31,063 ProgressMeter - 17:73500593
17:66743059-81195210.it1hc.stdout-INFO 22:04:31,098 ProgressMeter - 17:73501295

As you can see, it got past 17:73500549 in about 3 minutes. I am now rerunning the same BAMs over the same region with version 2.6, and it has been stuck on 17:73500549 for more than 12 hours. Even when rerunning the same version, I have noticed that runs usually take about the same time, but every so often one takes orders of magnitude longer. I am using -dcov 200, so I know there is some sampling variation (although in this region everyone has between 50 and 100 reads of coverage), but a difference between 3 minutes and more than 600 minutes seems excessive.
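For anyone who wants to measure stalls like this rather than eyeball them, here is a rough Python sketch that scans ProgressMeter lines and reports how long each locus was repeated. It assumes the lines look like the grep excerpt above (an `INFO HH:MM:SS,mmm ProgressMeter - <locus>` pattern); the function names and the 90-second threshold are made up for illustration and are not part of GATK.

```python
import re

# Matches the timestamp and locus in lines shaped like the excerpt above.
# Assumption: real logs may carry extra columns; only this prefix is used.
LINE_RE = re.compile(
    r"INFO\s+(\d{2}):(\d{2}):(\d{2}),(\d{3})\s+ProgressMeter\s+-\s+(\S+)"
)

SAMPLE = """\
17:66743059-81195210.it1hc.stdout-INFO 21:59:30,131 ProgressMeter - 17:73500213
17:66743059-81195210.it1hc.stdout-INFO 22:00:30,949 ProgressMeter - 17:73500549
17:66743059-81195210.it1hc.stdout-INFO 22:01:31,026 ProgressMeter - 17:73500549
17:66743059-81195210.it1hc.stdout-INFO 22:02:31,045 ProgressMeter - 17:73500549
17:66743059-81195210.it1hc.stdout:INFO 22:03:31,063 ProgressMeter - 17:73500593
17:66743059-81195210.it1hc.stdout-INFO 22:04:31,098 ProgressMeter - 17:73501295
""".splitlines()


def parse_progress(lines):
    """Yield (seconds, locus) for each ProgressMeter line, skipping others."""
    day_offset = 0.0
    previous = None
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        h, mi, s, ms = (int(g) for g in m.group(1, 2, 3, 4))
        t = h * 3600 + mi * 60 + s + ms / 1000.0 + day_offset
        # The log has no date, so treat a backwards jump as crossing midnight.
        if previous is not None and t < previous:
            day_offset += 86400.0
            t += 86400.0
        previous = t
        yield t, m.group(5)


def find_stalls(lines, min_seconds=90.0):
    """Return [(locus, seconds)] for loci repeated for at least min_seconds."""
    stalls = []
    current = first = last = None
    for t, locus in parse_progress(lines):
        if locus == current:
            last = t
            continue
        if current is not None and last - first >= min_seconds:
            stalls.append((current, last - first))
        current, first, last = locus, t, t
    if current is not None and last - first >= min_seconds:
        stalls.append((current, last - first))
    return stalls


if __name__ == "__main__":
    for locus, seconds in find_stalls(SAMPLE):
        print(f"{locus} repeated for {seconds:.0f} s")
```

On the excerpt above this flags only 17:73500549, which was repeated for about two minutes. Note that it measures the time between the first and last report of a locus, so a locus still being reported at the end of the log is counted only up to the last line written.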

Any suggestions on making runtimes for regions more predictable?

Tim

Answers

  • Geraldine_VdAuwera Posts: 10,469 Administrator, Dev admin

    Hmm, that does seem excessive. Could you please try re-running that region with the latest nightly build? If you still see problematic runtimes, we'll ask for a test file that reproduces the problem, so we can debug it locally and find out what is causing the slowdown.

    Geraldine Van der Auwera, PhD

  • ebanks Broad Institute Posts: 698 Member, Administrator, Broadie, Moderator, Dev admin

    How many samples/BAMs are you running with?

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

  • trgall Posts: 13 Member

    ~80 samples with ~110 BAMs.