What does each data thread stand for in HaplotypeCaller?

I'm using multi-threading for HaplotypeCaller by setting the nct option.
However, I found that the speedup isn't proportional to the number of data threads.
I tried nct values of 8, 12, 16, and 24 on my machine and got speedups of 4.1x, 4.2x, 4.2x, and 4.2x, respectively. It seems there is an upper bound on the performance gain from multi-threading in HaplotypeCaller.
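
One way to read these numbers is through Amdahl's law, which models a run as a serial part plus a perfectly parallel part. The sketch below is only an illustration under that assumption; it says nothing about how HaplotypeCaller actually schedules work. The function names are made up for this example, and the inputs are the four measurements quoted above.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Predicted speedup under Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_threads)


def implied_parallel_fraction(speedup, n_threads):
    """Invert Amdahl's law: the parallel fraction that would yield `speedup`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_threads)


# Thread counts and speedups reported in the post above.
observed = {8: 4.1, 12: 4.2, 16: 4.2, 24: 4.2}

for n_threads, speedup in observed.items():
    p = implied_parallel_fraction(speedup, n_threads)
    ceiling = 1.0 / (1.0 - p)  # speedup limit as n_threads grows without bound
    print(f"nct={n_threads:2d}: implied parallel fraction {p:.3f}, ceiling {ceiling:.1f}x")
```

If a single fixed serial fraction explained the plateau, the implied fraction would be roughly the same at every thread count. With these four data points it drifts downward as nct grows, which hints that something beyond a fixed serial portion may be involved. That is an inference from the measurements only, not a description of the implementation.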

I'm wondering what each data thread stands for in HaplotypeCaller. PairHMM is used to calculate the likelihood array in each active region. Is each read-haplotype pair in a region treated as one data thread and mapped to a CPU thread? Or is the whole calculation for each region treated as one data thread?



  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)


    Have a look at this article which should help.

    Also, look at Geraldine's response from February 25th in this thread.


    P.S. For a more high-level overview of multi-threading, have a look at this article.

  • whbldhwj (LA, CA; Member)

    Thanks, @Sheila. I had actually looked through the latter two threads before. It seems that nt is for task-level parallelism and nct is for CPU-level parallelism. For example, in HaplotypeCaller we have several active regions, and in each region we need to compute several read-haplotype pairs using PairHMM. It seems reasonable to assume that by setting nct to a number greater than 1, we are using multiple CPU threads for each active region. If so, a region with too few read-haplotype pairs would leave some threads idle, which would explain why the speedup hits an upper bound as the number of CPU threads increases. I just want to make sure that my understanding is correct.
    By the way, the Intel page doesn't answer my question either.


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Hi @whbldhwj, the way multithreading is implemented in HaplotypeCaller is not transparent -- there's a lot going on in there and it's not as simple as just dispatching read-haplotype pairs to separate threads. To be frank, the architecture of HaplotypeCaller is awfully complicated and is in the process of being rewritten in a saner framework. As a result, this is not something we currently provide end-user documentation for. If you're interested in implementation details you'll have to look at the code yourself -- but you'd be better off saving yourself the hassle and looking into scatter-gather to parallelize HaplotypeCaller instead of (or as a complement to) multithreading.
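
As a rough illustration of the scatter step mentioned above, the sketch below builds one HaplotypeCaller command per contig so that each can run as an independent job. It is a minimal sketch under stated assumptions: the jar name, file names, and contig list are hypothetical, and the flags follow the GATK 3 style command line that the nct option in this thread implies, so check them against the version you run. The gather step (merging the per-contig VCFs) is left out on purpose.

```python
import shlex


def scatter_commands(contigs, reference, bam, jar="GenomeAnalysisTK.jar", nct=1):
    """Return (output_vcf, argv) pairs, one HaplotypeCaller job per contig.

    All paths and the jar name are placeholders for illustration.
    """
    jobs = []
    for contig in contigs:
        out_vcf = f"shard.{contig}.vcf"  # hypothetical per-shard output name
        argv = [
            "java", "-jar", jar,
            "-T", "HaplotypeCaller",
            "-R", reference,
            "-I", bam,
            "-L", contig,   # restrict this job to a single contig
            "-o", out_vcf,
        ]
        if nct > 1:
            # Optionally combine scatter-gather with a modest thread count.
            argv += ["-nct", str(nct)]
        jobs.append((out_vcf, argv))
    return jobs


# Example: print the commands instead of running them.
for out_vcf, argv in scatter_commands(["20", "21"], "reference.fasta", "sample.bam", nct=4):
    print(" ".join(shlex.quote(a) for a in argv))
```

Each printed command can be submitted to a cluster scheduler or run in parallel on one machine. Because the jobs share nothing, the speedup here is limited mainly by the largest shard and by available memory rather than by thread coordination inside a single process.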
