Number of cores used by HaplotypeCaller in GVCF mode

Hi,

I am running HaplotypeCaller (v4.0.1.2, not the Spark version) on some WGS samples on an SGE (Sun Grid Engine) cluster. When I submit a job to the cluster, I request 1 core (on an 8-core processor with 1 thread per core). I am aware that with the non-Spark HaplotypeCaller I cannot specify the number of cores it should use for parallelization; I can only use --native-pair-hmm-threads (default: 4) to make the PairHMM step faster.

Does HaplotypeCaller use cores according to availability? That is, if I assign 1 core to the job, will it still try to use other cores on that processor?

Kindly let me know if you need any more information for clarity.

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @prasundutta87
    Hi,

    If you only assign 1 core to the job, only 1 core will be used, regardless of how many cores are available.

    However, can you post your command? You may also find this dictionary entry on Spark useful.

    -Sheila

  • prasundutta87, Edinburgh (Member)
    edited February 23

    Hi,

    We have a shared Oracle Grid Engine based cluster computing facility at our university. We submit jobs to it, and an example submission script is:

    #!/bin/bash

    # Grid Engine options
    #$ -N gvcf_maker_30x
    #$ -cwd
    #$ -M prasundutta87@gmail.com
    #$ -m bea
    #$ -t 1:37
    #$ -pe sharedmem 8
    #$ -l h_vmem=2G
    #$ -l h_rt=480:00:0

    # Initialise the modules framework
    . /etc/profile.d/modules.sh

    module load java/jdk/1.8.0

    # Using GATK Version: 4.0.1.2
    animal=$(head -n "$SGE_TASK_ID" 30x_animals_list.txt | tail -n 1)

    java -Xmx4g -jar gatk-package-4.0.1.2-local.jar HaplotypeCaller -R GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa --native-pair-hmm-threads 8 -I "$animal"_sorted_markduped_readgroup.bam -ERC GVCF -O "$animal".g.vcf

    The problems:

    1) The first problem is that when I set --native-pair-hmm-threads 8, HaplotypeCaller still uses 2 cores instead of all 8 assigned cores. On our system each core has 1 thread (not 2), and my understanding was that if I tell HaplotypeCaller to use 8 threads for its PairHMM algorithm, it will use all 8 cores. It is not doing that.

    2) The second problem is that if I request 1 core in my qsub script but use --native-pair-hmm-threads 16, HaplotypeCaller spills onto other cores of the processor, jeopardising other jobs sharing the same processor. I expected it to limit itself to 1 core, which was not the case.

    Is there an explanation for these two cases? Please correct me if I have misunderstood how HaplotypeCaller handles parallelization.

  • Sheila, Broad Institute (Member, Broadie, Moderator)
    edited March 1

    @prasundutta87
    Hi,

    Sorry for the delay. I am asking someone on the team for help and will get back to you soon.

    -Sheila

    EDIT: This issue may also interest you.

  • prasundutta87, Edinburgh (Member)

    No problem, @Sheila, and thanks for sharing the link.

  • prasundutta87, Edinburgh (Member)

    Hi,

    I get this output before the GVCF actually starts being written:

    WARNING: We recommend that you use a minimum of 4 GB of virtual memory when running Java 1.8.0_74 on Eddie. Please see the following for details:
    https://www.wiki.ed.ac.uk/display/ResearchServices/Java
    00:10:00.392 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/exports/eddie3_homes_local/s0928794/tools/gatk-package-4.0.1.2-local.jar!/com/intel/gkl/native/libgkl_compression.so
    00:10:00.631 INFO HaplotypeCaller - ------------------------------------------------------------
    00:10:00.632 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.1.2
    00:10:00.632 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
    00:10:00.632 INFO HaplotypeCaller - Executing as s0928794@node2i17.ecdf.ed.ac.uk on Linux v3.10.0-327.36.3.el7.x86_64 amd64
    00:10:00.632 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_74-b02
    00:10:00.633 INFO HaplotypeCaller - Start Date/Time: 20 February 2018 00:10:00 GMT
    00:10:00.633 INFO HaplotypeCaller - ------------------------------------------------------------
    00:10:00.633 INFO HaplotypeCaller - ------------------------------------------------------------
    00:10:00.634 INFO HaplotypeCaller - HTSJDK Version: 2.14.1
    00:10:00.634 INFO HaplotypeCaller - Picard Version: 2.17.2
    00:10:00.634 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
    00:10:00.634 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    00:10:00.634 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    00:10:00.634 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    00:10:00.634 INFO HaplotypeCaller - Deflater: IntelDeflater
    00:10:00.634 INFO HaplotypeCaller - Inflater: IntelInflater
    00:10:00.634 INFO HaplotypeCaller - GCS max retries/reopens: 20
    00:10:00.634 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
    00:10:00.634 INFO HaplotypeCaller - Initializing engine
    00:10:14.256 INFO HaplotypeCaller - Done initializing engine
    00:10:18.515 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
    00:10:18.515 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
    00:10:19.323 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/exports/eddie3_homes_local/s0928794/tools/gatk-package-4.0.1.2-local.jar!/com/intel/gkl/native/libgkl_utils.so
    00:10:19.325 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/exports/eddie3_homes_local/s0928794/tools/gatk-package-4.0.1.2-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
    00:10:19.380 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
    00:10:19.381 INFO IntelPairHmm - Available threads: 16
    00:10:19.381 INFO IntelPairHmm - Requested threads: 8
    00:10:19.381 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation

  • LouisB, Broad Institute (Member, Broadie, Dev)

    Hmm, that definitely looks like the threading should be working. I wouldn't expect it to saturate 8 cores, because PairHMM is only a fraction of the total runtime, so we expect diminishing returns as you increase the threading. But I would expect to see more than 2 cores used.
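To put rough numbers on the diminishing returns described above, here is a small sketch using Amdahl's law, speedup = 1 / ((1 - p) + p / n). The function name amdahl_speedup is made up for this example, and p = 0.5 (half of the runtime spent in the threaded PairHMM step) is purely an illustrative guess, not a measured GATK figure.

```shell
#!/bin/bash
# Amdahl's law: speedup = 1 / ((1 - p) + p / n)
#   p = fraction of the runtime that is multi-threaded (assumed, not measured)
#   n = number of threads
amdahl_speedup() {
    LC_ALL=C awk -v p="$1" -v n="$2" \
        'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

# p=0.5 is a hypothetical value chosen only to illustrate the shape of the curve.
for n in 1 2 4 8; do
    echo "threads=$n speedup=$(amdahl_speedup 0.5 "$n")"
done
```

Under that assumption, going from 4 to 8 threads only moves the speedup from 1.60 to 1.78, so most of the extra cores would sit idle most of the time.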

    If you want to use your cluster efficiently, a good idea would be to test different values for threading and see whether they give you a useful speedup. I expect you will get much better cluster utilization by running more separate processes with lower parallelization rather than a few with high parallelization.

    I would try running on your system with threads = 1, 2, 3, 4, 8 and seeing what the runtime is on the same BAM. Then you can choose how you shard things based on that result. I would expect reasonable speedup from 1 -> 2 -> 4, but you'll probably see pretty rapid diminishing returns.
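The timing test suggested above could be scripted roughly as follows. This is only a sketch: the function name bench_threads and the __THREADS__ placeholder are invented for this example, the timing has one-second resolution, and the command's own output is discarded.

```shell
#!/bin/bash
# Run one command at several thread counts and report wall-clock seconds.
# Usage: bench_threads "<counts>" <command> [args...]
# Every occurrence of the (hypothetical) token __THREADS__ in the command
# is replaced by the current thread count.
bench_threads() {
    local counts=$1
    shift
    local t start end status
    for t in $counts; do
        # Substitute the placeholder in each argument of the command.
        local cmd=("${@//__THREADS__/$t}")
        start=$(date +%s)
        status=0
        "${cmd[@]}" >/dev/null 2>&1 || status=$?
        end=$(date +%s)
        echo "threads=$t seconds=$((end - start)) exit=$status"
    done
}

# Example call (commented out; needs the jar, reference and BAM from the post):
# bench_threads "1 2 3 4 8" java -Xmx4g -jar gatk-package-4.0.1.2-local.jar \
#     HaplotypeCaller -R GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa \
#     --native-pair-hmm-threads __THREADS__ \
#     -I "$animal"_sorted_markduped_readgroup.bam -ERC GVCF \
#     -O "bench_t__THREADS__.g.vcf"
```

Each run should be submitted with a matching slot request so the timings reflect what the scheduler will actually grant.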

  • prasundutta87, Edinburgh (Member)

    Thanks, @LouisB. I will try this at my end.

  • kaboroevich, Tokyo (Member)
    edited March 30

    @prasundutta87

    Regarding your problem #2, I'm no expert, but for the grid engines I've used, the number of slots requested by qsub is only for scheduling. All jobs will have access to all CPUs on a node, and it's the responsibility of the submitter/job not to use more than requested.

    One way to ensure this is to use the Grid Engine environment variable $NSLOTS to set the number of threads, rather than hard-coding it. For example:

    java -Xmx4g -jar gatk-package-4.0.1.2-local.jar HaplotypeCaller -R GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa --native-pair-hmm-threads ${NSLOTS:-1} -I "$animal"_sorted_markduped_readgroup.bam -ERC GVCF -O "$animal".g.vcf

    Using ${NSLOTS:-1} rather than $NSLOTS will set it to a default of one in case the number of slots wasn't defined.
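A quick way to see the fallback behaviour described above, outside of any job, is the sketch below. The helper name pairhmm_threads is made up for this example; only the ${NSLOTS:-1} expansion comes from the post.

```shell
#!/bin/bash
# Print the thread count to pass to --native-pair-hmm-threads:
# the scheduler's NSLOTS if it is set and non-empty, otherwise 1.
pairhmm_threads() {
    echo "${NSLOTS:-1}"
}

# Outside Grid Engine NSLOTS is normally unset, so this prints 1;
# inside a job submitted with "-pe sharedmem 8" it would print 8.
echo "--native-pair-hmm-threads $(pairhmm_threads)"
```

The ":-" form also covers the case where NSLOTS is set but empty, which a plain "-" default would not.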

    I also noticed you're requesting only 2G of memory from the grid (-l h_vmem=2G) but allowing java to access 4G (-Xmx4g). You may want to change that as well.

  • prasundutta87, Edinburgh (Member)

    @kaboroevich

    Thanks a lot for the tip. It is very helpful.

    For the second part, I am requesting 2G of memory per core but am also requesting 8 cores, so my total memory becomes 8 * 2 = 16G.
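The arithmetic in this last exchange can be checked with a short sketch. It assumes, as described above for this cluster, that h_vmem is counted per slot; whether that holds depends on the site configuration. The function names total_vmem_gb and heap_fits are invented for this example.

```shell
#!/bin/bash
# Total memory granted to a job when h_vmem is counted per slot.
# Arguments: per-slot limit in GB, number of slots.
total_vmem_gb() {
    echo $(( $1 * $2 ))
}

# Succeeds if a Java heap of the given size is strictly below the total
# grant, leaving some room for the JVM's non-heap memory.
# Arguments: heap in GB, per-slot limit in GB, number of slots.
heap_fits() {
    [ "$1" -lt "$(total_vmem_gb "$2" "$3")" ]
}

# Values from the thread: -Xmx4g, h_vmem=2G, and 8 slots (or 1 slot).
for slots in 8 1; do
    if heap_fits 4 2 "$slots"; then
        echo "slots=$slots total=$(total_vmem_gb 2 "$slots")G: 4G heap fits"
    else
        echo "slots=$slots total=$(total_vmem_gb 2 "$slots")G: 4G heap does not fit"
    fi
done
```

With 8 slots the 4G heap fits inside the 16G grant, but the same -Xmx4g with a single 2G slot would exceed the limit, which is the situation the earlier warning was about.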
