IntelDeflater problem on Kernel Update

Dear members of the GATK Team,

We ran into a curious problem after updating the kernel of our machines from 4.4 to 4.12.
Our pool of computing nodes has Xeon E5-2697 v4 and Xeon Gold 6148 CPUs.

On the machines equipped with the latter CPU (Xeon Gold 6148), GATK HaplotypeCaller and other tools that use IntelDeflater stopped working. They hang just before printing these messages:

INFO 15:41:08,160 GenomeAnalysisEngine - Deflater: IntelDeflater
INFO 15:41:08,161 GenomeAnalysisEngine - Inflater: IntelInflater

If I set use-jdk-inflater (and deflater), the tools run to completion, but terribly slowly. On the machines with the E5-2697 (the older CPU) everything works fine.
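
For readers who want to reproduce the workaround, a minimal sketch follows. It assumes the GATK 3.8 engine flags `-jdk_deflater` and `-jdk_inflater`; the helper name `build_gatk_cmd` is hypothetical, and the command is only printed, not run.

```shell
#!/usr/bin/env bash
# Sketch: build a GATK 3.8 command line that falls back to the JDK
# compression classes instead of IntelDeflater/IntelInflater.
# build_gatk_cmd is a hypothetical helper; adjust paths to your setup.
build_gatk_cmd() {
    local jar="$1"; shift
    # -jdk_deflater / -jdk_inflater are assumed to be the GATK 3.8 engine flags
    printf '%s ' java -jar "$jar" "$@" -jdk_deflater -jdk_inflater
    printf '\n'
}

# Print the command instead of running it, so it can be inspected first.
build_gatk_cmd /opt/software/GenomeAnalysisTK-3.8/GenomeAnalysisTK.jar -T HaplotypeCaller
```

Printing the command first makes it easy to confirm the flags before launching a long job.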

This problem occurs with GATK 3.8 but does not appear in GATK4. Is there a way to fix it other than updating? For now we want to stay consistent with the data we have already analysed.

Thanks for the support!

Riccardo

Answers

  • SkyWarrior (Turkey), Member ✭✭✭

    Kernel 4.12 does not seem to be a LTS kernel. Can't you update them to 4.14 or 4.19 LTS kernel?

  • bhanuGandham, Member, Administrator, Broadie, Moderator

    Hi @berutti

    I'm guessing that GATK can't find the gkl_compression library on which the IntelDeflater is based. Looking at the code, I suspect it's no longer in the LD_LIBRARY_PATH for some reason.

    Can you please provide us with a full log file and the invocation line?

    I suggest running with -pairHmm VECTOR_LOGLESS_CACHING and seeing if that works.
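
One way to check both guesses is sketched below. The helper names are hypothetical, and the only assumption about the jar is that the bundled GKL entries contain "gkl" in their names; `unzip` must be installed.

```shell
#!/usr/bin/env bash
# Sketch: list GKL-related entries bundled in a jar.
# list_gkl_entries is a hypothetical helper; it only assumes the entry
# names contain "gkl" (case-insensitive) and that unzip is installed.
list_gkl_entries() {
    local jar="$1"
    [ -r "$jar" ] || { echo "cannot read $jar" >&2; return 1; }
    unzip -l "$jar" | grep -i 'gkl'
}

# Show the library search path, one directory per line (may be empty).
show_ld_library_path() {
    printf '%s\n' "${LD_LIBRARY_PATH:-}" | tr ':' '\n'
}
```

For example, `list_gkl_entries /opt/software/GenomeAnalysisTK-3.8/GenomeAnalysisTK.jar` followed by `show_ld_library_path` shows whether the native library ships inside the jar and which directories the loader would search.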

  • SkyWarrior (Turkey), Member ✭✭✭
    edited January 17

    Also the old versions of the GKL code had a memory leak issue. Can you try GATK version 3.8-1 to see if the issue persists?

    Also can you try this parameter for the java VM when using GATK 3.8 or 3.8-1

    -Djava.io.tmpdir=/path/to/any/dir/where/you/have/permission/to/read/write
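
Before passing a directory as `-Djava.io.tmpdir`, it can be sanity-checked with something like the following sketch. The helper name `check_tmpdir` is hypothetical, and it only tests what the suggestion above asks for: that the directory exists and is readable and writable.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a directory before using it as java.io.tmpdir.
# check_tmpdir is a hypothetical helper name.
check_tmpdir() {
    local dir="$1"
    [ -d "$dir" ] || { echo "not a directory: $dir"; return 1; }
    { [ -r "$dir" ] && [ -w "$dir" ]; } || { echo "not readable/writable: $dir"; return 1; }
    echo "ok: $dir"
}
```

If the check passes, the directory can be supplied as `java -Djava.io.tmpdir=<dir> -jar GenomeAnalysisTK.jar ...`.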
    
  • berutti, Member

    First of all, thanks for the suggestions and hints!
    So far I have had no luck with them.

    @SkyWarrior said:
    > Kernel 4.12 does not seem to be a LTS kernel. Can't you update them to 4.14 or 4.19 LTS kernel?

    No, our machines run SUSE Linux Enterprise Server 15, and 4.12.x is the latest and only kernel available for SLES 15. Nevertheless, the problem is related to the CPU + kernel combination.

    @SkyWarrior said:
    > Also the old versions of the GKL code had a memory leak issue. Can you try GATK version 3.8-1 to see if the issue persists?
    >
    > Also can you try this parameter for the java VM when using GATK 3.8 or 3.8-1
    >
    > -Djava.io.tmpdir=/path/to/any/dir/where/you/have/permission/to/read/write

    It has no effect; this was already set correctly.

    @bhanuGandham said:
    > Hi @berutti
    >
    > I'm guessing that GATK can't find the gkl_compression library on which the IntelDeflater is based. Looking at the code I'm guessing it's not in the LD_LIBRARY_PATH anymore for some reason.
    >
    > Can you please provide us with a full log file and the invocation line?
    >
    > I suggest running with -pairHmm VECTOR_LOGLESS_CACHING and seeing if that works.

    We never installed it separately, since the library is packed into the GATK jar. We did try setting up the GKL libraries, and even repacking a freshly compiled library into the GATK jar, but it didn't work. The -pairHMM option is not available, and in any case the problem also appears with GenotypeGVCFs.
    Our invocation line works on machines with a different kernel, or with an older CPU and the latest kernel:

    /opt/jre1.8.0/bin/java -XX:ParallelGCThreads=1 -Xmx4g -jar /opt/software/GenomeAnalysisTK-3.8/GenomeAnalysisTK.jar -T HaplotypeCaller -R /data/genomes/human/hg19_decoy/fasta/hg19_decoy.fa -I /data/service/testdata/merged.rmdup.bam -nct 1 -A DepthPerAlleleBySample -A InbreedingCoeff -A HaplotypeScore --logging_level INFO -U ALLOW_N_CIGAR_READS -L /data/kits/agilent/exome60MbV6/gatk_targets.bed --emitRefConfidence BP_RESOLUTION --variant_index_type LINEAR --variant_index_parameter 128000 -o /data/service/tests/test_20190115/testvcf.vcf.gz >> /data/service/tests/test_20190115/testvcf.log

    And here is a freshly produced log:
    INFO 11:38:10,548 HelpFormatter - ----------------------------------------------------------------------------------
    INFO 11:38:10,550 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.8-0-ge9d806836, Compiled 2017/07/28 21:26:50
    INFO 11:38:10,550 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
    INFO 11:38:10,550 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
    INFO 11:38:10,550 HelpFormatter - [Fri Jan 18 11:38:10 CET 2019] Executing on Linux 4.12.14-25.25-default amd64
    INFO 11:38:10,550 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_65-b17
    INFO 11:38:10,554 HelpFormatter - Program Args: -T HaplotypeCaller -R /data/genomes/human/hg19_decoy/fasta/hg19_decoy.fa -I /data/service/testdata/merged.rmdup.bam -nct 1 -A DepthPerAlleleBySample -A InbreedingCoeff -A HaplotypeScore --logging_level INFO -U ALLOW_N_CIGAR_READS -L /data/kits/agilent/exome60MbV6/gatk_targets.bed --emitRefConfidence BP_RESOLUTION --variant_index_type LINEAR --variant_index_parameter 128000 -o /data/service/tests/test_20190115/testvcf.vcf.gz
    INFO 11:38:10,557 HelpFormatter - Executing as [email protected] on Linux 4.12.14-25.25-default amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_65-b17.
    INFO 11:38:10,557 HelpFormatter - Date/Time: 2019/01/18 11:38:10
    INFO 11:38:10,557 HelpFormatter - ----------------------------------------------------------------------------------
    INFO 11:38:10,557 HelpFormatter - ----------------------------------------------------------------------------------
    WARN 11:38:10,561 GATKVCFUtils - Naming your output file using the .g.vcf extension will automatically set the appropriate values for --variant_index_type and --variant_index_parameter
    WARN 11:38:10,561 GATKVCFUtils - Creating Tabix index for /data/service/tests/test_20190115/testvcf.vcf.gz , ignoring user-specified index type and parameter
    ERROR StatusLogger Unable to create class org.apache.logging.log4j.core.impl.Log4jContextFactory specified in jar:file:/opt/software/GenomeAnalysisTK-3.8/GenomeAnalysisTK.jar!/META-INF/log4j-provider.properties
    ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...

  • bhanuGandham, Member, Administrator, Broadie, Moderator

    Hi @berutti

    Can you please use jstack to get a stack trace (thread dump) while it's hanging and post it?

    Also we have created a github issue for this in the GKL repo. You can follow this issue here: https://github.com/Intel-HLS/GKL/issues/98
    Posting in this thread might help.
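
A minimal sketch of capturing that thread dump is shown below. The helper name `dump_gatk_threads` and the default output file name are hypothetical; `jps` and `jstack` ship with the JDK rather than the JRE, and should be run as the user that owns the JVM.

```shell
#!/usr/bin/env bash
# Sketch: dump the threads of a hanging GATK JVM to a file.
# dump_gatk_threads is a hypothetical helper; jps and jstack come with the JDK.
dump_gatk_threads() {
    local out="${1:-gatk_jstack.txt}"
    local pid
    # jps -l prints "<pid> <main class or jar path>" for each running JVM
    pid="$(jps -l | awk '/GenomeAnalysisTK/ {print $1; exit}')"
    [ -n "$pid" ] || { echo "no running GATK JVM found" >&2; return 1; }
    jstack "$pid" > "$out"
    echo "thread dump of PID $pid written to $out"
}
```

Run `dump_gatk_threads` in a second terminal while the job hangs. If the JVM does not respond, JDK 8's `jstack -F <pid>` can force the dump.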
