libVectorLoglessPairHMM is not present in GATK 3.8 - HaplotypeCaller is slower than 3.4-46!

mnw21cam · Exeter University · Member
edited November 2017 in Ask the GATK team

We are running GATK on a multi-core Intel Xeon that does not have AVX. We have just upgraded from running 3.4-46 to running 3.8, and HaplotypeCaller runs much more slowly. I noticed that our logs used to say:

Using SSE4.1 accelerated implementation of PairHMM
INFO 06:18:09,932 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
INFO 06:18:09,933 VectorLoglessPairHMM - Using vectorized implementation of PairHMM

But now they say:

WARN 07:10:21,304 PairHMMLikelihoodCalculationEngine$1 - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
WARN 07:10:21,310 PairHMMLikelihoodCalculationEngine$1 - AVX-accelerated native PairHMM implementation is not supported. Falling back to slower LOGLESS_CACHING implementation

I'm guessing the newfangled Intel GKL isn't working so well for us. Note that I had a very similar problem with GATK 3.4-0, in http://gatk.vanillaforums.com/entry/passwordreset/21436/OrxbD0I4oRDaj8y1hDSE and this was resolved in GATK 3.4-46.

Issue · Github, by Sheila: issue number 2709, state closed, closed by vdauwera.

Answers

  • mnw21cam · Exeter University · Member

    Sorry, posted the wrong link. Should be https://gatkforums.broadinstitute.org/gatk/discussion/5611/gatk-3-4-seems-much-slower-than-gatk-3-3

    (At least the link invalidates after use!)

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @mnw21cam
    Hi,

    I will check with the team and get back to you.

    -Sheila

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    Hi @mnw21cam, we've had a few hiccups with the latest cuts of the GKL; a few bugs have been found & fixed so it's worth grabbing the latest nightly to see if that goes through alright. There's one more that was caught but the PR is still in review -- though that one causes a segfault so I'm sure you'd have noticed if you hit that ;)

  • mnw21cam · Exeter University · Member

    Latest nightly still says:

    INFO  13:28:06,650 NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/usr/share/gatk/GenomeAnalysisTK-3.8-nightly-2017-12-05-1/GenomeAnalysisTK.jar!/com/intel/gkl/native/libgkl_utils.so 
    WARN  13:28:06,667 PairHMMLikelihoodCalculationEngine$1 - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported 
    WARN  13:28:06,668 PairHMMLikelihoodCalculationEngine$1 - AVX-accelerated native PairHMM implementation is not supported. Falling back to slower LOGLESS_CACHING implementation 
    

    Matthew

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    Hi Matthew, then it sounds like we'll need you to retest with the new patch when it's ready. Sorry about all that trouble. Out of curiosity, have you tested whether the latest GATK4 beta exhibits the same behavior?

  • mnw21cam · Exeter University · Member

    With GATK 4.beta.5, I get the following in the logs:

    16:00:12.749 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/usr/share/gatk/GenomeAnalysisTK-4.beta.5/gatk-4.beta.5/gatk-package-4.beta.5-local.jar!/com/intel/gkl/native/libgkl_utils.so
    16:00:12.751 INFO PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
    16:00:12.752 WARN PairHMM - ***WARNING: Machine does not have the AVX instruction set support needed for the accelerated AVX PairHmm. Falling back to the MUCH slower LOGLESS_CACHING implementation!
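
The log excerpts in this thread are enough to tell, from a log alone, which PairHMM path a run took. The sketch below is a minimal, hypothetical helper (the name `pairhmm_mode` and its return labels are not part of GATK). It matches only on message fragments copied from the logs quoted in this thread, so it assumes other GATK versions word these messages the same way, which may not hold.

```python
def pairhmm_mode(log_text):
    """Classify a GATK log by the PairHMM messages quoted in this thread.

    Returns "java-fallback" if a fallback to LOGLESS_CACHING is reported,
    "native-vectorized" if the vectorized implementation is reported,
    and "unknown" otherwise. The matched substrings are assumptions taken
    from the 3.4-46, 3.8 and 4.beta.5 logs posted above.
    """
    mode = "unknown"
    for line in log_text.splitlines():
        # 3.8 and 4.x both mention "Falling back to ... LOGLESS_CACHING".
        if "Falling back to" in line and "LOGLESS_CACHING" in line:
            return "java-fallback"
        # 3.4-46 reports the native (SSE4.1) path with this message.
        if "Using vectorized implementation of PairHMM" in line:
            mode = "native-vectorized"
    return mode


if __name__ == "__main__":
    sample = (
        "16:00:12.752 WARN PairHMM - ***WARNING: Machine does not have the AVX "
        "instruction set support needed for the accelerated AVX PairHmm. "
        "Falling back to the MUCH slower LOGLESS_CACHING implementation!"
    )
    print(pairhmm_mode(sample))
```

Running this over saved HaplotypeCaller logs is a quick way to confirm which jobs lost native acceleration after an upgrade.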

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @mnw21cam
    Hi Matthew,

    Ah, I see you got the beta 5 from the Downloads. But, there is a beta 6 here. Sorry for the hassle. I hope your issue is resolved in the beta 6 :smile:

    -Sheila

  • eykim909 · chicago · Member

    I am running HaplotypeCaller with GATK v4.0.0.0. I get the same warning in the logs. Any solutions?
    Thanks!
    E

    14:11:38.500 INFO PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
    14:11:38.500 WARN PairHMM - ***WARNING: Machine does not have the AVX instruction set support needed for the accelerated AVX PairHmm. Falling back to the MUCH slower LOGLESS_CACHING implementation!

  • SkyWarrior · Turkey · Member

    Can you try GATK3.7? I believe that version still supports SSE4.1 accelerated pairHMM for older CPUs.

    It looks like SSE4.1 acceleration support was dropped in later versions of the GKL library, which is why you cannot get any acceleration on your system.

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @eykim909
    Hi,

    I am checking with the team, and someone will get back to you soon.

    -Sheila

  • Rebs · Member

    Hi,

    I am having the same problem using GATK v4.0.1.0.

    10:47:08.509 INFO PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported
    10:47:08.509 WARN PairHMM - ***WARNING: Machine does not have the AVX instruction set support needed for the accelerated AVX PairHmm. Falling back to the MUCH slower LOGLESS_CACHING implementation!

    Are there any updates yet?
    Thanks!

  • SkyWarrior · Turkey · Member

    What are your machine specs?

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @eykim909 @Rebs
    Hi,

    Sorry for the delay. I pinged the team again.

    -Sheila

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @Rebs @eykim909
    Hi,

    It turns out your CPU doesn't support AVX, and so you cannot run the accelerated PairHMM (which is why the GATK falls back to using the Java version).

    -Sheila
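
On Linux, whether a CPU advertises AVX or only SSE4.x can be read from the `flags` line of `/proc/cpuinfo`. The following is a minimal sketch under that assumption. The function names `cpu_flags` and `pairhmm_support` are hypothetical and not part of GATK, and the check only reports what the kernel lists; it does not predict what any particular GATK build will do.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used; on typical x86 Linux systems
    every core lists the same flags.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pairhmm_support(flags):
    """Report which instruction sets discussed in this thread are present."""
    # Exact token match, so "avx2" or "avx512f" never count as "avx".
    return {name: name in flags for name in ("avx", "sse4_1", "sse4_2")}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        # Not Linux, or /proc is unavailable: report nothing detected.
        text = ""
    print(pairhmm_support(cpu_flags(text)))
```

If `avx` comes back false, the fallback warning quoted in this thread is the expected behaviour on that machine rather than a misconfiguration.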

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    FYI the 3.8-1-0-gf15c1c3ef patch release is now available in the archive downloads; it solves the issues that have been observed so far with GKL.

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @Rebs @eykim909
    Hi again,

    To add to Geraldine's comment, I have this also from the developers.

    "GATK 3 had a mode that supported SSE acceleration which isn't as good as AVX. We didn't port that because the vast majority of computers have AVX now. You might have a computer that relied on that mode, which is no longer supported."

    -Sheila

  • Rebs · Member

    Hi @Sheila and @Geraldine_VdAuwera

    Thank you for your reply.
    I am not an expert in the field, so I am still learning and trying to understand. Does the fact that it uses the Java version instead of AVX pose a problem?

    Thanks!

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    @Rebs, not scientifically, no -- the results may not be identical but they will be functionally equivalent. However, the Java version will be slower.

  • @Sheila said:
    @Rebs @eykim909
    Hi again,

    To add to Geraldine's comment, I have this also from the developers.

    "GATK 3 had a mode that supported SSE acceleration which isn't as good as AVX. We didn't port that because the vast majority of computers have AVX now. You might have a computer that relied on that mode, which is no longer supported."

    -Sheila

    Damn, we are using a group of Intel Xeon E7-4860 machines here. Not everyone has lots of money, you know.
    Thanks anyway!

  • Hi everybody,
    I support Andres Ribone's comment. Many fairly old clusters still have glorious infrastructure. Switching to an "ALL OR NOTHING" AVX version is a choice that penalizes many of us.

    SSE4.1 and SSE4.2 are still available and good enough to run the majority of the available pipelines and tools with pretty good performance, so that choice is not fully understandable to me.

    I always think it is very good to implement new features and improve performance with algorithm optimizations that rely on new CPU extensions, but is there a good reason to abandon still-good older CPU extensions like SSE4.1 or SSE4.2?

    I'd also like to have an option to build a version that is SSE4 compatible, so that I can run it as fast as my hardware allows (with some manual tasks, like building a custom version, disabling AVX and enabling SSE4 extensions).

    Thank you for your attention.

    Best.
    &

    Issue · Github, by Sheila: issue number 3132, state open.

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    @andreazauli @AndresRibone These choices are made based mainly on what development resources we have available (which are not infinite) and the requests that are made by the community (which seem pretty darn infinite some days) so unfortunately we can't make everybody happy. However, the system we use to swap out these extensions is pluggable so you're absolutely free to develop your own (including porting the originals) and plug them in at runtime. For example, the IBM POWER8 team provides their users with native code libraries that are accelerated for their chips. We don't distribute them as part of the official GATK because we don't have the infrastructure to test them and so it would be impossible for us to say for any new version whether that code still works or not. But like I said, anyone could provide that for the community. Happy to provide pointers to how to make it work.

  • Dear Geraldine,
    thank you for the quick reply. I understand your point, and I know resources are not infinite. If you do not plan to port the old libraries to the new GATK4, I think the only way is to port them on our own.

    I'd like (if you have time) to have some pointers:

    • where the relevant code lives, for both GATK4 and GATK3,
    • which is the latest version of GATK3 (3.7?) that still has the SSE4 extensions,
    • and some hints on where to look, or some documentation if available.

    If you do not have time to give us such hints, then at least the minimum you think would be useful to get started.

    Many thanks.
    &

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @andreazauli
    Hi,

    Sorry for the delay. I will ask Geraldine to get back to you.

    -Sheila
