Arm64 specific support for lj_new_str random hash optimization#21
Arm64 specific support for lj_new_str random hash optimization#21debayang wants to merge 1 commit intoopenresty:v2.1-agentzhfrom
Conversation
| lj_str_indep_hash(GCstr *str) { | ||
| return lj_str_original_hash(strdata(str), str->len); | ||
| } | ||
|
|
There was a problem hiding this comment.
style: needs 2 blank lines here like before.
| CCOPT_x64= | ||
| CCOPT_arm= | ||
| CCOPT_arm64= | ||
| CCOPT_arm64= -march=armv8-a+crc |
There was a problem hiding this comment.
I wonder if this breaks compatibility with other arm64 variants?
There was a problem hiding this comment.
The "+crc" option is needed to inform the compiler that crc instruction valid for this arch.
The proper fix might be to check if crc support is available in hardware at runtime .
Modified the code to take care of above.
| v = (v << 8) | len; | ||
| return __crc32cw(h, v); | ||
| } | ||
|
|
There was a problem hiding this comment.
IIRC, on X86-64, if the string being hashed having length under 4, original hash function seem to be working better.
Other than that, LGTM.
There was a problem hiding this comment.
As per my tests on ARM64 platform the micro benchmark did not show any noticeable differences for below 4 bytes.
|
@siddhesh Will you mind having a quick look at this one? ;) Many thanks! |
siddhesh
left a comment
There was a problem hiding this comment.
There isn't a lot of difference in algorithms between the x86_64 and arm64 versions, so it is probably a much better idea to move lj_str_hash_x64.h as lj_str_hash.h and then just add #defines to define the various crc32 macros and the additional platform feature check for arm64.
That will make sure that future maintenance of these functions is easy.
| LJ_STR_HASH = lj_str_hash; | ||
| } | ||
| else { | ||
| LJ_STR_HASH = lj_str_original_hash; |
There was a problem hiding this comment.
Suggest: Initialize LJ_STR_HASH to lj_str_hash and only if HWCAP_CRC32 is not set, reset it to lj_str_original_hash over here.
|
@siddhesh Thanks for looking into this! The original author of this PR is no longer responding it seems. Will you please create a separate PR so that we can merge the final version quickly? Many thanks! |
This ports the random hash optimization done for lj_new_str() for x64 ( in 7923c63 ) to arm64.
Verified on Arm64 platform with the existing benchmark developed as part of the x64 commit above - shows similar improvements as seen for x64 with this benchmark.
However not sure about the effect on any real time workloads.