Tracking issue for a specific perf gap found while comparing sjsonnet (native, master) against jrsonnet (master). Parent comparison: #666.
Observation
std.stripChars / std.lstripChars / std.rstripChars are collectively about 1.58× slower than jrsonnet by mean wall time (6.0 ms vs 3.8 ms).
Scenario: bench/resources/cpp_suite/bench.09.jsonnet — strips a long ASCII string (~1 KB of 'e' + 'ok') with all three strip variants.
| | mean | min |
|---|---|---|
| sjsonnet (native) | 6.0 ± 0.6 ms | 5.4 ms |
| jrsonnet | 3.8 ± 0.7 ms | 2.8 ms |
Repro:
hyperfine --warmup 2 --runs 10 -N \
"sjsonnet bench/resources/cpp_suite/bench.09.jsonnet" \
"jrsonnet bench/resources/cpp_suite/bench.09.jsonnet"
Code
sjsonnet/src/sjsonnet/stdlib/StringModule.scala:270-420 — strip implementations. Scans char-by-char, checking each against the strip set (typically a String of chars to strip).
Hypothesis
- The strip-set membership check is O(|strip_set|) per char on long inputs, unless already optimized.
- Even when optimized, char-by-char iteration on a 1 KB input = ~1000 iterations × 3 variants.
- jrsonnet works on UTF-8 &[u8] bytes with a bitmap-style ASCII check.
Directions
- For ASCII-only strip sets (which cover the vast majority of real usage), build a 256-entry Array[Boolean] mask once and index it by byte. Should collapse the inner loop to a single load + compare.
- If the input itself is asciiSafe (tracked on Val.Str), work on the byte array directly without String.charAt virtual dispatch.
- Combine lstrip + rstrip in stripChars into a single pass from both ends instead of two full scans.
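A minimal sketch of the first and third directions, to make the intended shape concrete. It is not the current StringModule code: `StripSketch`, `asciiMask`, and `strip` are hypothetical names, it operates on `String` chars rather than the underlying byte array, and the asciiSafe flag on Val.Str is not modeled. The non-ASCII fallback is deliberately naive (char-based, ignores surrogate pairs) and only marks where the existing slow path would go.

```scala
object StripSketch {

  /** Hypothetical helper: builds a 256-entry membership mask for the strip set,
    * or returns None if the strip set contains any non-ASCII char.
    */
  def asciiMask(chars: String): Option[Array[Boolean]] = {
    val mask = new Array[Boolean](256)
    var ascii = true
    var i = 0
    while (ascii && i < chars.length) {
      val c = chars.charAt(i)
      if (c < 128) mask(c) = true else ascii = false
      i += 1
    }
    if (ascii) Some(mask) else None
  }

  /** Hypothetical entry point covering all three variants:
    * stripChars = (left, right), lstripChars = left only, rstripChars = right only.
    */
  def strip(s: String, chars: String, left: Boolean, right: Boolean): String = {
    var start = 0
    var end = s.length
    asciiMask(chars) match {
      case Some(mask) =>
        // Fast path: one bounds check plus one array load per input char.
        if (left)
          while (start < end && s.charAt(start) < 256 && mask(s.charAt(start))) start += 1
        if (right)
          while (end > start && s.charAt(end - 1) < 256 && mask(s.charAt(end - 1))) end -= 1
      case None =>
        // Fallback: linear membership check, O(|strip_set|) per char.
        // Char-based only; a real implementation must respect surrogate pairs.
        if (left)
          while (start < end && chars.indexOf(s.charAt(start).toInt) >= 0) start += 1
        if (right)
          while (end > start && chars.indexOf(s.charAt(end - 1).toInt) >= 0) end -= 1
    }
    // Both ends are trimmed before a single substring call, so stripChars
    // allocates one result instead of an intermediate string per side.
    s.substring(start, end)
  }
}
```

Under these assumptions, `StripSketch.strip("eeeokeee", "e", left = true, right = true)` trims both ends with two index walks and a single `substring`. Whether the mask pays off for very short strip sets (where building it may cost more than the scan) would need to be measured against bench.09.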
Part of the jrsonnet-parity effort tracked in #666.