perf: std.substr is 2.11× slower than jrsonnet #850

@He-Pin

Description

Tracking issue for a specific perf gap found while comparing sjsonnet (native, master) against jrsonnet (master). Parent comparison: #666.

Observation

std.substr is 1.78× slower than jrsonnet by mean time (4.8 ms vs 2.7 ms) and 2.11× slower by min time (4.0 ms vs 1.9 ms) on a tight loop.

Scenario: bench/resources/go_suite/substr.jsonnet — calls std.substr 101 times on a ~4 KB string.

                     mean ± σ (ms)   min (ms)   range (ms)
sjsonnet (native)    4.8 ± 0.5       4.0        4.0–5.7
jrsonnet             2.7 ± 0.6       1.9        1.9–4.0

Repro:

hyperfine --warmup 2 --runs 10 -N \
  "sjsonnet bench/resources/go_suite/substr.jsonnet" \
  "jrsonnet bench/resources/go_suite/substr.jsonnet"

Code

sjsonnet/src/sjsonnet/stdlib/StringModule.scala:130-180: the Substr builtin. For each call it allocates a new java.lang.String via str.substring(...) and wraps it in Val.Str.

Hypothesis

Two sources of overhead per call:

  1. String.substring copies the selected range on modern JVMs (the shared-char[] optimization was removed in Java 7u6). For 101 substrings of a ~4 KB string, that's 101 allocations of up to 4 KB each, roughly 404 KB at most per run.
  2. Val.Str wrapping, plus the codepoint-length path being taken on the resulting string even when it could already be known to be ASCII-safe (the original is).

jrsonnet's strings are UTF-8 &str slices into the original buffer — no copy for a substring.
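To make the second overhead concrete, here is a minimal Java sketch of the two paths. It is not the sjsonnet implementation: the class and method names (SubstrSketch, substrCodePoints, substrAscii) are hypothetical, it assumes std.substr indexes by code point with out-of-range values clamped to the end of the string, and it assumes from and len are non-negative. Only standard java.lang.String methods are used.

```java
final class SubstrSketch {
    // General path: from/len are counted in code points, so they have to be
    // translated to UTF-16 offsets. Both codePointCount and offsetByCodePoints
    // walk the string, which is the extra scan an ASCII-safe input can skip.
    static String substrCodePoints(String s, int from, int len) {
        int total = s.codePointCount(0, s.length());
        int start = Math.min(from, total);
        int count = Math.min(len, total - start);
        int begin = s.offsetByCodePoints(0, start);
        int end = s.offsetByCodePoints(begin, count);
        return s.substring(begin, end);
    }

    // Fast path (assumption: the caller already knows s is ASCII-only):
    // one char per code point, so the indices can be used directly.
    static String substrAscii(String s, int from, int len) {
        int start = Math.min(from, s.length());
        int end = start + Math.min(len, s.length() - start);
        return s.substring(start, end);
    }
}
```

Both paths still end in String.substring, so this only addresses the re-scan; the copy from point 1 remains.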

Directions

  • For ASCII-safe inputs, skip the codepoint re-scan and use Val.Str.asciiSafe.
  • Explore a lightweight "string slice" value (offset + length + base String) for hot substring workloads, materialized to a real String only at render/materialize time. This is a bigger change — open question whether the allocation win justifies the complexity against the rest of the stdlib that assumes String.
  • Cache Val.Str(empty) for len == 0.
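The second and third directions can be sketched together in Java. This is an illustration under stated assumptions, not a proposed patch: StrSlice and its methods are hypothetical names, indices are in UTF-16 units (so it assumes an ASCII-safe base string, where chars and code points coincide), and from and len are assumed non-negative.

```java
final class StrSlice implements CharSequence {
    // Shared instance for every zero-length result (the "cache empty" idea).
    static final StrSlice EMPTY = new StrSlice("", 0, 0);

    private final String base;
    private final int offset;
    private final int length;
    private String materialized; // filled on the first toString() call

    private StrSlice(String base, int offset, int length) {
        this.base = base;
        this.offset = offset;
        this.length = length;
    }

    static StrSlice of(String base) {
        return base.isEmpty() ? EMPTY : new StrSlice(base, 0, base.length());
    }

    // O(1): records a window over the base string, copies no characters.
    StrSlice substr(int from, int len) {
        int start = Math.min(from, length);
        int count = Math.min(len, length - start);
        if (count == 0) return EMPTY;
        return new StrSlice(base, offset + start, count);
    }

    @Override public int length() { return length; }

    @Override public char charAt(int i) {
        if (i < 0 || i >= length) throw new IndexOutOfBoundsException("index " + i);
        return base.charAt(offset + i);
    }

    @Override public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) throw new IndexOutOfBoundsException();
        return substr(start, end - start);
    }

    // The copy happens here, once, and only if a real String is needed.
    @Override public String toString() {
        String m = materialized;
        if (m == null) {
            m = base.substring(offset, offset + length);
            materialized = m;
        }
        return m;
    }
}
```

One trade-off to weigh: a slice keeps its whole base string reachable, so a small slice of a large string pins the large string in memory until the slice is dropped or materialized.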

Part of the jrsonnet-parity effort tracked in #666.
