perf: Large string template (% operator) is 2.90× slower than jrsonnet #847

@He-Pin

Tracking issue for a specific perf gap found while comparing sjsonnet (native, master) against jrsonnet (master). Parent comparison: #666. Biggest single gap in the comparison — worth prioritizing.

Observation

Large string template (% format operator on a multi-KB text block) is 2.90× slower than jrsonnet.

Scenario: bench/resources/cpp_suite/large_string_template.jsonnet — applies |||...||| % { x: 3 } on a ~7.8k-line text block of mostly ASCII.

| | mean | min |
|---|---|---|
| sjsonnet (native) | 11.3 ± 0.7 ms | 10.5 ms |
| jrsonnet | 3.9 ± 0.7 ms | 3.0 ms |

Repro:

hyperfine --warmup 2 --runs 10 -N \
  "sjsonnet bench/resources/cpp_suite/large_string_template.jsonnet" \
  "jrsonnet bench/resources/cpp_suite/large_string_template.jsonnet"

Code

Two hot paths:

  1. sjsonnet/src/sjsonnet/Format.scala: the % operator builds the formatted string char-by-char into a StringBuilder.
  2. sjsonnet/src/sjsonnet/BaseByteRenderer.scala:309-348: visitLongString renders the final string into JSON. It calls str.getBytes(UTF_8), runs the SWAR findFirstEscapeChar, then copies the chunks between escapes.

Since x occurs only once and the template is mostly literal text with sparse \n, the format engine's work is essentially one giant memcpy; jrsonnet manages this with roughly zero copies.
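
To make the "giant memcpy" point concrete, here is a minimal sketch contrasting a char-by-char formatter with one that copies whole literal runs. It is not the code in Format.scala: `BulkFormatSketch`, both method names, and the restriction to `%(name)s` specifiers are assumptions made for illustration. Whether the bulk variant is actually faster depends on how the runtime implements `append(CharSequence, int, int)` and would need measuring.

```scala
object BulkFormatSketch {
  /** Baseline: every literal character goes through append(Char). */
  def formatCharByChar(template: String, vars: Map[String, String]): String = {
    val sb = new java.lang.StringBuilder(template.length)
    var i = 0
    while (i < template.length) {
      val c = template.charAt(i)
      if (c == '%' && i + 1 < template.length && template.charAt(i + 1) == '(') {
        val close = template.indexOf(")s", i + 2)
        require(close >= 0, s"unterminated specifier at index $i")
        sb.append(vars(template.substring(i + 2, close)))
        i = close + 2
      } else {
        sb.append(c) // one call per literal character
        i += 1
      }
    }
    sb.toString
  }

  /** Bulk variant: jump to the next '%' with indexOf and copy the literal run in one call. */
  def formatBulk(template: String, vars: Map[String, String]): String = {
    val sb = new java.lang.StringBuilder(template.length)
    var i = 0
    while (i < template.length) {
      val next = template.indexOf('%', i)
      if (next < 0) {
        sb.append(template, i, template.length) // trailing literal run
        i = template.length
      } else {
        sb.append(template, i, next) // literal run before the '%'
        if (next + 1 < template.length && template.charAt(next + 1) == '(') {
          val close = template.indexOf(")s", next + 2)
          require(close >= 0, s"unterminated specifier at index $next")
          sb.append(vars(template.substring(next + 2, close)))
          i = close + 2
        } else {
          sb.append('%') // a lone '%' is kept literally in this sketch
          i = next + 1
        }
      }
    }
    sb.toString
  }
}
```

With a single specifier in the whole template, the bulk variant issues only a handful of append calls regardless of template size, while the baseline issues one per character.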

Hypothesis

  • Double conversion: a jsonnet string is a UTF-16 String. Format.scala builds into a StringBuilder (UTF-16). The JSON renderer then calls str.getBytes(UTF_8), a full UTF-8 encode pass. That is the conversion cost described in #779 (base64 encode/decode is ~6x slower than jrsonnet on large payloads), paid once on an ~N KB output.
  • Format engine scans every character even when there are no format specifiers in a long literal run.
  • Large string literal parse/alloc: the |||...||| block is a ~600 KB literal. Parser allocates it once, but if the format engine then concatenates the unchanged literal text into a new StringBuilder, that's an extra allocation.

Directions

  • Short-term: In Format.scala, detect long literal runs between format specifiers and use StringBuilder.append(String, start, end) (which avoids per-char virtual dispatch) or bulk arraycopy.
  • Medium-term: When Val.Str is asciiSafe (tracked via Val.Str.asciiSafe), skip the getBytes(UTF_8) in BaseByteRenderer.visitLongString and reuse the char-to-byte fast path already used by renderAsciiSafeString. This is the single biggest lever against the real-world kube-prometheus gap (which also emits large manifests of mostly-ASCII strings).
  • Longer-term: Consider a byte-backed Val.Str variant for pre-decoded strings read from disk or already known to be ASCII/UTF-8 bytes, which avoids the UTF-16 round-trip entirely. Overlaps with #779 (base64 encode/decode is ~6x slower than jrsonnet on large payloads).
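
The medium-term idea can be sketched as follows. This is an illustration only, not the code in BaseByteRenderer: `AsciiRenderSketch` and both method names are hypothetical, the caller is assumed to know already (for example from a flag like Val.Str.asciiSafe) that every char is below 0x80, and JSON escaping is left out entirely.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object AsciiRenderSketch {
  /** Generic path: encode into a temporary UTF-8 array, then copy it into `out`. */
  def writeViaGetBytes(s: String, out: Array[Byte], pos: Int): Int = {
    val bytes = s.getBytes(UTF_8) // intermediate allocation plus a full encode pass
    System.arraycopy(bytes, 0, out, pos, bytes.length)
    pos + bytes.length
  }

  /** Hypothetical fast path: the caller guarantees every char is < 0x80,
    * so each char narrows to exactly one byte written straight into `out`.
    */
  def writeAscii(s: String, out: Array[Byte], pos: Int): Int = {
    val n = s.length
    var i = 0
    while (i < n) {
      out(pos + i) = s.charAt(i).toByte
      i += 1
    }
    pos + n
  }
}
```

Both methods return the new write position and produce identical bytes for ASCII input. The fast path skips the temporary array, but whether that wins in practice depends on the runtime (some JVMs already optimize getBytes heavily for ASCII-only strings), so it should be benchmarked on the native build before being adopted.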

Part of the jrsonnet-parity effort tracked in #666.
