
GraalVM native-image built with clikt 5.x segfaults inside GC when run from a TTY (works on clikt 4.4.0) #640

Description

@Elizaveta239

Summary

A GraalVM native-image binary built against clikt 5.1.0 (which transitively
brings mordant 3.0.2) segfaults at runtime when launched from a real terminal.
The same source built against clikt 4.4.0 (mordant 2.5.0) runs fine.

GraalVM's segfault handler reports the crash inside the GC heap walker:

PC ... points into AOT compiled code
   com.oracle.svm.core.genscavenge.remset.AlignedChunkRememberedSet.walkObjects
siginfo: si_signo: 10, si_code: 1, si_addr: 0x0000000280000060 (heapBase + 96)

This is consistent with the GC encountering an object whose DynamicHub is
invalid, the same family of failure that some GraalVM versions report as
"Fatal error: Object with invalid hub type."

The CLI in our project uses no clikt features beyond CliktCommand,
subcommands, echo, and option, and the crash is not specific to any
particular command body: a run() that does nothing but allocate ByteArrays
is enough to trigger it.

Reproduction

Minimal project: https://github.com/Elizaveta239/clikt-graalvm-repro

git clone https://github.com/Elizaveta239/clikt-graalvm-repro
cd clikt-graalvm-repro
./gradlew nativeCompile -PcliktVersion=5.1.0

# crashes in any TTY (real terminal or pseudo-tty via `script`):
script -q /tmp/out.txt ./build/native/nativeCompile/repro allocate
# -> "[ [ SegfaultHandler caught a segfault ... ] ]"
#    "PC ... AlignedChunkRememberedSet.walkObjects"

# same binary with stdout piped (no TTY): works
./build/native/nativeCompile/repro allocate | cat
# -> "Allocated, sink size = 100"

# rebuild against clikt 4.4.0 (no crash, even in a TTY):
./gradlew clean nativeCompile -PcliktVersion=4.4.0
script -q /tmp/out.txt ./build/native/nativeCompile/repro allocate
# -> "Allocated, sink size = 100"

The full project is ~9 small files. The relevant source is:

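import com.github.ajalt.clikt.core.CliktCommand

class Allocate : CliktCommand(name = "allocate") {
    override fun run() {
        echo("Allocating...")
        var sink: List<ByteArray> = emptyList()
        // Allocate 2000 arrays of 64 KiB each; most become garbage immediately.
        repeat(2000) {
            sink = sink + ByteArray(64 * 1024)
            // Keep only the 100 most recent arrays reachable.
            if (sink.size > 100) sink = sink.takeLast(100)
        }
        echo("Allocated, sink size = ${sink.size}")
    }
}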
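
For scale, the loop above allocates 2000 arrays of 64 KiB each while keeping at most 100 of them reachable. A small sketch of that arithmetic follows; the helper names are illustrative and not part of the repro project, and the figures count only the ByteArray payload (object headers and the intermediate list copies are ignored).

```kotlin
// Illustrative helpers (hypothetical names, not part of the repro project).

/** Total bytes of ByteArray payload allocated over the whole loop. */
fun totalChurnBytes(iterations: Int, chunkBytes: Int): Long =
    iterations.toLong() * chunkBytes

/** Bytes of ByteArray payload still reachable after trimming to `kept` entries. */
fun retainedBytes(kept: Int, chunkBytes: Int): Long =
    kept.toLong() * chunkBytes

fun main() {
    val chunk = 64 * 1024
    val mib = 1024.0 * 1024.0
    println("churn:    ${totalChurnBytes(2000, chunk) / mib} MiB")  // 125.0 MiB
    println("retained: ${retainedBytes(100, chunk) / mib} MiB")     // 6.25 MiB
}
```

Nearly all of the allocated memory becomes garbage almost immediately, which is what makes a young-generation collection likely during run(). Exactly when a collection happens depends on the heap sizing of the native image, which is not pinned down here.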

A captured crash dump is in sample-crash-output.txt.

Expected behavior

The native-image binary should run allocate to completion in any environment,
as it does on clikt 4.4.0 and as it does on clikt 5.1.0 when stdout is piped.

Actual behavior

The native-image binary segfaults inside AlignedChunkRememberedSet.walkObjects
during young-gen GC, but only when stdout is a TTY.

Trigger conditions

All of the following must hold:

  • clikt 5.1.0 (mordant 3.0.2). 4.4.0 (mordant 2.5.0) does not crash.
  • Built as a GraalVM native image.
  • Binary launched from a TTY (real terminal or pseudo-tty via script).
    Piping stdout to a file or another process avoids the crash.
  • Allocations in run() sufficient to trigger a young-gen GC (roughly 125 MiB
    of churn here: 2000 × 64 KiB).

The TERM value is irrelevant: xterm-256color, iTerm.app, and dumb all
crash equally. The discriminator is isatty(stdout), i.e. whether mordant
takes its terminal-detection code path or the dumb fallback.
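
To confirm which side of that discriminator a given launch is on, a tiny standalone check can help. This is only a sketch: it uses the JDK's System.console() rather than mordant's own detection (which is not shown in this report), and describeConsole is a hypothetical helper name. On JDK 21, System.console() is expected to be non-null only when both stdin and stdout are attached to a terminal, so it is a stricter test than isatty(stdout) alone, and the behaviour may differ on other JDK versions or inside a native image.

```kotlin
import java.io.Console

// Hypothetical helper: turns the JDK's console handle into a readable verdict.
// Assumption: a non-null Console means the process is attached to a terminal.
fun describeConsole(console: Console?): String =
    if (console != null) "interactive console (TTY path likely)"
    else "no console (stdin/stdout piped or redirected)"

fun main() {
    // Run once under `script` and once with output piped to compare the two cases.
    println(describeConsole(System.console()))
}
```

Running it the same two ways as the repro (under script, and with stdout piped) should show whether the environment really differs in the way the trigger conditions assume.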

Environment

  • macOS 13.5 aarch64
  • GraalVM CE 21.0.9 (build 21.0.9+7-LTS-jvmci-23.1-b79)
  • Kotlin 2.3.20, Gradle 9.2.1
  • graalvm-buildtools-plugin 0.10.6
  • clikt 5.1.0 → mordant 3.0.2 → colormath 3.6.0 (crashes)
  • clikt 4.4.0 → mordant 2.5.0 → colormath 3.5.0 (works)

Hypothesis

Build-time-initialised mordant Terminal/Mpp state is capturing an object
whose DynamicHub is later invalidated. The crash surfaces only when the
TTY code path runs (mordant decides the output is a real terminal), which
is consistent with mordant 3.x's expanded native-image substitutions for
terminal detection. Since clikt 4.4.0 / mordant 2.5.0 with the same source
and the same build flags does not crash, the regression is somewhere
between mordant 2.5.0 and 3.0.2.

I added the recommended --initialize-at-build-time=com.github.ajalt.mordant.internal.MppImplKt
hint via META-INF/native-image/com.github.ajalt.mordant/mordant/native-image.properties
during diagnosis; it did not change the outcome and was removed from the
final repro to minimize variables.
