Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 68 additions & 10 deletions Source/WTF/wtf/posix/ThreadingPOSIX.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,42 @@

static std::atomic<Thread*> targetThread { nullptr };

#if HAVE(MACHINE_CONTEXT)
// Return the stack pointer the thread was running at when the signal arrived,
// read straight out of the kernel-saved register state. Unlike
// currentStackPointer() this is stable regardless of whether the handler
// itself runs on the normal stack or the alternate signal stack.
//
// The per-arch field selection below mirrors JSC::MachineContext::stackPointerImpl(mcontext_t&)
// in Source/JavaScriptCore/runtime/MachineContext.h. WTF sits below JSC and can't include that
// header, so the accessors are duplicated here. Keep the two in sync: MachineContext.h fails to
// compile (#error Unknown Architecture) on an unlisted arch, but this copy falls through to the
// nullptr branch, which the caller treats as "use currentStackPointer()" and silently reintroduces
// the SA_ONSTACK deadlock this function exists to avoid. An arch added to MachineContext.h must be
// added here too.
static inline void* interruptedStackPointer(ucontext_t* ucontext)
{
#if OS(LINUX) && CPU(X86_64)
return reinterpret_cast<void*>(ucontext->uc_mcontext.gregs[REG_RSP]);
#elif OS(LINUX) && CPU(ARM64)
return reinterpret_cast<void*>(ucontext->uc_mcontext.sp);
#elif OS(LINUX) && CPU(ARM)
return reinterpret_cast<void*>(ucontext->uc_mcontext.arm_sp);
#elif OS(LINUX) && CPU(RISCV64)
return reinterpret_cast<void*>(ucontext->uc_mcontext.__gregs[REG_SP]);
#elif OS(FREEBSD) && CPU(X86_64)
return reinterpret_cast<void*>(ucontext->uc_mcontext.mc_rsp);
#elif OS(FREEBSD) && CPU(ARM64)
return reinterpret_cast<void*>(ucontext->uc_mcontext.mc_gpregs.gp_sp);
#else
// No accessor for this arch: fall back to currentStackPointer() in the caller. See the sync
// note above; prefer adding the arch's SP field here over relying on the fallback.
UNUSED_PARAM(ucontext);
return nullptr;
#endif
}
Comment thread
robobun marked this conversation as resolved.
#endif

void Thread::signalHandlerSuspendResume(int, siginfo_t*, void* ucontext)
{
// Touching a global variable atomic types from signal handlers is allowed.
Expand All @@ -130,25 +166,41 @@
return;
}

void* approximateStackPointer = currentStackPointer();
if (!thread->m_stack.contains(approximateStackPointer)) {
// This happens if we use an alternative signal stack.
// 1. A user-defined signal handler is invoked with an alternative signal stack.
// 2. In the middle of the execution of the handler, we attempt to suspend the target thread.
// 3. A nested signal handler is executed.
// 4. The stack pointer saved in the machine context will be pointing to the alternative signal stack.
// In this case, we back off the suspension and retry a bit later.
// We need to decide whether the interrupted thread is parked somewhere
// we can safely snapshot from, i.e. running on its own normal stack
// rather than inside a nested signal handler on an alt stack. When
// possible, read the SP the thread was running at straight from the
// ucontext (kernel-saved register state); that SP is stable regardless
// of whether the handler itself runs on the normal or the alternate
// signal stack. Using currentStackPointer() here only works so long as
// the handler always runs on the normal stack, which does not hold if
// the handler's SA_ONSTACK flag is set. Ours isn't by default, but
// any runtime loaded via dlopen (Go's cgo init, Mono, some JNI
// layouts, Rust cdylibs) may force it on. Fall back to the handler's
// own SP on platforms where we don't have a machine context.
#if HAVE(MACHINE_CONTEXT)
ucontext_t* userContext = static_cast<ucontext_t*>(ucontext);
void* interruptedSP = interruptedStackPointer(userContext);
void* stackPointerToCheck = interruptedSP ? interruptedSP : currentStackPointer();
#else
void* stackPointerToCheck = currentStackPointer();
#endif
if (!thread->m_stack.contains(stackPointerToCheck)) {
// The signal genuinely interrupted code running on an alternative
// stack, for example a nested signal handler already running on
// the alt stack. The snapshot we'd take here wouldn't represent
// the thread's normal-stack state. Back off and let the caller
// retry once the outer frame has returned to the normal stack.
thread->m_platformRegisters = nullptr;
globalSemaphoreForSuspendResume->post();
return;
}

#if HAVE(MACHINE_CONTEXT)
ucontext_t* userContext = static_cast<ucontext_t*>(ucontext);
thread->m_platformRegisters = &registersFromUContext(userContext);
#else
UNUSED_PARAM(ucontext);
PlatformRegisters platformRegisters { approximateStackPointer };
PlatformRegisters platformRegisters { stackPointerToCheck };
thread->m_platformRegisters = &platformRegisters;
#endif

Expand Down Expand Up @@ -185,6 +237,12 @@
// features. We tell it to use SIGPWR instead, which we assume is unlikely to be reliable
// for its stated purpose. Mono's garbage collector also uses SIGPWR:
// https://www.mono-project.com/docs/advanced/embedding/#signal-handling
//
// Runtimes loaded via dlopen (Go's cgo init, Mono, some JNI layouts) may add SA_ONSTACK to
// this handler's sigaction flags when they re-install inherited signal handlers, so the
// handler can then be delivered on an alternate signal stack. signalHandlerSuspendResume
// stays correct because it reads the interrupted stack pointer from the ucontext rather
// than its own frame pointer. See oven-sh/bun#31158.

Check warning on line 245 in Source/WTF/wtf/posix/ThreadingPOSIX.cpp

View check run for this annotation

Claude / Claude Code Review

Comment says 'frame pointer' but means 'stack pointer'

The comment says the handler reads the interrupted SP from ucontext "rather than its own frame pointer", but the pre-fix code called `currentStackPointer()`, which reads the **stack pointer** (SP/RSP), not the frame pointer (FP/RBP). The parallel comment inside `signalHandlerSuspendResume` correctly says "the handler's own SP"; suggest matching that here: "…rather than the handler's own stack pointer."
Comment on lines +244 to +245

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The comment says the handler reads the interrupted SP from ucontext "rather than its own frame pointer", but the pre-fix code called currentStackPointer(), which reads the stack pointer (SP/RSP), not the frame pointer (FP/RBP). The parallel comment inside signalHandlerSuspendResume correctly says "the handler's own SP"; suggest matching that here: "…rather than the handler's own stack pointer."

Extended reasoning...

What the comment says vs. what the code did

The new comment in initializePlatformThreading() reads:

signalHandlerSuspendResume stays correct because it reads the interrupted stack pointer from the ucontext rather than its own frame pointer.

The contrast being drawn is between the new behaviour (read the interrupted thread's SP from ucontext->uc_mcontext) and the old behaviour. But the old behaviour was a call to currentStackPointer() — defined in wtf/StackPointer.h — which returns the architectural stack pointer (RSP on x86_64, sp on arm64, etc.). It does not read the frame pointer (RBP / FP). So the comment names the wrong register.

Why the distinction matters here

In most prose, "frame pointer" used loosely for "a pointer into the current frame" would be harmless. But this is a file that has just added interruptedStackPointer(), which literally pulls gregs[REG_RSP] / .sp / .arm_sp / mc_rsp out of mcontext_t, and the whole PR is about distinguishing which SP gets checked. In that context, SP vs. FP is not loose terminology — a reader familiar with the difference may pause and wonder whether the old code did FP-based introspection (it didn't), or whether there's some FP-vs-SP subtlety in the alt-stack case (there isn't).

There's also an internal inconsistency: the parallel comment inside signalHandlerSuspendResume itself, added in the same diff, gets it right — "Fall back to the handler's own SP on platforms where we don't have a machine context." The two comments describe the same fallback and should use the same term.

Addressing the refutation

One verifier argued this is below the bar because the intent is clear from context and the wording came from a reviewer's suggested comment. Both points are true — no one will misread the mechanism. But the fix is a one-word change in a comment that is specifically about register-level behaviour, and the same PR already uses the correct term twenty lines up. Aligning the two costs nothing and removes a small "wait, frame pointer?" stumble for future readers. That's squarely nit territory, and it's filed as such.

Step-by-step

  1. Pre-fix code: void* approximateStackPointer = currentStackPointer(); — reads SP of the handler's current frame.
  2. Post-fix code: reads interruptedStackPointer(userContext) from ucontext, falling back to currentStackPointer().
  3. The comment at line 245 describes (2) as "reads the interrupted stack pointer from the ucontext rather than its own frame pointer".
  4. Neither (1) nor the fallback in (2) ever touched the frame pointer; both read the stack pointer.
  5. Therefore "frame pointer" → "stack pointer" (or "the handler's own stack pointer", matching the in-handler comment).

Impact

None at runtime — comment-only. Documentation accuracy in low-level register-handling code.

Fix

-        // stays correct because it reads the interrupted stack pointer from the ucontext rather
-        // than its own frame pointer. See oven-sh/bun#31158.
+        // stays correct because it reads the interrupted stack pointer from the ucontext rather
+        // than the handler's own stack pointer. See oven-sh/bun#31158.

g_wtfConfig.sigThreadSuspendResume = SIGPWR;
#else
g_wtfConfig.sigThreadSuspendResume = SIGUSR1;
Expand Down
Loading