fix: do not return 502 on runner live-log client disconnect#4900
Closed
cursor[bot] wants to merge 1 commit into
Closed
fix: do not return 502 on runner live-log client disconnect#4900cursor[bot] wants to merge 1 commit into
cursor[bot] wants to merge 1 commit into
Conversation
The runner live-log streaming endpoint proxies an upstream task-broker HTTP request. When the client (browser) aborts the in-flight fetch (e.g. closes the live-log dialog or navigates away), the request context is cancelled and http.Client.Do returns a context error. The handler was treating that as an upstream failure and responding with 502 Bad Gateway, which the logging middleware then captured as a Sentry event. Detect cancellations via the request context / error chain and respond with the non-5xx 499 status instead, matching the nginx convention for client-closed requests. This keeps genuine upstream connectivity failures returning 502 (and visible in Sentry) while suppressing the expected client-cancel noise for this streaming endpoint. Co-authored-by: Aleksandar Mitrovic <AleksandarCole@users.noreply.github.com>
|
👋 Commands for maintainers:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Sentry surfaced a noisy info-level event for the runner live-log streaming endpoint:
handleRunnerLiveLogStreaminpkg/public/runner_live_log_stream.gois a streaming proxy in front of the task-broker live-log endpoint. It returned502 Bad Gatewaywheneverhttp.Client.Do(req)failed.For a streaming endpoint, by far the most common reason for that call to fail is the client aborting the request — for example, the user closing the live-log dialog or navigating away from the page. The frontend (
web_src/src/ui/CanvasPage/RunnerLiveLogDialog/liveLogStream.ts) explicitly does this viaAbortController.abort(). When that happens, the request context is cancelled andDoreturns acontext.Cancelederror.The logging middleware (
pkg/public/middleware/logging.go) then captured those 502s as Sentry events (theshouldCaptureHTTPErrorpredicate captures every 5xx except 501 and 505), generating noise that does not reflect a real server bug.Fix
Detect client disconnects in the handler and respond with
499 Client Closed Request(nginx convention) instead of502 Bad Gateway. Since499 < 500, the Sentry-capture middleware ignores it — but real upstream connectivity failures still return 502 and remain visible in Sentry.Implementation:
isClientDisconnect(ctx, err)helper that checks bothctx.Err()and the error chain (context.Canceled/context.DeadlineExceeded).http.NewRequestWithContextandhttp.DefaultClient.Doerror paths.statusClientClosedRequest = 499constant with a comment explaining the rationale (so the magic number is self-documenting).Tests
Added a new subtest
TestHandleRunnerLiveLogStream/client_disconnect_does_not_return_5xxthat points the handler at an upstreamhttptest.Serverwhich blocks on the request context, then cancels the request context from the client side. It asserts the response is499, not502.Also verified:
make format.go— cleanmake lint— cleanmake check.build.app— succeedsRisk
Very low. The behavior change is scoped to a single streaming handler, and the new branch is only taken when the request context has already been cancelled (i.e. the client is no longer listening to the response anyway). Genuine upstream errors continue to return
502as before.