Skip to content

fix(drift): close active connections on shutdown to prevent Ctrl+C hang#219

Merged
sohil-kshirsagar merged 5 commits intomainfrom
sohil/fix/graceful-shutdown-hanging
Apr 11, 2026
Merged

fix(drift): close active connections on shutdown to prevent Ctrl+C hang#219
sohil-kshirsagar merged 5 commits intomainfrom
sohil/fix/graceful-shutdown-hanging

Conversation

@sohil-kshirsagar
Copy link
Copy Markdown
Contributor

@sohil-kshirsagar sohil-kshirsagar commented Apr 8, 2026

tusk drift run --print hangs on Ctrl+C when the mock server has an active SDK connection.

Root cause

server.Stop() calls ms.wg.Wait(), which blocks until all handleConnection goroutines finish. Those goroutines are stuck on io.ReadFull(conn, ...) — a blocking read with no deadline. Closing the listener prevents new connections but doesn't close existing ones, so io.ReadFull never returns and the process hangs indefinitely.

Changes

  • Track active connections in a map and close them all in Stop() before wg.Wait(), so blocked reads return immediately with an error
  • Add force-exit on second Ctrl+C as a safety net — if cleanup itself hangs for any reason, users aren't stuck
  • Add TestServerStopClosesActiveConnections — verified it fails without the fix (hangs for 2s then times out) and passes with it

…rl+C hang

When pressing Ctrl+C during `tusk drift run`, the process would hang
indefinitely because server.Stop() waited for handleConnection goroutines
that were blocked on io.ReadFull. Closing the listener only prevented new
connections — existing ones were never closed.

Also adds force-exit on second Ctrl+C as a safety net.
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 6418f90. Configure here.

Closes the race window where Stop() could iterate activeConns before
handleConnection had a chance to register the conn, leaving it unclosed.
@sohil-kshirsagar sohil-kshirsagar requested a review from jy-tan April 8, 2026 23:30
@sohil-kshirsagar sohil-kshirsagar marked this pull request as ready for review April 8, 2026 23:32
In the TUI, Bubble Tea's raw terminal mode converts Ctrl+C into a key
event rather than an OS signal. Since cleanup runs synchronously in the
update loop, a second Ctrl+C can't be delivered if cleanup hangs. Spawn
a goroutine that force-exits after 5s as a safety net.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/runner/server.go">

<violation number="1" location="internal/runner/server.go:636">
P1: Shutdown still has a race: a connection accepted right around cancellation can be added after `Stop()` closes active connections, leaving a handler blocked on `io.ReadFull` and potentially hanging `wg.Wait()`.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment on lines +636 to +638
ms.activeConns[conn] = struct{}{}
ms.activeConnsMu.Unlock()

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Shutdown still has a race: a connection accepted right around cancellation can be added after Stop() closes active connections, leaving a handler blocked on io.ReadFull and potentially hanging wg.Wait().

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/runner/server.go, line 636:

<comment>Shutdown still has a race: a connection accepted right around cancellation can be added after `Stop()` closes active connections, leaving a handler blocked on `io.ReadFull` and potentially hanging `wg.Wait()`.</comment>

<file context>
@@ -622,6 +632,10 @@ func (ms *Server) acceptConnections() {
 			}
 
+			ms.activeConnsMu.Lock()
+			ms.activeConns[conn] = struct{}{}
+			ms.activeConnsMu.Unlock()
+
</file context>
Suggested change
ms.activeConns[conn] = struct{}{}
ms.activeConnsMu.Unlock()
if ms.ctx.Err() != nil {
_ = conn.Close()
return
}
ms.activeConnsMu.Lock()
ms.activeConns[conn] = struct{}{}
ms.activeConnsMu.Unlock()
Fix with Cubic

The connection-closing fix in server.Stop() makes cleanup return
promptly, so a timeout safety net is unnecessary. If cleanup ever
hangs again, the right fix is to address the root cause.
@sohil-kshirsagar sohil-kshirsagar merged commit 5420625 into main Apr 11, 2026
14 checks passed
@sohil-kshirsagar sohil-kshirsagar deleted the sohil/fix/graceful-shutdown-hanging branch April 11, 2026 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants