fix: enable WebSocket ping heartbeat to detect dead connections#6
Conversation
heartbeat_interval was defined in config (default 30s) but never used — ping_interval/ping_timeout were both None, so silently dead connections were never detected and the receive loop hung indefinitely. Now passes heartbeat_interval to websockets ping_interval and ping_timeout. A missed pong raises ConnectionClosed, exits _listen(), and triggers the reconnect loop which re-syncs any missed events.
Reviewer's guide (collapsed on small PRs)Reviewer's GuideThis PR enables WebSocket heartbeat-based keepalive by wiring the existing heartbeat_interval configuration into the websockets.connect call, so dead connections are detected and the existing reconnect logic can run. Updated class diagram for WebSocket client heartbeat configurationclassDiagram
class CLIClient {
Config config
WebSocket ws
_connect() void
_listen() void
_on_reconnect() void
}
class Config {
ClientConfig client
}
class ClientConfig {
int heartbeat_interval
}
class WebSocket {
connect(url, max_size, ping_interval, ping_timeout, close_timeout) WebSocket
}
CLIClient --> Config : uses
Config --> ClientConfig : has
CLIClient --> WebSocket : establishes
%% Detail of _connect behavior
class CLIClient_connect {
_connect() void
}
CLIClient ..> CLIClient_connect : behavior
CLIClient_connect : interval = config.client.heartbeat_interval
CLIClient_connect : ws = websockets.connect(
CLIClient_connect : url,
CLIClient_connect : max_size=128MB,
CLIClient_connect : ping_interval=interval,
CLIClient_connect : ping_timeout=interval,
CLIClient_connect : close_timeout=10
CLIClient_connect : )
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- Consider handling edge cases for
heartbeat_interval(e.g.,Noneor very small/zero values) explicitly before passing it towebsockets.connect, so misconfiguration doesn't lead to confusing runtime errors or tight ping loops. - You might want to allow configuring
ping_intervalandping_timeoutseparately rather than tying both toheartbeat_interval, in case future tuning requires a shorter timeout than interval or vice versa.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider handling edge cases for `heartbeat_interval` (e.g., `None` or very small/zero values) explicitly before passing it to `websockets.connect`, so misconfiguration doesn't lead to confusing runtime errors or tight ping loops.
- You might want to allow configuring `ping_interval` and `ping_timeout` separately rather than tying both to `heartbeat_interval`, in case future tuning requires a shorter timeout than interval or vice versa.
## Individual Comments
### Comment 1
<location path="fns_cli/client.py" line_range="124" />
<code_context>
)
log.info("Connecting to %s", url)
+ interval = self.config.client.heartbeat_interval
self.ws = await websockets.connect(
url,
</code_context>
<issue_to_address>
**issue (bug_risk):** Validate/normalize `heartbeat_interval` before passing it to `websockets.connect`.
Because `ping_interval=None` previously disabled ping/pong, passing `self.config.client.heartbeat_interval` directly means invalid values (e.g. `0` or negative) could now cause `websockets.connect` to raise or act unexpectedly. Consider normalizing here (e.g., map `<= 0` to `None` or enforce a minimum positive value) so behavior is robust against misconfigured heartbeat values.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| ) | ||
| log.info("Connecting to %s", url) | ||
|
|
||
| interval = self.config.client.heartbeat_interval |
There was a problem hiding this comment.
issue (bug_risk): Validate/normalize heartbeat_interval before passing it to websockets.connect.
Because ping_interval=None previously disabled ping/pong, passing self.config.client.heartbeat_interval directly means invalid values (e.g. 0 or negative) could now cause websockets.connect to raise or act unexpectedly. Consider normalizing here (e.g., map <= 0 to None or enforce a minimum positive value) so behavior is robust against misconfigured heartbeat values.
… timeout PR #6 used ping_interval=30 but the server does not respond to WebSocket ping frames — confirmed by real-environment test: connection dropped every 30s with "keepalive ping timeout". Reverted to ping_interval=None. Instead, an asyncio background task (_inactivity_watchdog) tracks the last received message time and closes the connection if no data arrives for 2 × heartbeat_interval (60s default), triggering the reconnect loop. _last_received_at is updated on every text or binary frame received. Also extended the NoteSync timeout in _initial_sync from 60s to 300s. The server sends NoteSyncEnd immediately but may delay actual note delivery by over a minute on full syncs — the 60s window was too short and caused NoteSync to abort before any notes were written. Verified with real-server pull: 24 notes (including Test-456/123.md, 456.md, 789.md) fully synced, no ping disconnects, no echo push-backs.
… timeout (#8) PR #6 used ping_interval=30 but the server does not respond to WebSocket ping frames — confirmed by real-environment test: connection dropped every 30s with "keepalive ping timeout". Reverted to ping_interval=None. Instead, an asyncio background task (_inactivity_watchdog) tracks the last received message time and closes the connection if no data arrives for 2 × heartbeat_interval (60s default), triggering the reconnect loop. _last_received_at is updated on every text or binary frame received. Also extended the NoteSync timeout in _initial_sync from 60s to 300s. The server sends NoteSyncEnd immediately but may delay actual note delivery by over a minute on full syncs — the 60s window was too short and caused NoteSync to abort before any notes were written. Verified with real-server pull: 24 notes (including Test-456/123.md, 456.md, 789.md) fully synced, no ping disconnects, no echo push-backs.
Problem
heartbeat_interval: 30was defined in config but never used. Bothping_intervalandping_timeoutwere set toNone, disabling all keepalive. When the WebSocket connection silently died, theasync for raw in self.wsreceive loop hung indefinitely — no error, no reconnect, no incoming events.Confirmed in logs: connection established at 08:23:28, last activity at 08:45:08, then complete silence. Files created on the server after that point were never received by the CLI.
Fix
Pass
heartbeat_intervalfrom config toping_intervalandping_timeouton the WebSocket connection. The websockets library sends a ping every 30s and expects a pong within 30s. A missed pong raisesConnectionClosed, which exits_listen(), triggers the reconnect loop, and_on_reconnectruns an incremental sync to catch up on missed events.Test plan
run, let it idle for >30s, confirm ping/pong in debug logsSummary by Sourcery
Bug Fixes: