Summary
BrokerClient.ensure_connection() currently collapses all broker/connect/protocol failures into False, with only debug/error logs depending on where the failure occurs. This makes broker-related failures harder to diagnose, and there are no explicit read/write timeouts around socket operations (risk of indefinite hangs in worst-case failure modes).
This is low priority (broker is best-effort / fallback exists), but it would improve debuggability and robustness.
Current Behavior
ensure_connection(hostname) returns False on any error (no reason provided to caller).
- Failures are logged at debug in
ensure_connection(), and at error in connect() for some connection failures.
- No explicit timeouts around
open_unix_connection() / StreamReader.readexactly().
Proposed (Low-prio) Improvements
Option A (preferred):
- Keep
ensure_connection() -> bool for compatibility
- Add
ensure_connection_detailed() -> tuple[bool, str] returning a reason string on failure
- Add socket timeouts (configurable; default small) and tests proving no indefinite hang
Option B:
- Return
(bool, error_msg) from ensure_connection() directly (API change)
Test Ideas
- Stale socket file exists but no server listening
- Broker process dies mid-test (connection reset)
- Partial response / protocol error
- Timeout on response length / response body read
Related
Summary
BrokerClient.ensure_connection()currently collapses all broker/connect/protocol failures intoFalse, with only debug/error logs depending on where the failure occurs. This makes broker-related failures harder to diagnose, and there are no explicit read/write timeouts around socket operations (risk of indefinite hangs in worst-case failure modes).This is low priority (broker is best-effort / fallback exists), but it would improve debuggability and robustness.
Current Behavior
ensure_connection(hostname)returnsFalseon any error (no reason provided to caller).ensure_connection(), and at error inconnect()for some connection failures.open_unix_connection()/StreamReader.readexactly().Proposed (Low-prio) Improvements
Option A (preferred):
ensure_connection() -> boolfor compatibilityensure_connection_detailed() -> tuple[bool, str]returning a reason string on failureOption B:
(bool, error_msg)fromensure_connection()directly (API change)Test Ideas
Related