Original Logs
20260413 04:38:40.568 ctool [INFO] run_cloudtool_service_real going down!
20260413 04:38:40.568 ctool [ERROR] 🛑 caught exception TimeoutError:
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/gql/transport/common/adapters/websockets.py", line 71, in connect
    self.websocket = await websockets.connect(self.url, **connect_args)
  File "/usr/local/lib/python3.13/site-packages/websockets/asyncio/client.py", line 470, in create_connection
    _, connection = await loop.create_connection(factory, **kwargs)
  File "/usr/local/lib/python3.13/asyncio/base_events.py", line 1146, in create_connection
    sock = await self._connect_sock(
  File "/usr/local/lib/python3.13/asyncio/selector_events.py", line 645, in sock_connect
    return await fut
TimeoutError
Error Summary
Multiple cloudtool-related pods reported the same websocket connect timeout. Affected services included cloudtool web, original, eds-setup, and remote-mcp-worker. The pods were otherwise in the Running state, and the backend shows only scheduling pressure and delayed startup, not an ongoing crash loop.
Stacktrace
/usr/local/lib/python3.13/site-packages/gql/transport/common/adapters/websockets.py:71 connect
Root Cause
- File: flexus_client_kit/ckit_cloudtool.py:469-470
- Function: run_cloudtool_service_real
- Why: the service opens a websocket subscription to the backend and, under startup pressure, the connection attempt times out inside websockets.connect(...). The surrounding code catches the failure at the top-level service loop and retries, so this is an operational connectivity/startup issue rather than an unhandled code crash.
- Git blame: Oleg Klimov in acffd604 / 1c9b39b8
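The catch-and-retry behaviour described under "Why" can be sketched as below. This is a minimal illustration under stated assumptions, not the code in ckit_cloudtool.py: the helper name connect_with_retry, its parameters, and the exponential backoff with jitter are all hypothetical and chosen only to show how a connect timeout can be absorbed by a retry loop instead of ending the service.

```python
import asyncio
import random


async def connect_with_retry(
    connect,                 # hypothetical: zero-argument coroutine function that opens the connection
    max_attempts=5,          # assumed limit, not taken from the service
    base_delay=0.5,          # assumed first backoff step, in seconds
    max_delay=30.0,          # assumed cap on a single backoff step
    sleep=asyncio.sleep,     # injectable so the loop can be tested without waiting
):
    """Await connect() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await connect()
        except (TimeoutError, asyncio.TimeoutError, OSError):
            # Connection-level failures such as the TimeoutError in the log
            # are treated as transient; anything else propagates immediately.
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries when many pods start at once.
            delay *= random.uniform(0.5, 1.0)
            await sleep(delay)
```

In this sketch the caller would pass a function that opens the websocket session, and only after the last failed attempt does the TimeoutError reach the top-level handler. Jitter is included because the report describes several pods hitting the backend during the same startup window.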
Code Snippet
async with ws_client as ws:
    async for r in ws.subscribe(gql.gql(...)):
        ...
Affected
- Pods: fservice-cloudtool-web, fservice-cloudtool-original, fservice-cloudtool-eds-setup, fservice-remote-mcp-worker
- Namespace: flexus
- Occurrences: multiple startup retries