set_validated_blocks in crates/stateless-common/src/rpc_client.rs:593-612 and its caller in bin/stateless-validator/src/workers.rs:161-166 lack any request timeout. A TCP connection to the report endpoint that is accepted but never replies hangs the validation_reporter task indefinitely, silently severing all subsequent upstream reports for the session.
Current code
// crates/stateless-common/src/rpc_client.rs:593-612
pub async fn set_validated_blocks(
    &self,
    blocks: Vec<ValidatedBlock>,
) -> Result<(), ProviderError> {
    let provider = ...;
    provider.client().request("...", (blocks,)).await
    // ^^^^^^^^^^^^^^^^^^ no tokio::time::timeout wrapper, no round_robin_with_backoff
}

// bin/stateless-validator/src/workers.rs:161-166
client.set_validated_blocks(reports).await
// ^^^^^ also no outer timeout
The alloy RootProvider built via connect_http uses reqwest (0.12/0.13) with no default request timeout.
Why this is not covered by existing PRs
per_attempt_timeout (default 20s) is applied via tokio::time::timeout inside round_robin_with_backoff. set_validated_blocks does not go through that helper, so the new guard does not apply.
PR "feat: add per-method RPC timeout configuration" #110 (open since 2026-03-31) adds block_timeout, witness_timeout, and code_timeout as per-method configuration. It does not add a report/set_validated_blocks timeout.
The module-level doc at line 12 explicitly states "set_validated_blocks is unthrottled", confirming that the missing throttle is intended design, not an oversight. However, the security and operational consequence of "unthrottled" extending to "no timeout at all" appears unintended, given the parallel evolution of per_attempt_timeout on the other methods.
Impact
The validation_reporter is a separate task::spawn (workers.rs:55), so the main block validation pipeline is unaffected. What actually happens on a hung report call:
The reporter task's loop blocks on the .await indefinitely.
All subsequent validation reports for the session are silently dropped — upstream monitoring/coordination sees the validator go quiet without explicit error.
At shutdown (workers.rs:106), the reporter JoinHandle is awaited under a 3-second tokio::time::timeout. The handle does not resolve (the inner .await is still hung), so shutdown proceeds after the 3-second budget, leaving the task running until process exit.
The net effect: silent loss of all upstream reports for the rest of the session, plus a noisy shutdown.
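The failure mode described above can be reproduced in miniature. The following is a dependency-free sketch, not the validator's code: it uses std threads and channels in place of tokio tasks, and `simulate_hung_reporter`, `hang_on`, and `shutdown_budget` are hypothetical names introduced only for illustration. One call that never returns stalls every later report, and a bounded join at shutdown gives up without the task finishing.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Simulates a sequential reporter whose call for block `hang_on` never returns.
/// Returns the blocks that were delivered and whether the reporter finished
/// within `shutdown_budget` after its input queue was closed.
pub fn simulate_hung_reporter(
    blocks: Vec<u64>,
    hang_on: u64,
    shutdown_budget: Duration,
) -> (Vec<u64>, bool) {
    let (report_tx, report_rx) = mpsc::channel::<u64>();
    let (done_tx, done_rx) = mpsc::channel::<()>();
    let delivered = Arc::new(Mutex::new(Vec::new()));
    let delivered_in_task = Arc::clone(&delivered);

    thread::spawn(move || {
        for block in report_rx {
            if block == hang_on {
                // Stand-in for a connection that is accepted but never replies.
                loop {
                    thread::park();
                }
            }
            delivered_in_task.lock().unwrap().push(block);
        }
        // Reached only if no call hung and the queue was closed.
        let _ = done_tx.send(());
    });

    for block in blocks {
        report_tx.send(block).unwrap();
    }
    drop(report_tx); // closing the queue acts as the shutdown signal

    // Bounded join, analogous to awaiting the JoinHandle under a timeout.
    let finished = done_rx.recv_timeout(shutdown_budget).is_ok();
    let seen = delivered.lock().unwrap().clone();
    (seen, finished)
}
```

With blocks 1 to 4 and a hang on block 2, only block 1 is delivered and the bounded join expires; blocks 3 and 4 are queued but never sent, which mirrors the silent loss of reports.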
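Suggested fix
Three options, increasing in invasiveness:
1. Caller-side timeout (minimal): bound the client.set_validated_blocks(reports).await call in workers.rs, for example with tokio::time::timeout.
2. Reuse per_attempt_timeout inside set_validated_blocks: apply the existing per_attempt_timeout to the request itself, so every caller is covered.
3. Route set_validated_blocks through round_robin_with_backoff with n=1 (no retry). This reuses the existing per_attempt_timeout machinery and keeps the timeout policy centralized.
Option 2 is the smallest patch that fixes the root cause; option 3 is the cleanest architectural fit.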
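Whichever layer the timeout lives in, the shape is the same: wait for the reply for a bounded time and turn expiry into an explicit error. The sketch below shows that shape with std only, so it compiles without tokio; it is an analogue of wrapping the request in tokio::time::timeout, not the actual patch. `call_with_timeout`, `ReportError`, and `fake_report` are hypothetical names; the real method returns ProviderError.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReportError {
    /// No reply arrived within the budget.
    TimedOut,
    /// The worker ended without producing a result (for example, it panicked).
    Disconnected,
}

/// Runs `call` on a helper thread and waits at most `budget` for its result.
/// Std-only stand-in for wrapping an async request in a timeout.
pub fn call_with_timeout<T, F>(budget: Duration, call: F) -> Result<T, ReportError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the caller already gave up; ignore that case.
        let _ = tx.send(call());
    });
    match rx.recv_timeout(budget) {
        Ok(value) => Ok(value),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(ReportError::TimedOut),
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(ReportError::Disconnected),
    }
}

/// Simulated report endpoint that replies after `delay`.
pub fn fake_report(delay: Duration) -> usize {
    thread::sleep(delay);
    3 // pretend three blocks were acknowledged
}
```

A slow endpoint now surfaces as `Err(ReportError::TimedOut)`, so the reporter loop can log the failure and move on to the next batch. One difference from the async version: here the helper thread keeps running after the caller gives up, whereas tokio::time::timeout drops the inner future when the deadline passes.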
Secondary observation (related, informational)
The module-level doc at rpc_client.rs:21-25 claims: "every public method has a _with_deadline variant that takes an Option<Instant>." This is not quite accurate: get_block_unchecked (lines 460-469) has no _with_deadline companion. Worth either adding one (for trace-server callers that may want a bounded wait) or amending the module doc to note get_block_unchecked as an intentional exception.