feat(raft): server-side 307 redirect on ErrNotLeader #590
Merged
Writes that hit a follower now respond with HTTP 307 Temporary Redirect + `Location` header pointing at the leader's HTTP URL. Go's `http.Client` follows 307 transparently (RFC 7231 preserves method + body), so naïve clients land on the leader in one extra round-trip instead of bouncing back with an error.

Wiring (layered to avoid circular deps):

- `pkg/domain/errors.go` (new): `LeaderHint` interface, just `error` + `LeaderHTTPAddr() string`. Shared vocabulary between the storage layer that constructs the error and the HTTP layer that interprets it.
- `internal/repository/pebble.NotLeaderError`: typed error with `LeaderURL` field. Satisfies `domain.LeaderHint`. `errors.Is(err, ErrNotLeader)` still matches because `NotLeaderError.Is` matches the sentinel, so existing callers don't change.
- `pebble.Replicator` gains `LeaderHTTPAddr()`; the `raft.DB` already exposes it (PR #589). pebble's `Set`/`Delete`/`CommitBatch` build `NotLeaderError{LeaderURL: d.repl.LeaderHTTPAddr()}` when the local node isn't the leader.
- `internal/controllers/respond.go`: `respondWriteError` + `maybeRedirectLeader` helpers. `errors.As` against `domain.LeaderHint`, write 307 + `Location` built from leader URL + the original request URI.
- Controllers updated to call the helpers on service errors: `create_task`, `claim_task`, `submit_result`, `heartbeat`, `nack`, `abandon`. Batch endpoints kept on the per-item error path (mixed-shard batches can't redirect cleanly).

Test:

- `TestRaft_307RedirectsToLeader`: 3-node single-shard cluster. Identifies leader + follower via probe writes; sends to follower with redirect-following disabled, asserts 307 + `Location` starts with leader URL + ends with `/v1/codeq/tasks`. Then re-sends with standard `http.Client` (follows redirects automatically), asserts 202. ~210ms run.

All existing tests still pass.
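The typed-error wiring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: everything lives in one `main` package rather than the layered `pkg/domain` and `internal/repository/pebble` packages, value receivers are a guess, and `leaderFrom` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotLeader is the sentinel that existing callers match with errors.Is.
var ErrNotLeader = errors.New("not leader")

// LeaderHint is an error that can also report the leader's HTTP address.
type LeaderHint interface {
	error
	LeaderHTTPAddr() string
}

// NotLeaderError is a typed error carrying the leader URL.
type NotLeaderError struct {
	LeaderURL string
}

func (e NotLeaderError) Error() string {
	return fmt.Sprintf("not leader (leader at %q)", e.LeaderURL)
}

// LeaderHTTPAddr makes NotLeaderError satisfy LeaderHint.
func (e NotLeaderError) LeaderHTTPAddr() string { return e.LeaderURL }

// Is keeps errors.Is(err, ErrNotLeader) matching for the typed error.
func (e NotLeaderError) Is(target error) bool { return target == ErrNotLeader }

// leaderFrom (hypothetical helper) extracts the leader URL from any error in
// the chain that implements LeaderHint. Treating an empty URL as "no hint"
// is an assumption of this sketch.
func leaderFrom(err error) (string, bool) {
	var hint LeaderHint
	if errors.As(err, &hint) && hint.LeaderHTTPAddr() != "" {
		return hint.LeaderHTTPAddr(), true
	}
	return "", false
}

func main() {
	// Wrap the typed error the way a service layer might.
	err := fmt.Errorf("set key: %w", NotLeaderError{LeaderURL: "http://leader-host:8080"})

	fmt.Println(errors.Is(err, ErrNotLeader)) // sentinel check still works
	fmt.Println(leaderFrom(err))              // HTTP layer recovers the URL
	fmt.Println(leaderFrom(ErrNotLeader))     // bare sentinel carries no hint
}
```

Because the HTTP side only depends on the interface, it never needs to import the storage package that constructs the error, which is presumably the circular dependency the layering avoids.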
Summary
The M2 multi-raft bench (#588) showed throughput parity-at-best vs single-shard, dominated by HTTP-level retry on `ErrNotLeader`. This PR closes that gap for the realistic case: writes that hit a follower now respond with HTTP 307 + `Location: <leader URL + original path>`. Go's `http.Client` follows 307 transparently (RFC 7231 preserves method + body) so naïve clients land on the leader in one extra round-trip instead of bouncing back with an error.
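A minimal sketch of how the server side of this could look, assuming a helper shaped like the `maybeRedirectLeader` named in the commit message. The signature, the `leaderHint` and `notLeader` stand-in types, the trailing-slash trimming, the `simulate` wrapper, and the `?debug=1` query string are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// leaderHint is a local stand-in for the shared LeaderHint interface.
type leaderHint interface {
	error
	LeaderHTTPAddr() string
}

// notLeader is a stand-in for the storage layer's typed error.
type notLeader struct{ url string }

func (e notLeader) Error() string          { return "not leader" }
func (e notLeader) LeaderHTTPAddr() string { return e.url }

// maybeRedirectLeader writes a 307 pointing at the leader and reports true
// when err carries a usable leader address. Otherwise it writes nothing and
// reports false so the caller can fall back to its normal error response.
func maybeRedirectLeader(w http.ResponseWriter, r *http.Request, err error) bool {
	var hint leaderHint
	if !errors.As(err, &hint) || hint.LeaderHTTPAddr() == "" {
		return false
	}
	leader := strings.TrimRight(hint.LeaderHTTPAddr(), "/")
	// RequestURI keeps both the path and the query string of the original call.
	w.Header().Set("Location", leader+r.URL.RequestURI())
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusTemporaryRedirect)
	_ = json.NewEncoder(w).Encode(map[string]string{"error": "not leader", "leader": leader})
	return true
}

// simulate runs the helper against a recorded request and returns the
// resulting status code and Location header.
func simulate(err error) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/v1/codeq/tasks?debug=1", nil)
	rec := httptest.NewRecorder()
	if !maybeRedirectLeader(rec, req, err) {
		http.Error(rec, "internal error", http.StatusInternalServerError)
	}
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	fmt.Println(simulate(fmt.Errorf("create task: %w", notLeader{url: "http://leader-host:8080"})))
	fmt.Println(simulate(errors.New("disk full")))
}
```

Returning a boolean lets each controller try the redirect first and keep its existing error path for every other failure.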
What landed
Layered to avoid circular deps:
Behavior on the wire
```
POST /v1/codeq/tasks → 307 Temporary Redirect
Location: http://leader-host:8080/v1/codeq/tasks
Body: {"error":"not leader","leader":"http://leader-host:8080"}
```
A standard Go `http.Client` follows the redirect automatically, and the second POST lands on the leader, which returns 202.
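The exchange can be reproduced locally with two `httptest` servers standing in for a follower and a leader. This is a sketch only: the handlers model just the status code and `Location` header, and the `newPair` and `postTask` names and the `{"command":"demo"}` payload are invented for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newPair starts a stand-in leader that accepts writes with 202 and a
// stand-in follower that answers every request with a 307 to the leader.
func newPair() (leader, follower *httptest.Server) {
	leader = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		w.WriteHeader(http.StatusAccepted)
		// Echo method and body so the caller can see both survived the redirect.
		fmt.Fprintf(w, "%s %s", r.Method, body)
	}))
	follower = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Location", leader.URL+r.URL.RequestURI())
		w.WriteHeader(http.StatusTemporaryRedirect)
	}))
	return leader, follower
}

// postTask POSTs a demo payload and returns the final status code, the
// Location header, and the response body. With follow=false the client
// returns the 307 itself instead of following it.
func postTask(baseURL string, follow bool) (int, string, string, error) {
	client := &http.Client{}
	if !follow {
		client.CheckRedirect = func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		}
	}
	// strings.Reader lets net/http replay the body on the redirected request.
	resp, err := client.Post(baseURL+"/v1/codeq/tasks", "application/json",
		strings.NewReader(`{"command":"demo"}`))
	if err != nil {
		return 0, "", "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return resp.StatusCode, resp.Header.Get("Location"), string(b), err
}

func main() {
	leader, follower := newPair()
	defer leader.Close()
	defer follower.Close()

	// Redirects disabled: the raw 307 and its Location header are visible.
	status, loc, _, err := postTask(follower.URL, false)
	fmt.Println(status, loc == leader.URL+"/v1/codeq/tasks", err)

	// Default behaviour: the client re-sends the POST, body included, to the leader.
	status, _, body, err := postTask(follower.URL, true)
	fmt.Println(status, body, err)
}
```

One caveat: `net/http` can only replay the body on a 307 when the request has a `GetBody` function, which `http.NewRequest` sets for `*bytes.Buffer`, `*bytes.Reader`, and `*strings.Reader` bodies. A client that streams from an arbitrary `io.Reader` gets the 307 response back instead of an automatic follow.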
Test plan
🤖 Generated with Claude Code