feat(raft): GET /v1/codeq/raft/status endpoint #586

Merged
osvaldoandrade merged 1 commit into main from feat/raft-leader-forwarding on May 18, 2026

@osvaldoandrade

Summary

Adds a per-shard raft status endpoint that surfaces the local node's view of every raft group: leadership, self/peer IDs and bind addresses. Foundation for:

  • Ops monitoring ("is shard 3 still elected?")
  • Prometheus / Grafana dashboards
  • Future client-side smart routing ("send this write directly to shard 2's leader")

```
GET /v1/codeq/raft/status
{
  "enabled": true,
  "numGroups": 4,
  "groups": [
    {"shardIdx":0,"isLeader":true, "selfId":"node-1","selfAddr":"127.0.0.1:25000",
     "leaderId":"node-1","leaderAddr":"127.0.0.1:25000","hasLeader":true},
    {"shardIdx":1,"isLeader":false,"selfId":"node-1","selfAddr":"127.0.0.1:25001",
     "leaderId":"node-2","leaderAddr":"127.0.0.1:25101","hasLeader":true},
    ...
  ]
}
```

When raft is disabled: `{"enabled":false,"numGroups":0}`.
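For consumers of the endpoint, the payload can be decoded with plain `encoding/json`. The following is a minimal sketch, not code from this PR: the type names `RaftStatus` and `RaftGroup` and the helper `parseRaftStatus` are hypothetical, and only the JSON field names are taken from the example response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RaftGroup mirrors one entry of "groups" in the example payload.
// The Go type name is hypothetical; only the JSON keys come from the PR.
type RaftGroup struct {
	ShardIdx   int    `json:"shardIdx"`
	IsLeader   bool   `json:"isLeader"`
	SelfID     string `json:"selfId"`
	SelfAddr   string `json:"selfAddr"`
	LeaderID   string `json:"leaderId"`
	LeaderAddr string `json:"leaderAddr"`
	HasLeader  bool   `json:"hasLeader"`
}

// RaftStatus mirrors the top-level payload. Groups stays nil when the
// server answers {"enabled":false,"numGroups":0}.
type RaftStatus struct {
	Enabled   bool        `json:"enabled"`
	NumGroups int         `json:"numGroups"`
	Groups    []RaftGroup `json:"groups,omitempty"`
}

// parseRaftStatus decodes a response body into RaftStatus.
func parseRaftStatus(body []byte) (RaftStatus, error) {
	var s RaftStatus
	err := json.Unmarshal(body, &s)
	return s, err
}

func main() {
	// Two of the groups from the example response above.
	body := []byte(`{"enabled":true,"numGroups":4,"groups":[
		{"shardIdx":0,"isLeader":true,"selfId":"node-1","selfAddr":"127.0.0.1:25000",
		 "leaderId":"node-1","leaderAddr":"127.0.0.1:25000","hasLeader":true},
		{"shardIdx":1,"isLeader":false,"selfId":"node-1","selfAddr":"127.0.0.1:25001",
		 "leaderId":"node-2","leaderAddr":"127.0.0.1:25101","hasLeader":true}]}`)

	status, err := parseRaftStatus(body)
	if err != nil {
		panic(err)
	}
	for _, g := range status.Groups {
		fmt.Printf("shard %d: leader=%s local=%t\n", g.ShardIdx, g.LeaderID, g.IsLeader)
	}
}
```

Because the disabled response omits `groups`, callers should check `Enabled` before iterating rather than relying on `NumGroups` alone.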

What landed

  • `internal/raft.DB` exposes `LeaderInfo()` (uses hashicorp/raft's `LeaderWithID`), `SelfID()`, `BindAddr()`.
  • `pkg/app.Application` gains `RaftGroups []RaftGroupStatus` (interface), populated per shard.
  • `internal/controllers/raft_status_controller.go` handles `GET /v1/codeq/raft/status`; uses a local copy of the interface to avoid importing `pkg/app` from `internal/`.
  • Mounted under `anyAuth` — readable by producer or worker tokens. Payload reveals no task data, only routing metadata.

Why this instead of full leader forwarding

I scoped down. Server-side HTTP 307 redirects on `ErrNotLeader` require:

  • A new `PeerHTTPAddrs` config field
  • A typed error that threads the leader URL up from `raft.DB.Replicate` through every controller
  • Multi-shard awareness (controller doesn't know shardIdx)

Each of those is doable but adds up to a multi-day refactor. This endpoint is the smaller piece that ships value immediately and unblocks the rest: any future routing logic (server-side forwarding, client-side pinning, Prometheus exporter) can poll this endpoint to discover topology.

Test plan

  • `TestRaftStatusEndpoint_RaftDisabled` — non-raft deployment returns enabled=false, NumGroups=0
  • `TestRaftStatusEndpoint_SingleShard_Leader` — single-node bootstrap shows SelfID = LeaderID, IsLeader=true after election
  • All M1 + M2 tests still pass (no regressions)
  • Manual: `curl http://node-1/v1/codeq/raft/status` against a 3-node × 4-shard test cluster; verify the mixed leader/follower view

🤖 Generated with Claude Code

Local-node view of per-shard raft state, surfaced for ops tooling +
Prometheus scraping + as the building block for future client-side
smart routing.

Response shape:
  {
    "enabled": true,
    "numGroups": 4,
    "groups": [
      {"shardIdx":0,"isLeader":true, "selfId":"node-1","selfAddr":"127.0.0.1:25000",
       "leaderId":"node-1","leaderAddr":"127.0.0.1:25000","hasLeader":true},
      {"shardIdx":1,"isLeader":false,"selfId":"node-1","selfAddr":"127.0.0.1:25001",
       "leaderId":"node-2","leaderAddr":"127.0.0.1:25101","hasLeader":true},
      ...
    ]
  }

When raft is disabled: returns 200 with {"enabled":false,"numGroups":0}.

Wiring:
- internal/raft.DB grew LeaderInfo() (uses hraft.LeaderWithID), SelfID(),
  BindAddr() accessors.
- pkg/app.Application gained RaftGroups []RaftGroupStatus — populated
  per shard in application_pebble.go when cfg.Raft.Enabled.
- internal/controllers/raft_status_controller.go is the handler. It
  uses its own RaftGroupStatus interface (structurally identical to
  the public one) to avoid importing pkg/app from internal/.
- pkg/app/url_mappings.go mounts the route under anyAuth — readable
  by producer OR worker tokens; the payload reveals no task data, only
  routing metadata (peer IDs + bind addrs).

Tests:
- TestRaftStatusEndpoint_RaftDisabled: enabled=false, NumGroups=0.
- TestRaftStatusEndpoint_SingleShard_Leader: 1-node bootstrap shows
  SelfID=LeaderID=node-1, IsLeader=true, HasLeader=true after election.

@osvaldoandrade merged commit 0a0322f into main on May 18, 2026
2 checks passed
@osvaldoandrade deleted the feat/raft-leader-forwarding branch on May 18, 2026 13:45