feat(raft): RAFT_* env vars + 3-node compose template #588

Merged: osvaldoandrade merged 1 commit into main from feat/raft-compose-template on May 18, 2026

@osvaldoandrade (Owner)

Summary

Operationally, raft was hard to deploy: `cfg.Raft` was YAML-only (no env-var overrides) and `deploy/docker-compose/` had no template for a multi-node raft cluster. This PR closes both gaps.

What landed

RAFT_* env var bindings

`pkg/config/config.go` gains overrides for every `cfg.Raft` field that operators tune:

  • `RAFT_ENABLED`
  • `RAFT_SELF_ID`
  • `RAFT_BIND_ADDR` (e.g. `:7000`)
  • `RAFT_BOOTSTRAP`
  • `RAFT_PEERS` — `"id=host:port,id=host:port,..."` (same shape as `CLUSTER_NODES`)
  • `RAFT_HEARTBEAT_MS`, `RAFT_ELECTION_MS`, `RAFT_LEADER_LEASE_MS`, `RAFT_COMMIT_MS`, `RAFT_APPLY_TIMEOUT_SECONDS`

Malformed peer entries (no `=`, empty id, etc.) are silently dropped so a typo doesn't crash startup.
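
To illustrate the drop-on-malformed behavior, here is a minimal sketch of how a `RAFT_PEERS` value could be parsed. It is not the code from `pkg/config/config.go`: the `Peer` type and `parsePeers` function are hypothetical names, and trimming whitespace and skipping entries with an empty address are assumptions beyond the two cases named above.

```go
package main

import (
	"fmt"
	"strings"
)

// Peer is a hypothetical stand-in for whatever type cfg.Raft uses for a peer.
type Peer struct {
	ID   string
	Addr string
}

// parsePeers splits "id=host:port,id=host:port" into peers.
// Entries with no '=', an empty id, or (by assumption) an empty address
// are skipped silently instead of returning an error.
func parsePeers(raw string) []Peer {
	var peers []Peer
	for _, entry := range strings.Split(raw, ",") {
		id, addr, ok := strings.Cut(strings.TrimSpace(entry), "=")
		id, addr = strings.TrimSpace(id), strings.TrimSpace(addr)
		if !ok || id == "" || addr == "" {
			continue // malformed entry: drop it rather than fail startup
		}
		peers = append(peers, Peer{ID: id, Addr: addr})
	}
	return peers
}

func main() {
	raw := "node-a=node-a:7000,oops,=node-x:7000,node-b=node-b:7000"
	for _, p := range parsePeers(raw) {
		fmt.Printf("%s -> %s\n", p.ID, p.Addr)
	}
}
```

With the sample input, only the `node-a` and `node-b` entries survive. The trade-off of dropping silently is that a typo can leave a node with fewer peers than intended, so logging the skipped entries may be worth considering.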

Compose template

New `deploy/docker-compose/raft-cluster/{compose.yaml,README.md}`:

  • 3 codeq services on host ports 8080/8081/8082
  • YAML anchor shares the common env; only `RAFT_SELF_ID`, `RAFT_BIND_ADDR`, `RAFT_BOOTSTRAP` differ per node
  • node-a bootstraps; node-b/c `depends_on` it so its transport is listening before they try to join
  • Per-node Pebble + artifacts volumes (no shared state)
  • Bridge network with docker DNS for peer resolution

README walks through quick-start, status endpoint probing, failover testing, and the path to multi-shard raft.

Test plan

  • `TestLoadConfigOptional_RaftEnvOverrides` — every RAFT_* env var lands on the right field
  • `TestLoadConfigOptional_RaftDisabledByDefault` — no env means raft stays off
  • `TestLoadConfigOptional_RaftPeersIgnoresMalformed` — malformed entries drop without erroring
  • `go build ./...` clean; existing config + app tests pass
  • Manual: `docker compose -f deploy/docker-compose/raft-cluster/compose.yaml up -d` + verify `/v1/codeq/raft/status` on all three nodes (requires building the `codeq-service:cluster` image first)

🤖 Generated with Claude Code

Operationally, raft was hard to deploy: cfg.Raft was YAML-only, and
deploy/docker-compose had no template for a multi-node raft cluster.
This commit closes both gaps.

### RAFT_* env var bindings

pkg/config/config.go gains overrides for every cfg.Raft field that
operators tune:

- RAFT_ENABLED        (bool)
- RAFT_SELF_ID        (string)
- RAFT_BIND_ADDR      (string, e.g. ":7000")
- RAFT_BOOTSTRAP      (bool)
- RAFT_PEERS          ("id=host:port,id=host:port,...")
- RAFT_HEARTBEAT_MS, RAFT_ELECTION_MS, RAFT_LEADER_LEASE_MS,
  RAFT_COMMIT_MS, RAFT_APPLY_TIMEOUT_SECONDS (int)

RAFT_PEERS uses the same "id=addr,id=addr" shape as CLUSTER_NODES so
manifests look uniform. Malformed entries (no '=', empty id, etc.)
are silently dropped.
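
For the typed variables, an override layer might look like the sketch
below. It is an illustration only: RaftConfig, its field names, and
applyRaftEnv are hypothetical, only three of the variables are shown,
and leaving a field unchanged when the value fails to parse is an
assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// RaftConfig is a hypothetical subset of cfg.Raft; field names are illustrative.
type RaftConfig struct {
	Enabled     bool
	SelfID      string
	HeartbeatMs int
}

// applyRaftEnv overlays RAFT_* variables on values already loaded from YAML.
// lookup has the signature of os.LookupEnv so a test can pass a fake.
// Unset variables and unparseable values leave the existing field untouched.
func applyRaftEnv(c *RaftConfig, lookup func(string) (string, bool)) {
	if v, ok := lookup("RAFT_ENABLED"); ok {
		if b, err := strconv.ParseBool(v); err == nil {
			c.Enabled = b
		}
	}
	if v, ok := lookup("RAFT_SELF_ID"); ok && v != "" {
		c.SelfID = v
	}
	if v, ok := lookup("RAFT_HEARTBEAT_MS"); ok {
		if n, err := strconv.Atoi(v); err == nil {
			c.HeartbeatMs = n
		}
	}
}

func main() {
	cfg := RaftConfig{HeartbeatMs: 1000} // illustrative value standing in for YAML
	applyRaftEnv(&cfg, os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling os.LookupEnv
directly, keeps the overlay testable without mutating the process
environment.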

### Compose template

New deploy/docker-compose/raft-cluster/{compose.yaml,README.md}:

- 3 codeq services (node-a/b/c) on host ports 8080/8081/8082
- All three share a YAML anchor for the common env; only RAFT_SELF_ID,
  RAFT_BIND_ADDR, and RAFT_BOOTSTRAP differ
- node-a has Bootstrap=true; the others wait via `depends_on` so
  node-a's transport is listening before they try to join
- Per-node Pebble + artifacts volumes (no shared state)
- Bridge network with docker DNS resolution between peers

README walks through quick-start, status endpoint probing, failover
testing (kill leader, wait ~2s, verify new leader on a survivor), and
the path to multi-shard raft via PERSISTENCE_CONFIG.numShards.

### Tests

- TestLoadConfigOptional_RaftEnvOverrides: every RAFT_* env var lands
  on the right cfg.Raft field.
- TestLoadConfigOptional_RaftDisabledByDefault: no env means
  cfg.Raft.Enabled stays false.
- TestLoadConfigOptional_RaftPeersIgnoresMalformed: malformed peer
  entries are dropped without failing the load.

@osvaldoandrade merged commit 119cbfa into main on May 18, 2026 (2 checks passed)
@osvaldoandrade deleted the feat/raft-compose-template branch on May 18, 2026 at 14:27