feat: track restore stage in _pgro.restore_info by passcod · Pull Request #25 · beyondessential/postgres-restore-operator

passcod · 2026-04-23T05:07:46Z

Adds stage and last_transition_time columns to _pgro.restore_info.Init script sets stage to restored when a post-restore reindex ispending, or ready otherwise. The background reindex block transitionsto reindexing before starting and ready after completion.

🦸 Review Hero

Run Review Hero
Auto-fix review suggestions
Auto-fix CI failures

Adds `stage` and `last_transition_time` columns to `_pgro.restore_info`. Init script sets stage to `restored` when a post-restore reindex is pending, or `ready` otherwise. The background reindex block transitions to `reindexing` before starting and `ready` after completion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

review-hero · 2026-04-23T05:11:12Z

🦸 Review Hero Summary
6 agents reviewed this PR | 1 critical | 0 suggestions | 0 nitpicks | Filtering: consensus 3 voters, 2 below threshold

Below consensus threshold (2 unique issues not confirmed by majority)

Location	Agent	Severity	Comment
`src/controllers/restore/builders.rs:988`	Bugs & Correctness	`critical`	The `rm -f /pgdata/needs-reindex` executes on line 988 before the `SET stage = 'ready'` psql call on line 989. If the process is killed in that window, the `needs-reindex` file is gone but the stag...
`src/controllers/restore/tests.rs:378`	Bugs & Correctness	`nitpick`	`script.find("REINDEX INDEX").unwrap()` (without `.expect(...)`) will panic with a generic index-out-of-bounds message if the string is absent, making test failures hard to diagnose. Use `.expect("...

Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.

src/controllers/restore/builders.rs:989: The stage = 'ready' UPDATE fires unconditionally after the reindex loop, even when individual REINDEX commands fail. Lines 982 and 984 use || true to suppress failures, so a corrupted/invalid index is silently ignored and the stage still reports 'ready'. A client reading stage='ready' would believe the database is fully usable when it may have invalid indexes. Fix: track failures with a flag variable (e.g. REINDEX_OK=1, clear it on failure) and only update to 'ready' if the flag is still set; otherwise update to a distinct error state or leave as 'reindexing' so the problem is visible.

review-hero · 2026-04-23T05:16:20Z

-      restored_at = now();
+      restored_at = now(),
+      stage = EXCLUDED.stage,
+      last_transition_time = now();


[Bugs & Correctness] suggestion

The ON CONFLICT clause always resets last_transition_time = now() regardless of whether stage actually changed. If the init container re-runs on a pod restart after a restore has already completed (e.g. the same stage value is upserted), last_transition_time gets bumped to the restart time, making it look like a stage transition just occurred. Fix: last_transition_time = CASE WHEN _pgro.restore_info.stage IS DISTINCT FROM EXCLUDED.stage THEN now() ELSE _pgro.restore_info.last_transition_time END

review-hero · 2026-04-23T05:16:20Z

  PG_MAJOR=$(cat /pgdata/pgdata/PG_VERSION)
  (
    while ! pg_isready -q -U postgres -d postgres; do sleep 2; done
+    psql -U postgres -d postgres -c "UPDATE _pgro.restore_info SET stage = 'reindexing', last_transition_time = now() WHERE id = 1;"


[Bugs & Correctness] suggestion

The 'reindexing' and 'ready' UPDATE statements in the postgres container silently affect 0 rows if the _pgro.restore_info row is missing (e.g. the init container's psql heredoc failed partway through — psql does not abort on statement error by default). Stage tracking is then permanently lost with no indication of failure. Consider adding || echo 'WARNING: stage update affected 0 rows' after each UPDATE, or asserting row count via a DO block.

review-hero · 2026-04-23T05:16:21Z

🦸 Review Hero Summary
6 agents reviewed this PR | 0 critical | 2 suggestions | 0 nitpicks | Filtering: consensus 3 voters, 2 below threshold

Below consensus threshold (2 unique issues not confirmed by majority)

Location	Agent	Severity	Comment
`src/controllers/restore/builders.rs:989`	Bugs & Correctness	`suggestion`	Every REINDEX call uses `\|\| true` (lines 982/984) so individual failures are silently swallowed, yet `stage = 'ready'` is set unconditionally after the loop. A database with partially-failed reinde...
`src/controllers/restore/tests.rs:346`	Bugs & Correctness	`suggestion`	The assertion `script.contains("PGRO_STAGE=restored") && script.contains("PGRO_STAGE=ready")` only checks that both strings exist somewhere in the script — it does not verify which branch sets whic...

Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.

src/controllers/restore/builders.rs:814: The ON CONFLICT clause always resets last_transition_time = now() regardless of whether stage actually changed. If the init container re-runs on a pod restart after a restore has already completed (e.g. the same stage value is upserted), last_transition_time gets bumped to the restart time, making it look like a stage transition just occurred. Fix: last_transition_time = CASE WHEN _pgro.restore_info.stage IS DISTINCT FROM EXCLUDED.stage THEN now() ELSE _pgro.restore_info.last_transition_time END

src/controllers/restore/builders.rs:966: The 'reindexing' and 'ready' UPDATE statements in the postgres container silently affect 0 rows if the _pgro.restore_info row is missing (e.g. the init container's psql heredoc failed partway through — psql does not abort on statement error by default). Stage tracking is then permanently lost with no indication of failure. Consider adding || echo 'WARNING: stage update affected 0 rows' after each UPDATE, or asserting row count via a DO block.

passcod and others added 2 commits April 22, 2026 18:34

move to AGENTS.md

60e11a1

passcod changed the title ~~move to AGENTS.md~~ feat: track restore stage in _pgro.restore_info Apr 23, 2026

review-hero Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread src/controllers/restore/builders.rs

review-hero Bot reviewed Apr 23, 2026

View reviewed changes

passcod merged commit 39d589c into main Apr 23, 2026
33 checks passed

passcod deleted the feat/restore-stage-tracking branch April 23, 2026 05:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: track restore stage in _pgro.restore_info#25

feat: track restore stage in _pgro.restore_info#25
passcod merged 2 commits intomainfrom
feat/restore-stage-tracking

passcod commented Apr 23, 2026 •

edited by github-actions Bot

Loading

Uh oh!

Uh oh!

review-hero Bot commented Apr 23, 2026

Uh oh!

review-hero Bot Apr 23, 2026

Uh oh!

review-hero Bot Apr 23, 2026

Uh oh!

review-hero Bot commented Apr 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

passcod commented Apr 23, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦸 Review Hero

Uh oh!

Uh oh!

review-hero Bot commented Apr 23, 2026

Uh oh!

review-hero Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

review-hero Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

review-hero Bot commented Apr 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

passcod commented Apr 23, 2026 •

edited by github-actions Bot

Loading