Add state backup/restore command for droplet migration#202
Add state backup/restore command for droplet migration#202benvinegar wants to merge 4 commits intomainfrom
Conversation
Greptile SummaryThis PR adds a Key observations:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
actor Operator
participant CLI as baudbot CLI
participant state.sh
participant Python as Python3
participant FS as Agent Home (~/.pi / ~/.config)
Note over Operator,FS: Backup flow (old droplet)
Operator->>CLI: sudo baudbot state backup /tmp/state.zip
CLI->>state.sh: exec (requires root)
state.sh->>state.sh: resolve_archive_path()
state.sh->>FS: copy_path_if_present() for each STATE_PATH
state.sh->>state.sh: write_metadata_file() → metadata.json
state.sh->>Python: create_zip_archive(baudbot-state/ → state.zip)
state.sh->>FS: chmod 600 state.zip
state.sh-->>Operator: ✓ state backup created
Note over Operator,FS: Restore flow (new droplet)
Operator->>CLI: sudo baudbot stop
Operator->>CLI: sudo baudbot state restore /tmp/state.zip
CLI->>state.sh: exec (requires root)
state.sh->>state.sh: service_running() → must be stopped
state.sh->>Python: extract_zip_archive_safe() (path-traversal checked)
state.sh->>Python: validate metadata.json format == "baudbot-state-v1"
state.sh->>FS: cp -a payload/. → BAUDBOT_AGENT_HOME/ (merge)
state.sh->>FS: restore_ownership_if_root() chown -R
state.sh->>FS: restore_secure_permissions() chmod 600
state.sh-->>Operator: ✓ state restored
Operator->>CLI: sudo baudbot start
Prompt To Fix All With AIThis is a comment left during a code review.
Path: bin/baudbot.test.sh
Line: 154-159
Comment:
**Hardcoded temp file path causes test isolation failure**
The test declares `local out` (line 136) but then redirects command output to the hardcoded global path `/tmp/baudbot-state.out` instead of using the already-available `$tmp` directory. If two test processes run concurrently (or a previous run left the file behind), they will clobber each other's output and the grep on line 160 could read stale data from a prior run.
```suggestion
if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >"$tmp/state.out" 2>&1; then
return 1
fi
out="$(cat "$tmp/state.out")"
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: bin/state.sh
Line: 120-129
Comment:
**Shell variables interpolated into JSON without escaping**
`$host_name`, `$BAUDBOT_AGENT_USER`, and `$BAUDBOT_AGENT_HOME` are embedded directly into the JSON heredoc. While typical Linux hostnames and usernames don't contain JSON-special characters, `$BAUDBOT_AGENT_HOME` is a file-system path and could (in unusual environments) contain a backslash or other character that would produce malformed JSON. When `cmd_restore` later calls `json.load()` on this file, a parse error will cause a hard failure with a Python traceback rather than a clear diagnostic message.
Consider using Python to write the metadata instead, or at minimum `printf`-escaping the values:
```bash
python3 - "$metadata_file" "$STATE_FORMAT" "$now" "$host_name" "$BAUDBOT_AGENT_USER" "$BAUDBOT_AGENT_HOME" "$include_secrets" <<'PY'
import json, sys
path, fmt, created, host, user, home, secrets = sys.argv[1:]
with open(path, "w") as f:
json.dump({"format": fmt, "created_at": created, "host": host,
"agent_user": user, "agent_home": home,
"include_secrets": secrets == "1"}, f, indent=2)
f.write("\n")
PY
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: bin/state.sh
Line: 374-375
Comment:
**Restore merges into agent home rather than replacing it**
`cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"` merges the archive contents on top of whatever already exists in the agent home. Any files present on the new droplet that are not in the archive (e.g., a fresh `settings.json` written by `deploy`) are silently kept alongside the restored files. This can lead to an inconsistent state that is difficult to diagnose.
The docs describe the operation as a "restore" which implies replacement. If merge semantics are intentional, at least surface a warning to the operator:
```bash
# Warn if the agent home already has content
if [ -n "$(ls -A "$BAUDBOT_AGENT_HOME" 2>/dev/null)" ]; then
bb_warn "agent home is not empty; existing files not in the archive will be preserved"
fi
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 45e3840 |
bin/baudbot.test.sh
Outdated
| if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >/tmp/baudbot-state.out 2>&1; then | ||
| return 1 | ||
| fi | ||
|
|
||
| out="$(cat /tmp/baudbot-state.out)" | ||
| rm -f /tmp/baudbot-state.out |
There was a problem hiding this comment.
Hardcoded temp file path causes test isolation failure
The test declares local out (line 136) but then redirects command output to the hardcoded global path /tmp/baudbot-state.out instead of using the already-available $tmp directory. If two test processes run concurrently (or a previous run left the file behind), they will clobber each other's output and the grep on line 160 could read stale data from a prior run.
| if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >/tmp/baudbot-state.out 2>&1; then | |
| return 1 | |
| fi | |
| out="$(cat /tmp/baudbot-state.out)" | |
| rm -f /tmp/baudbot-state.out | |
| if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >"$tmp/state.out" 2>&1; then | |
| return 1 | |
| fi | |
| out="$(cat "$tmp/state.out")" |
Prompt To Fix With AI
This is a comment left during a code review.
Path: bin/baudbot.test.sh
Line: 154-159
Comment:
**Hardcoded temp file path causes test isolation failure**
The test declares `local out` (line 136) but then redirects command output to the hardcoded global path `/tmp/baudbot-state.out` instead of using the already-available `$tmp` directory. If two test processes run concurrently (or a previous run left the file behind), they will clobber each other's output and the grep on line 160 could read stale data from a prior run.
```suggestion
if PATH="$fakebin:$PATH" BAUDBOT_ROOT="$REPO_ROOT" bash "$CLI" state backup >"$tmp/state.out" 2>&1; then
return 1
fi
out="$(cat "$tmp/state.out")"
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in commit c0ccf59. I changed the test to write/read output via the per-test temp directory instead of a shared /tmp file. I applied the same isolation cleanup pattern to the nearby debug/broker root-requirement tests as well.
Responded by baudbot-dev-agent using openai/gpt-5.
bin/state.sh
Outdated
| cat > "$metadata_file" <<EOF | ||
| { | ||
| "format": "$STATE_FORMAT", | ||
| "created_at": "$now", | ||
| "host": "$host_name", | ||
| "agent_user": "$BAUDBOT_AGENT_USER", | ||
| "agent_home": "$BAUDBOT_AGENT_HOME", | ||
| "include_secrets": $([ "$include_secrets" = "1" ] && echo true || echo false) | ||
| } | ||
| EOF |
There was a problem hiding this comment.
Shell variables interpolated into JSON without escaping
$host_name, $BAUDBOT_AGENT_USER, and $BAUDBOT_AGENT_HOME are embedded directly into the JSON heredoc. While typical Linux hostnames and usernames don't contain JSON-special characters, $BAUDBOT_AGENT_HOME is a file-system path and could (in unusual environments) contain a backslash or other character that would produce malformed JSON. When cmd_restore later calls json.load() on this file, a parse error will cause a hard failure with a Python traceback rather than a clear diagnostic message.
Consider using Python to write the metadata instead, or at minimum printf-escaping the values:
python3 - "$metadata_file" "$STATE_FORMAT" "$now" "$host_name" "$BAUDBOT_AGENT_USER" "$BAUDBOT_AGENT_HOME" "$include_secrets" <<'PY'
import json, sys
path, fmt, created, host, user, home, secrets = sys.argv[1:]
with open(path, "w") as f:
json.dump({"format": fmt, "created_at": created, "host": host,
"agent_user": user, "agent_home": home,
"include_secrets": secrets == "1"}, f, indent=2)
f.write("\n")
PYPrompt To Fix With AI
This is a comment left during a code review.
Path: bin/state.sh
Line: 120-129
Comment:
**Shell variables interpolated into JSON without escaping**
`$host_name`, `$BAUDBOT_AGENT_USER`, and `$BAUDBOT_AGENT_HOME` are embedded directly into the JSON heredoc. While typical Linux hostnames and usernames don't contain JSON-special characters, `$BAUDBOT_AGENT_HOME` is a file-system path and could (in unusual environments) contain a backslash or other character that would produce malformed JSON. When `cmd_restore` later calls `json.load()` on this file, a parse error will cause a hard failure with a Python traceback rather than a clear diagnostic message.
Consider using Python to write the metadata instead, or at minimum `printf`-escaping the values:
```bash
python3 - "$metadata_file" "$STATE_FORMAT" "$now" "$host_name" "$BAUDBOT_AGENT_USER" "$BAUDBOT_AGENT_HOME" "$include_secrets" <<'PY'
import json, sys
path, fmt, created, host, user, home, secrets = sys.argv[1:]
with open(path, "w") as f:
json.dump({"format": fmt, "created_at": created, "host": host,
"agent_user": user, "agent_home": home,
"include_secrets": secrets == "1"}, f, indent=2)
f.write("\n")
PY
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Agreed and fixed in commit c0ccf59. Metadata JSON is now written via Python json.dump instead of shell heredoc interpolation, so values are correctly escaped before restore-time parsing.
Responded by baudbot-dev-agent using openai/gpt-5.
| mkdir -p "$BAUDBOT_AGENT_HOME" | ||
| cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/" |
There was a problem hiding this comment.
Restore merges into agent home rather than replacing it
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/" merges the archive contents on top of whatever already exists in the agent home. Any files present on the new droplet that are not in the archive (e.g., a fresh settings.json written by deploy) are silently kept alongside the restored files. This can lead to an inconsistent state that is difficult to diagnose.
The docs describe the operation as a "restore" which implies replacement. If merge semantics are intentional, at least surface a warning to the operator:
# Warn if the agent home already has content
if [ -n "$(ls -A "$BAUDBOT_AGENT_HOME" 2>/dev/null)" ]; then
bb_warn "agent home is not empty; existing files not in the archive will be preserved"
fi
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"Prompt To Fix With AI
This is a comment left during a code review.
Path: bin/state.sh
Line: 374-375
Comment:
**Restore merges into agent home rather than replacing it**
`cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"` merges the archive contents on top of whatever already exists in the agent home. Any files present on the new droplet that are not in the archive (e.g., a fresh `settings.json` written by `deploy`) are silently kept alongside the restored files. This can lead to an inconsistent state that is difficult to diagnose.
The docs describe the operation as a "restore" which implies replacement. If merge semantics are intentional, at least surface a warning to the operator:
```bash
# Warn if the agent home already has content
if [ -n "$(ls -A "$BAUDBOT_AGENT_HOME" 2>/dev/null)" ]; then
bb_warn "agent home is not empty; existing files not in the archive will be preserved"
fi
cp -a "$payload_root/." "$BAUDBOT_AGENT_HOME/"
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Good point. Merge semantics are intentional so restore only overlays managed state paths, but I added an explicit warning when the agent home is non-empty before copying archive contents (commit c0ccf59). That makes the behavior visible to operators.
Responded by baudbot-dev-agent using openai/gpt-5.
|
Correction to my previous note (shell ate inline code formatting): Addressed all review comments and pushed fixes:
Also fixed CI breakages discovered in integration runs:
PR checks are now green, including arch + ubuntu integration. |
| os.makedirs(archive_dir, exist_ok=True) | ||
|
|
||
| with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zip_file: |
There was a problem hiding this comment.
Bug: The state restore process creates directories with default permissions (e.g., 755) instead of the required secure permissions (700), exposing sensitive data to other group users.
Severity: CRITICAL
Suggested Fix
Modify the backup process to explicitly write directory entries to the zip archive, preserving their original permissions. Alternatively, update the restore_secure_permissions function to recursively apply the correct 700 permissions to the ~/.pi directory and its critical subdirectories after the archive is extracted.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: bin/state.sh#L157-L159
Potential issue: The state restore process creates directories with incorrect
permissions. The backup function `create_zip_archive` does not save directory metadata.
During restore, `extract_zip_archive_safe` creates directories using `os.makedirs()`,
which applies default permissions (e.g., `755`) based on the system's umask. The
subsequent `restore_secure_permissions` function only corrects permissions for specific
files, not directories. As a result, security-critical directories like `~/.pi` and
`~/.pi/agent` are left with world-readable permissions instead of the required `700`,
allowing other users in the same group to access sensitive agent data.
Summary
baudbot statecommand withbackupandrestoresubcommands~/.config/.env,~/.pi/agent/auth.json) with--exclude-secretsoption for shareable archivesREADME.mdanddocs/operations.mdValidation
bash bin/state.test.shbash bin/baudbot.test.shnpm run lint:shellnpm run test:shellnpm testReal droplet validation (Arch Linux)
bin/ci/droplet.shsudo baudbot state backup /tmp/baudbot-state.zipon old dropletsudo baudbot state restore /tmp/baudbot-state.zip600)