Skip to content

Skip and log unreadable pane trees during tab restore#9979

Draft
amriksingh0786 wants to merge 1 commit intowarpdotdev:masterfrom
amriksingh0786:fix/issue-9109-tab-restoration
Draft

Skip and log unreadable pane trees during tab restore#9979
amriksingh0786 wants to merge 1 commit intowarpdotdev:masterfrom
amriksingh0786:fix/issue-9109-tab-restoration

Conversation

@amriksingh0786
Copy link
Copy Markdown
Contributor

Description

read_sqlite_data was using filter_map(read_root_node(...).ok()?) to assemble the saved tabs of a window, which silently dropped every tab whose pane tree failed to deserialize. After the 2026-04-14-150000_add_code_pane_tabs and 2026-04-17-020439_add_custom_vertical_tabs_title_to_pane_leaves migrations, a single bad leaf (typically an orphan code_panes row) would short-circuit the iterator and cause the user to lose every tab in that window — not just the broken one.

This PR replaces the silent drop with skip-and-log: a log::warn! keyed by the offending tab id, and the iterator continues with the remaining tabs.

let root = match read_root_node(conn, tab_id_for_log) {
    Ok(root) => root,
    Err(err) => {
        log::warn!(
            "Skipping tab id={tab_id_for_log} during restore: failed to read pane tree: {err}"
        );
        return None;
    }
};

No schema change. No new migration. Read-side fix only — safe to roll back via revert.

The fix prevents future tab loss but cannot recover sessions already overwritten by save_app_state on the broken stable_02 build; that data is gone before this code runs.

Fixes #9109.

Testing

Added 3 behavioural tests in app/src/persistence/sqlite_tests.rs:

  • restore_skips_bad_pane_tree_keeps_others — DB with [good, bad, good] tabs returns the two good tabs and emits the warn for the bad one.
  • restore_logs_error_on_bad_blob — the warn-and-skip path runs without panicking.
  • restore_with_all_bad_blobs_returns_empty_not_panic — graceful degradation when every tab is bad: empty list, no panic.

Plus three small fixture helpers (terminal_tab_snapshot, code_tab_snapshot, window_with_tabs) and orphan_all_code_panes to deliberately break pane trees the same way the April migrations did.

./script/presubmit is clean: cargo fmt, cargo clippy --workspace --all-targets --tests -- -D warnings, clang-format, wgslfmt, and the targeted cargo nextest run -p warp 'persistence::sqlite' all green. The 5 shell_integration_tests::*ssh* failures observed locally are unrelated to this change — they require a Docker SSH container — and are skipped on fork PRs per #9304.

Manually verified on macOS: dev build (cargo run) restored an existing 3-tab window cleanly, with no Skipping tab id= warnings (correct behaviour on a healthy DB — that branch only fires on bad blobs).

Agent Mode

  • Warp Agent Mode — This PR was created via Warp's AI Agent Mode

CHANGELOG-BUG-FIX: Tabs no longer disappear on restart when one pane fails to read from the persistence DB; the bad tab is skipped and logged instead.

`read_sqlite_data` used `filter_map(read_root_node(...).ok()?)` so a
single tab whose pane tree failed to deserialize silently dropped every
remaining tab in the same window. The 2026-04-14 and 2026-04-17
migrations made one bad leaf cascade into "all tabs lost" on the broken
stable_02 build.

Now we drop only the offending tab and log a `warn!` keyed by its row
id, so the rest of the window's tabs still restore. No schema or
migration change.

Adds three behavioural tests covering: a bad blob between two good ones
keeps both good tabs; the warn path runs without panicking; and a
window where every blob is bad returns an empty tab list rather than
panicking.

Fixes warpdotdev#9109
@cla-bot cla-bot Bot added the cla-signed label May 3, 2026
@github-actions github-actions Bot added the external-contributor Indicates that a PR has been opened by someone outside the Warp team. label May 3, 2026
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@amriksingh0786

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR rewrites the per-tab read_root_node failure path during SQLite restore to log a warning before skipping that tab, and adds corruption fixtures around missing code_panes rows.

Concerns

  • The implementation does not change the skip behavior claimed by the PR: the old .ok()? inside filter_map already returned None for the current tab and continued to later tabs. The new tests assert that existing behavior, so they do not demonstrate or fix an all-tabs-loss regression.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

log::warn!(
"Skipping tab id={tab_id_for_log} during restore: failed to read pane tree: {err}"
);
return None;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This preserves the previous behavior: .ok()? inside filter_map also returned None for just this tab and continued iteration, so the added behavioral tests would pass before this change. Please add a regression that fails on the old code or fix the actual path that causes all tabs to be lost.

@amriksingh0786
Copy link
Copy Markdown
Contributor Author

@captainsafia @seemeroland, flagging since you both helped land #9503. Would value your call here.

@oz-for-oss you're right. I missed that .ok()? inside filter_map already returns None for just the failing item and continues iteration. The behavioral tests in this PR pass on the old code too, so they don't demonstrate the regression. Apologies for the misanalysis.

Looking at the wider failure surface, the actual all-tabs-loss path is at app/src/persistence/sqlite.rs:164-174: any Err from read_sqlite_data (including from the early ? propagations on windows.load, Tab::belonging_to(...).load, or panels.load at lines 2659-2671) bubbles up to the caller, which sets app_state = None and wipes every window. The April migrations don't directly touch windows/tabs/panels, but a partial-migration or downgrade scenario could leave Diesel-expected columns missing on those tables. The original reporter is on stable_02 specifically; a maintainer on the issue thread also couldn't reproduce, which is consistent with a DB-state-dependent failure.

Without a reproducer DB I'd be guessing at the exact failing query. Two options:

  1. Close this PR as misdiagnosed; the issue stays open for someone with a repro.
  2. Narrow this PR to keep the warn-and-skip as a defensive observability improvement only. Iterator behavior is unchanged, but a future failure leaves a log line keyed by the bad tab id instead of a silent drop. Description, commit message, and changelog get rewritten to be honest about scope: this does not fix Breaking regression: tabs cannot be restored after closing in v0.2026.04.15.08.45.stable_02 #9109; it just makes the next report easier to triage. Issue stays open.

Leaning toward (1) since (2) feels like salvaging a misframed PR. Happy to do either. What would you prefer?

@amriksingh0786 amriksingh0786 marked this pull request as draft May 3, 2026 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed external-contributor Indicates that a PR has been opened by someone outside the Warp team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Breaking regression: tabs cannot be restored after closing in v0.2026.04.15.08.45.stable_02

1 participant