Skip to content

fix(web): drop stale queued-message ghosts on at-bottom refresh#811

Open
flyinghanger wants to merge 2 commits into
tiann:mainfrom
flyinghanger:fix/queued-message-ghosts
Open

fix(web): drop stale queued-message ghosts on at-bottom refresh#811
flyinghanger wants to merge 2 commits into
tiann:mainfrom
flyinghanger:fix/queued-message-ghosts

Conversation

@flyinghanger
Copy link
Copy Markdown
Contributor

Problem

Two old queued-message cards stayed pinned above the composer forever and could never be dismissed. Investigation showed the hub DB had zero rows with `invoked_at IS NULL` — the cards were pure client-side residue, not real queued work.

Root cause

A queued user message (`invokedAt === null`) is only ever cleared by the live `messages-consumed` SSE (which flips `invokedAt` via `markMessagesConsumed`). That event is one-shot. If the client is offline/closed when the CLI consumes the message, the signal is lost permanently. On reload:

  1. The row is restored from `sessionStorage` still flagged queued (`invokedAt: null`).
  2. The server's now-invoked copy is too old to appear in the latest fetch window, so `mergeMessages` never corrects it.
  3. `trimPreservingQueued` deliberately never evicts queued rows — so it's pinned forever as a phantom card.

Fix

Reconcile the rebuilt window against the authoritative latest page in `fetchLatestMessages` (at-bottom path). An immediate, server-echoed, locally-queued message whose id is absent from the server response is a ghost and is dropped.

False-positive guards (kept even when absent from the response):

  • optimistic rows (`id === localId`) — the server echo may still be in flight; `mergeMessages` owns their reconciliation.
  • scheduled rows (`scheduledAt != null`) — the hub omits not-yet-mature scheduled messages from `getMessages`, so absence is expected.

A genuinely-still-queued immediate message sorts by `createdAt` to the top of the latest page, so it is always present in the fetched window and never dropped.

Tests

  • New integration test reproducing the ghost + its removal on at-bottom refresh.
  • New pure-function test for `reconcileQueuedAgainstLatest` covering all three keep-guards + the drop case.
  • Full web suite: 696 passed; `tsc --noEmit` clean.

A queued user message (invokedAt === null) is only cleared by the live
messages-consumed SSE. If the client is offline/closed when the CLI
consumes it, that one-shot event is lost; on reload the row is restored
from sessionStorage still flagged queued, the server's invoked copy is
too old to appear in the latest window so mergeMessages never corrects
it, and trimPreservingQueued pins it forever as a phantom card above the
composer.

Reconcile the rebuilt window against the authoritative latest page in
fetchLatestMessages: an immediate, server-echoed, locally-queued message
absent from the server response is a ghost and is dropped. Optimistic
rows (echo may be in flight) and scheduled rows (hub omits not-yet-mature
messages) are kept.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Stale latest-page responses can drop real queued rows. The reconciliation runs against prev.messages at apply time but uses response.messages from the older HTTP snapshot, so a message-received SSE that arrives while fetchLatestMessages is in flight can add a server-echoed queued row (id !== localId) that is absent from that snapshot and gets filtered as a ghost.

Summary

  • Review mode: initial
  • One correctness issue found in the queued-message reconciliation path. Residual risk: this race is timing-dependent, so it needs a generation/snapshot test around concurrent fetchLatestMessages + SSE ingestion.

Testing

  • Not run (automation)

HAPI Bot

Comment thread web/src/lib/message-window-store.ts Outdated
// trimVisible preserves every queued row, so a ghost (queued locally
// but already invoked server-side, missed messages-consumed while
// offline) would otherwise be pinned forever.
const reconciled = reconcileQueuedAgainstLatest(merged, response.messages)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Major] reconcileQueuedAgainstLatest compares the current window to the response from the request's start. fetchLatestMessages captures initial before awaiting api.getMessages, but the updater later reads the latest prev.messages; if a message-received SSE lands in between, ingestIncomingMessages can replace the optimistic row with a server-echoed queued row (id !== localId). That row is absent from the older response.messages, so this line filters it out as a ghost and the queued bar loses real work.

Suggested fix:

const queuedIdsAtRequestStart = new Set(
    [...initial.messages, ...initial.pending]
        .filter((msg) => isQueuedForInvocation(msg) && msg.scheduledAt == null && !isOptimisticMessage(msg))
        .map((msg) => msg.id)
)

const reconciled = reconcileQueuedAgainstLatest(merged, response.messages, queuedIdsAtRequestStart)

and keep newly-arrived queued rows in the helper:

if (!queuedIdsAtRequestStart.has(msg.id)) return true

Review (PR tiann#811): reconcileQueuedAgainstLatest ran against the latest
page snapshot captured at the fetch request's start, but the reconcile
itself executes in the updater against the live window. A message-received
SSE landing while fetchLatestMessages was in flight could insert a real
server-echoed queued row (id != localId) absent from that older snapshot,
which was then filtered out as a ghost — dropping genuine queued work.

Capture the immediate, server-echoed queued ids at request start and only
reconcile those; rows that arrived after the fetch was issued are exempt.
@flyinghanger
Copy link
Copy Markdown
Contributor Author

Thanks — confirmed and fixed in 8ca2c11.

The race was real: reconcileQueuedAgainstLatest executed inside the updater against the live window, but response.messages is the snapshot from the request's start. A message-received SSE landing mid-fetch could insert a real server-echoed queued row (id !== localId) that the snapshot never saw, and it got filtered as a ghost.

Fix follows the suggested approach: capture the immediate, server-echoed queued ids at request start (queuedReconcileCandidateIds(initial.messages, initial.pending)) and gate the helper on them — only rows already queued when the fetch was issued are reconcile candidates; anything that arrives mid-fetch is exempt.

Added a regression test (keeps a queued row that arrives via SSE while the latest fetch is in flight) that resolves the HTTP fetch via a deferred and ingests a fresh queued row in the gap, asserting the ghost is dropped while the fresh row survives. Full web suite green (665).

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings
No findings.

Summary
Review mode: follow-up after new commits
No issues found in the latest diff. Residual risk: web store behavior remains timing-sensitive around concurrent HTTP refresh and SSE updates, but the added regression coverage targets the prior bot finding.

Testing
Not run (automation: bun is not installed in this runner PATH)

HAPI Bot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant