
feat(runtime): cell-level cached execution lifecycle #9895

Draft
dmadisetti wants to merge 1 commit into dm/lazy-store-merge from dm/cached-cells


@dmadisetti dmadisetti commented Jun 15, 2026

Member

Summary

This PR introduces CachedLifecycle, which can be turned on via [marimo.tools.runtime.cache_cells].

When the lifecycle is active, it attempts to cache all executed code. Stub hydration occurs when a cell must be executed. When stub hydration fails (e.g. on an Unhashable stub), this change allows the cell to be rescheduled, forcing a rerun.

Followups include:

  • Intelligent caching (i.e. do not cache cases where the caching process would become slow)
  • Surfacing caching information to the user
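
To make the hit/miss flow in the summary concrete, here is a minimal sketch. It is not the PR's implementation: `ToyCachedLifecycle`, `Skip`, and `run_cell` are hypothetical names, the cache is an in-memory dict, and the cache key is simply the cell's source text (the real lifecycle presumably hashes more than that).

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Skip:
    """Hypothetical marker telling the evaluator to skip the cell body."""

    result: Any


@dataclass
class ToyCachedLifecycle:
    """Illustrative only: caches a cell's defs keyed by its source text."""

    store: dict = field(default_factory=dict)

    def setup(self, code: str, glbls: dict) -> Optional[Skip]:
        cached = self.store.get(code)
        if cached is None:
            return None  # miss: the caller should run the body
        glbls.update(cached)  # hit: restore defs without running the body
        return Skip(result=cached)

    def teardown(self, code: str, glbls: dict, defs: list) -> None:
        # Persist only the names this cell defines.
        self.store[code] = {name: glbls[name] for name in defs}


def run_cell(lifecycle: ToyCachedLifecycle, code: str, defs: list, glbls: dict) -> bool:
    """Return True when the body actually executed, False on a cache hit."""
    if lifecycle.setup(code, glbls) is not None:
        return False
    exec(code, glbls)
    lifecycle.teardown(code, glbls, defs)
    return True
```

Running the same cell source twice against fresh globals executes the body once and restores `x` from the toy cache the second time.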

@vercel

vercel Bot commented Jun 15, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| marimo-docs | Ready | Preview, Comment | Jun 24, 2026 11:22pm |


@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor


No issues found across 16 files

Architecture diagram
sequenceDiagram
    participant Frontend as Frontend (Browser)
    participant Kernel as Kernel Runtime
    participant RR as Runner/RunQueue
    participant Sched as SequentialScheduler
    participant Eval as Evaluator
    participant CL as CachedLifecycle
    participant Store as LazyStore (Cache)
    participant SIG as SIGINT Handler

    Note over Frontend,SIG: Cell-level Cached Execution Lifecycle

    Frontend->>Kernel: Run cell(s) request
    Kernel->>RR: run_all()
    RR->>RR: Build Runner with CachedLifecycle (if cell_caching enabled)
    RR->>Sched: async with scheduler (publishes to context)
    Note over Sched,Kernel: Scheduler published as KernelRuntimeContext._active_scheduler
    alt Cache MISS (first run or code changed)
        RR->>Sched: pop_cell()
        Sched-->>RR: cell_id
        RR->>Eval: evaluate cell
        Eval->>CL: setup(cell, glbls)
        CL->>CL: hash cell + lookup cache
        CL->>Store: Check cache key
        Store-->>CL: MISS
        Note over CL: Pre-flight check: any ref in glbls is UnhashableStub?
        alt Stub found in refs
            CL->>CL: Invalidate producer manifest
            CL->>CL: Clear own attempt record
            CL->>RR: raise MarimoCancelCellError(producers ∪ self)
            RR->>Sched: requeue_for_rerun(producers ∪ cell)
            Sched->>Sched: Topological sort, prepend to queue
            Note over Sched: Producers run first next cycle
            RR->>Eval: Skip cell execution this turn
        else No stubs
            CL-->>Eval: return None (run body)
            Eval->>Eval: Execute cell body
            Eval->>CL: teardown(cell, glbls, run_result)
            CL->>CL: On success: save result to cache
            CL->>Store: save_cache(attempt)
            Store-->>CL: Saved
            CL-->>Eval: Complete
            Eval-->>RR: RunResult
        end
    else Cache HIT (same code, no UI elements restored)
        RR->>Sched: pop_cell()
        Sched-->>RR: cell_id
        RR->>Eval: evaluate cell
        Eval->>CL: setup(cell, glbls)
        CL->>Store: Check cache key
        Store-->>CL: HIT
        CL->>CL: restore(glbls) from cache
        alt Restored UI Element detected
            Note over CL: UI elements need fresh session IDs
            CL->>CL: Discard attempt, fall through to body execution
            CL-->>Eval: return None (run body)
            Eval->>Eval: Execute cell body
            Eval->>CL: teardown (backfill cache)
        else No UI elements
            CL-->>Eval: return Skip(cached_result)
            Note over Eval: Body short-circuited
            Eval-->>RR: RunResult (from cache)
        end
    end

    Note over Frontend,SIG: Updated SIGINT Routing

    alt Async cell in flight (scheduler active tasks)
        SIG->>SIG: Signal received
        SIG->>SIG: Check active_scheduler via safe_get_context()
        SIG->>Sched: has_active_tasks() = true
        SIG->>Sched: cancel_all()
        Sched->>Sched: Set interrupted flag
        Sched->>Sched: call_soon_threadsafe(task.cancel) for each task
        Note over SIG: Return without raising (avoids asyncio internals)
    else Sync cell or between cells (scheduler active, no async tasks)
        SIG->>SIG: Signal received
        SIG->>SIG: Check active_scheduler via safe_get_context()
        SIG->>Sched: has_active_tasks() = false
        SIG->>Sched: cancel_all()
        Sched->>Sched: Set interrupted flag
        SIG->>Kernel: raise MarimoInterrupt (KeyboardInterrupt)
    else No scheduler, no execution context
        SIG->>SIG: Return silently
    end
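
The SIGINT routing at the end of the diagram can be sketched as a three-way decision. This is a simplified model under stated assumptions: `ToyScheduler`, `route_sigint`, and `MarimoInterruptSketch` are hypothetical stand-ins, the active scheduler is passed in directly instead of being looked up through the runtime context, and task cancellation is reduced to setting a flag.

```python
from typing import Optional


class MarimoInterruptSketch(KeyboardInterrupt):
    """Stand-in for MarimoInterrupt, which the diagram describes as a KeyboardInterrupt."""


class ToyScheduler:
    """Hypothetical scheduler exposing only what the routing needs."""

    def __init__(self, active_tasks: int = 0) -> None:
        self.active_tasks = active_tasks
        self.interrupted = False

    def has_active_tasks(self) -> bool:
        return self.active_tasks > 0

    def cancel_all(self) -> None:
        # The real scheduler would also cancel each in-flight task on the event loop.
        self.interrupted = True


def route_sigint(scheduler: Optional[ToyScheduler]) -> str:
    """Decide what a SIGINT does, following the three branches in the diagram."""
    if scheduler is None:
        return "ignored"  # no scheduler, no execution context: return silently
    had_async = scheduler.has_active_tasks()
    scheduler.cancel_all()  # the interrupted flag is set in both remaining branches
    if had_async:
        # Async cell in flight: return without raising inside asyncio internals.
        return "cancelled-tasks"
    # Sync cell or between cells: surface the interrupt to the kernel.
    raise MarimoInterruptSketch()
```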

Copilot AI left a comment

Contributor


Pull request overview

Adds a new CachedLifecycle to enable cell-level cached execution within the runtime’s lifecycle framework, including a soft-cancel/requeue mechanism for stale “unhashable” placeholders and loader support for manifest-only markers when serialization fails.

Changes:

  • Introduces CachedLifecycle to restore cached defs/return values on hit and persist cache on successful miss.
  • Updates the lazy cache loader/schema to mark unserializable defs in the manifest (unserializable_type) and reconstruct UnhashableStub on load without writing blobs.
  • Extends the runner/scheduler to support soft-cancel requeue (MarimoCancelCellError → requeue_for_rerun) and adds targeted test coverage.
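
As a rough illustration of the soft-cancel requeue in the last item, the sketch below prepends a set of cells to a run queue in topological order so that producers run before their consumers. The function signature and the `parents` mapping are assumptions made for the example, not the scheduler's actual API.

```python
from collections import deque
from graphlib import TopologicalSorter


def requeue_for_rerun(queue: deque, cells: set, parents: dict) -> None:
    """Prepend `cells` to `queue`, producers first (hypothetical signature).

    `parents` maps a cell id to the ids of the cells it depends on.
    """
    # Restrict the dependency graph to the cells being requeued.
    sub = {c: {p for p in parents.get(c, ()) if p in cells} for c in cells}
    # static_order() yields each node after all of its predecessors.
    order = list(TopologicalSorter(sub).static_order())
    # Drop stale queue entries for requeued cells so nothing runs twice.
    remaining = [c for c in queue if c not in cells]
    queue.clear()
    queue.extend(order + remaining)
```

For example, requeueing a consumer `c` together with its producer `a` places `a` ahead of `c`, followed by whatever else was already waiting in the queue.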

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/_save/stubs/test_unhashable_stub.py | Adds unit tests for UnhashableStub type name override + from_item reconstruction. |
| tests/_save/loaders/test_loader.py | Adds integration test ensuring unserializable defs write no blob and reload as UnhashableStub. |
| tests/_runtime/test_cached_stage.py | Adds runtime integration tests for cached lifecycle hit/miss and stub-driven requeue behavior. |
| marimo/_save/stubs/lazy_stub.py | Extends Item schema with unserializable_type; updates UnhashableStub to accept explicit type_name. |
| marimo/_save/loaders/lazy.py | Implements manifest-only marking on serialization failure; reconstructs UnhashableStub from manifest markers. |
| marimo/_runtime/runner/scheduler.py | Adds requeue_for_rerun to prepend cells in topological order for soft-cancel retries. |
| marimo/_runtime/runner/hook_context.py | Adds CancelledCells.discard() to “un-cancel” cells when requeued. |
| marimo/_runtime/runner/cell_runner.py | Wires CachedLifecycle via config and propagates/handles MarimoCancelCellError to requeue cells. |
| marimo/_runtime/executor/lifecycles/cached.py | New lifecycle implementing cache hit skip, miss persistence, and stub-ref preflight invalidation. |
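
The manifest-only marking described for marimo/_save/loaders/lazy.py can be sketched with plain `pickle`. This is a toy model under assumptions: the manifest layout, `save_defs`, `load_defs`, and `ToyUnhashableStub` are invented for illustration; only the `unserializable_type` field name comes from the review summary.

```python
import pickle
from typing import Any


class ToyUnhashableStub:
    """Hypothetical placeholder standing in for a def that could not be serialized."""

    def __init__(self, type_name: str) -> None:
        self.type_name = type_name


def save_defs(defs: dict) -> tuple:
    """Serialize what we can; mark the rest in the manifest without writing a blob."""
    manifest: dict = {}
    blobs: dict = {}
    for name, value in defs.items():
        try:
            blobs[name] = pickle.dumps(value)
            manifest[name] = {"kind": "blob"}
        except Exception:
            # Mark, don't write: record only the type name of the failing value.
            manifest[name] = {"unserializable_type": type(value).__name__}
    return manifest, blobs


def load_defs(manifest: dict, blobs: dict) -> dict:
    """Rebuild defs, turning manifest markers back into stubs."""
    out: dict = {}
    for name, item in manifest.items():
        if "unserializable_type" in item:
            out[name] = ToyUnhashableStub(item["unserializable_type"])
        else:
            out[name] = pickle.loads(blobs[name])
    return out
```

A lambda is a convenient test value here because `pickle` refuses to serialize it, so it round-trips as a stub while ordinary values round-trip unchanged.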

Comment thread marimo/_runtime/runner/scheduler.py Outdated
Comment thread marimo/_runtime/runner/cell_runner.py
Comment thread marimo/_save/loaders/lazy.py Outdated
Add CachedLifecycle: an executor lifecycle that skips a cell's body on a
cache hit and backfills its defs on a miss. Detection of upstream
unserializable defs is done at use-site via a duck-typed
`__marimo_unhashable__` marker check (no import coupling to the
serialization toolkit), and a pre-flight ref scan requeues the producing
cells via a soft MarimoCancelCellError + Scheduler.requeue_for_rerun
rather than hard-failing.

Stacked on the LazyStore dual-mode backend (#9898): relies on that PR's
mark-don't-write mechanism so a cell's own unserializable def restores as
an UnhashableStub tripwire instead of raising a PicklingError.
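
A duck-typed marker check of the kind the commit message describes might look like the sketch below. The attribute name `__marimo_unhashable__` is taken from the message; how the marker is stored (here, a class attribute set to `True`) and the helper names are assumptions.

```python
from typing import Any, Iterable

MARKER = "__marimo_unhashable__"


def is_unhashable_stub(value: Any) -> bool:
    """Detect a stub by its marker attribute, without importing the stub class."""
    # `is True` guards against objects whose __getattr__ returns truthy values
    # for arbitrary attribute names.
    return getattr(value, MARKER, False) is True


def stale_refs(refs: Iterable, glbls: dict) -> list:
    """Pre-flight scan: names among a cell's refs that currently hold a stub."""
    return [name for name in refs if name in glbls and is_unhashable_stub(glbls[name])]


class FakeStub:
    """Hypothetical stub used only to exercise the check."""

    __marimo_unhashable__ = True
```

Because the check relies on an attribute and not on `isinstance`, the calling code needs no import from the serialization toolkit, which matches the "no import coupling" goal stated above.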

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment on lines +84 to +91
    self._graph = graph
    self._pin_modules = pin_modules
    # The persistent loaders are all BasePersistenceLoader subclasses
    # (which carry `.store`); the registry is typed as the `Loader` base.
    self._loader = cast(
        BasePersistenceLoader,
        resolve_loader(PERSISTENT_LOADERS[loader])(name="lazy"),
    )
Comment on lines +112 to +119
    if attempt.hit:
        try:
            attempt.restore(glbls)
        except Exception as e:
            LOGGER.warning("Cache restore failed for %s: %s", cell_id, e)
            self._attempts.pop(cell_id, None)
            # Fall through to miss-path execution.
        else:

Labels

enhancement New feature or request
