feat(runtime): cell-level cached execution lifecycle #9895
Draft
dmadisetti wants to merge 1 commit into
Contributor
No issues found across 16 files
Architecture diagram
sequenceDiagram
participant Frontend as Frontend (Browser)
participant Kernel as Kernel Runtime
participant RR as Runner/RunQueue
participant Sched as SequentialScheduler
participant Eval as Evaluator
participant CL as CachedLifecycle
participant Store as LazyStore (Cache)
participant SIG as SIGINT Handler
Note over Frontend,SIG: Cell-level Cached Execution Lifecycle
Frontend->>Kernel: Run cell(s) request
Kernel->>RR: run_all()
RR->>RR: Build Runner with CachedLifecycle (if cell_caching enabled)
RR->>Sched: async with scheduler (publishes to context)
Note over Sched,Kernel: Scheduler published as KernelRuntimeContext._active_scheduler
alt Cache MISS (first run or code changed)
RR->>Sched: pop_cell()
Sched-->>RR: cell_id
RR->>Eval: evaluate cell
Eval->>CL: setup(cell, glbls)
CL->>CL: hash cell + lookup cache
CL->>Store: Check cache key
Store-->>CL: MISS
Note over CL: Pre-flight check: any ref in glbls is UnhashableStub?
alt Stub found in refs
CL->>CL: Invalidate producer manifest
CL->>CL: Clear own attempt record
CL->>RR: raise MarimoCancelCellError(producers ∪ self)
RR->>Sched: requeue_for_rerun(producers ∪ cell)
Sched->>Sched: Topological sort, prepend to queue
Note over Sched: Producers run first next cycle
RR->>Eval: Skip cell execution this turn
else No stubs
CL-->>Eval: return None (run body)
Eval->>Eval: Execute cell body
Eval->>CL: teardown(cell, glbls, run_result)
CL->>CL: On success: save result to cache
CL->>Store: save_cache(attempt)
Store-->>CL: Saved
CL-->>Eval: Complete
Eval-->>RR: RunResult
end
else Cache HIT (same code)
RR->>Sched: pop_cell()
Sched-->>RR: cell_id
RR->>Eval: evaluate cell
Eval->>CL: setup(cell, glbls)
CL->>Store: Check cache key
Store-->>CL: HIT
CL->>CL: restore(glbls) from cache
alt Restored UI Element detected
Note over CL: UI elements need fresh session IDs
CL->>CL: Discard attempt, fall through to body execution
CL-->>Eval: return None (run body)
Eval->>Eval: Execute cell body
Eval->>CL: teardown (backfill cache)
else No UI elements
CL-->>Eval: return Skip(cached_result)
Note over Eval: Body short-circuited
Eval-->>RR: RunResult (from cache)
end
end
Note over Frontend,SIG: Updated SIGINT Routing
alt Async cell in flight (scheduler active tasks)
SIG->>SIG: Signal received
SIG->>SIG: Check active_scheduler via safe_get_context()
SIG->>Sched: has_active_tasks() = true
SIG->>Sched: cancel_all()
Sched->>Sched: Set interrupted flag
Sched->>Sched: call_soon_threadsafe(task.cancel) for each task
Note over SIG: Return without raising (avoids asyncio internals)
else Sync cell or between cells (scheduler active, no async tasks)
SIG->>SIG: Signal received
SIG->>SIG: Check active_scheduler via safe_get_context()
SIG->>Sched: has_active_tasks() = false
SIG->>Sched: cancel_all()
Sched->>Sched: Set interrupted flag
SIG->>Kernel: raise MarimoInterrupt (KeyboardInterrupt)
else No scheduler, no execution context
SIG->>SIG: Return silently
end
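The three SIGINT branches in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the runtime's handler: `FakeScheduler` and `route_sigint` are hypothetical names, the local `MarimoInterrupt` class is a stand-in, and only `has_active_tasks()` and `cancel_all()` are taken from the diagram.

```python
from dataclasses import dataclass
from typing import Optional


class MarimoInterrupt(KeyboardInterrupt):
    """Local stand-in for the interrupt raised into a running sync cell."""


@dataclass
class FakeScheduler:
    """Hypothetical scheduler exposing only the two calls the diagram names."""

    active_tasks: int = 0
    interrupted: bool = False

    def has_active_tasks(self) -> bool:
        return self.active_tasks > 0

    def cancel_all(self) -> None:
        # A real scheduler would also hand task.cancel to the event loop
        # (call_soon_threadsafe in the diagram); here we only set the flag.
        self.interrupted = True


def route_sigint(scheduler: Optional[FakeScheduler]) -> str:
    """Decide what a SIGINT should do, following the diagram's three cases."""
    if scheduler is None:
        # No scheduler and no execution context: return silently.
        return "ignored"
    has_async = scheduler.has_active_tasks()
    scheduler.cancel_all()
    if has_async:
        # Async cell in flight: cancel tasks and return without raising.
        return "cancelled-tasks"
    # Sync cell or between cells: interrupt the running code directly.
    raise MarimoInterrupt()
```

Returning instead of raising in the async case matches the diagram's note about avoiding asyncio internals: the cancellation is delivered through the tasks rather than through an exception thrown from the signal handler.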
b0930ff to 5209b61
f0ff2e3 to 5cab29a
Contributor
Pull request overview
Adds a new CachedLifecycle to enable cell-level cached execution within the runtime’s lifecycle framework, including a soft-cancel/requeue mechanism for stale “unhashable” placeholders and loader support for manifest-only markers when serialization fails.
Changes:
- Introduces `CachedLifecycle` to restore cached defs/return values on hit and persist cache on successful miss.
- Updates the lazy cache loader/schema to mark unserializable defs in the manifest (`unserializable_type`) and reconstruct `UnhashableStub` on load without writing blobs.
- Extends the runner/scheduler to support soft-cancel requeue (`MarimoCancelCellError` → `requeue_for_rerun`) and adds targeted test coverage.
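The hit/miss behaviour in the first bullet can be illustrated with a toy lifecycle. This is a sketch under stated assumptions: `ToyCachedLifecycle`, `Skip`, and `run_cell` are hypothetical names, the cache is an in-memory dict, and the key is a hash of the source text only, whereas the real lifecycle hashes the cell through marimo's own machinery and persists through a loader.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Skip:
    """Hypothetical marker telling the evaluator not to run the body."""

    defs: dict[str, Any]


class ToyCachedLifecycle:
    """In-memory stand-in: restore defs on a hit, persist them on a miss."""

    def __init__(self) -> None:
        self._store: dict[str, dict[str, Any]] = {}

    @staticmethod
    def _key(code: str) -> str:
        # Assumption: key on source text alone; real hashing covers more.
        return hashlib.sha256(code.encode()).hexdigest()

    def setup(self, code: str, glbls: dict[str, Any]) -> Optional[Skip]:
        cached = self._store.get(self._key(code))
        if cached is None:
            return None  # miss: the caller runs the body
        glbls.update(cached)  # hit: restore the cached defs
        return Skip(defs=cached)

    def teardown(
        self, code: str, defs: set[str], glbls: dict[str, Any], ok: bool
    ) -> None:
        if ok:  # only a successful run is saved
            self._store[self._key(code)] = {n: glbls[n] for n in defs}


def run_cell(
    lc: ToyCachedLifecycle, code: str, defs: set[str], glbls: dict[str, Any]
) -> bool:
    """Return True when the body actually executed, False on a cache hit."""
    if lc.setup(code, glbls) is not None:
        return False
    exec(code, glbls)
    lc.teardown(code, defs, glbls, ok=True)
    return True
```

Running the same source twice executes the body once; the second call restores `defs` into the fresh globals without calling `exec`.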
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/_save/stubs/test_unhashable_stub.py | Adds unit tests for UnhashableStub type name override + from_item reconstruction. |
| tests/_save/loaders/test_loader.py | Adds integration test ensuring unserializable defs write no blob and reload as UnhashableStub. |
| tests/_runtime/test_cached_stage.py | Adds runtime integration tests for cached lifecycle hit/miss and stub-driven requeue behavior. |
| marimo/_save/stubs/lazy_stub.py | Extends Item schema with unserializable_type; updates UnhashableStub to accept explicit type_name. |
| marimo/_save/loaders/lazy.py | Implements manifest-only marking on serialization failure; reconstructs UnhashableStub from manifest markers. |
| marimo/_runtime/runner/scheduler.py | Adds requeue_for_rerun to prepend cells in topological order for soft-cancel retries. |
| marimo/_runtime/runner/hook_context.py | Adds CancelledCells.discard() to “un-cancel” cells when requeued. |
| marimo/_runtime/runner/cell_runner.py | Wires CachedLifecycle via config and propagates/handles MarimoCancelCellError to requeue cells. |
| marimo/_runtime/executor/lifecycles/cached.py | New lifecycle implementing cache hit skip, miss persistence, and stub-ref preflight invalidation. |
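The "prepend cells in topological order" behaviour described for `scheduler.py` can be sketched with the standard library. This is not the PR's implementation: the queue type, the `parents` mapping shape, and the de-duplication of cells already in the queue are all assumptions made for illustration.

```python
from collections import deque
from graphlib import TopologicalSorter


def requeue_for_rerun(queue: deque, cells: set, parents: dict) -> None:
    """Prepend `cells` to `queue` so producers run before their consumers.

    `parents` maps a cell id to the set of cell ids it reads from
    (hypothetical shape; the real scheduler consults the dataflow graph).
    """
    # Restrict the dependency graph to the cells being requeued.
    sub = {c: set(parents.get(c, ())) & cells for c in cells}
    # static_order() yields predecessors first, i.e. producers first.
    order = list(TopologicalSorter(sub).static_order())
    # Assumption: drop any existing queue entries for these cells so
    # each cell appears only once after the requeue.
    remaining = [c for c in queue if c not in cells]
    queue.clear()
    queue.extend(order + remaining)
```

For example, if cell `c` reads from cell `a` and both are requeued while `d` is still pending, the queue becomes `a`, `c`, `d`.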
da54d77 to c872d31
aa9cebc to 1e5abde
6cb26c5 to 45fbfde
Add CachedLifecycle: an executor lifecycle that skips a cell's body on a cache hit and backfills its defs on a miss. Detection of upstream unserializable defs is done at use-site via a duck-typed `__marimo_unhashable__` marker check (no import coupling to the serialization toolkit), and a pre-flight ref scan requeues the producing cells via a soft MarimoCancelCellError + Scheduler.requeue_for_rerun rather than hard-failing. Stacked on the LazyStore dual-mode backend (#9898): relies on that PR's mark-don't-write mechanism so a cell's own unserializable def restores as an UnhashableStub tripwire instead of raising a PicklingError.
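The duck-typed marker check and pre-flight ref scan from the commit message might look roughly like the following. Only the `__marimo_unhashable__` attribute name comes from the commit message; `FakeUnhashableStub`, `is_unhashable_marker`, and `stale_refs` are hypothetical names, and the real lifecycle goes on to invalidate the producers and raise a soft cancel rather than returning a list.

```python
from typing import Any, Iterable


class FakeUnhashableStub:
    """Hypothetical stand-in for a placeholder restored from the cache."""

    __marimo_unhashable__ = True

    def __init__(self, type_name: str) -> None:
        self.type_name = type_name


def is_unhashable_marker(value: Any) -> bool:
    # Duck-typed check: no import of the stub class is needed, so the
    # runtime stays decoupled from the serialization toolkit.
    return getattr(value, "__marimo_unhashable__", False) is True


def stale_refs(refs: Iterable[str], glbls: dict[str, Any]) -> list[str]:
    """Names among `refs` whose current global value is a stub marker."""
    return sorted(
        name
        for name in refs
        if name in glbls and is_unhashable_marker(glbls[name])
    )
```

A non-empty result is the condition under which the consuming cell would be soft-cancelled and its producers requeued.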
1e5abde to 325afe6
Comment on lines +84 to +91

    self._graph = graph
    self._pin_modules = pin_modules
    # The persistent loaders are all BasePersistenceLoader subclasses
    # (which carry `.store`); the registry is typed as the `Loader` base.
    self._loader = cast(
        BasePersistenceLoader,
        resolve_loader(PERSISTENT_LOADERS[loader])(name="lazy"),
    )
Comment on lines +112 to +119

    if attempt.hit:
        try:
            attempt.restore(glbls)
        except Exception as e:
            LOGGER.warning("Cache restore failed for %s: %s", cell_id, e)
            self._attempts.pop(cell_id, None)
            # Fall through to miss-path execution.
        else:
Summary
This PR introduces the `CachedLifecycle`, which can be turned on via `[marimo.tools.runtime.cache_cells]`. When the lifecycle is active, marimo attempts to cache every cell it executes. Stub hydration occurs when a cell must be executed. When stub hydration fails (e.g. on an `UnhashableStub`), this change allows the cell to be rescheduled, forcing a rerun.
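How a def ends up as an unhashable stub in the first place can be sketched as a mark-don't-write save path. This is an illustration only: `save_defs`, `load_defs`, `StubPlaceholder`, and the manifest layout are hypothetical, with just the `unserializable_type` field name borrowed from the review overview, and `pickle` standing in for whatever serializer the loader uses.

```python
import pickle
from typing import Any


class StubPlaceholder:
    """Hypothetical stand-in for the stub rebuilt from a manifest marker."""

    def __init__(self, type_name: str) -> None:
        self.type_name = type_name


def save_defs(
    defs: dict[str, Any],
) -> tuple[dict[str, dict[str, str]], dict[str, bytes]]:
    """Serialize what we can; record only a type marker for the rest."""
    manifest: dict[str, dict[str, str]] = {}
    blobs: dict[str, bytes] = {}
    for name, value in defs.items():
        try:
            blobs[name] = pickle.dumps(value)
            manifest[name] = {"blob": name}
        except Exception:
            # Mark, don't write: no blob is stored for this def.
            manifest[name] = {"unserializable_type": type(value).__name__}
    return manifest, blobs


def load_defs(
    manifest: dict[str, dict[str, str]], blobs: dict[str, bytes]
) -> dict[str, Any]:
    """Rebuild defs, substituting a stub wherever the manifest has a marker."""
    out: dict[str, Any] = {}
    for name, item in manifest.items():
        if "unserializable_type" in item:
            out[name] = StubPlaceholder(item["unserializable_type"])
        else:
            out[name] = pickle.loads(blobs[name])
    return out
```

A downstream cell that references such a stub is what triggers the rescheduling described in the summary.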