Add per-session LRU visualization caching by heaven-howard · Pull Request #76 · FNLCR-DMAP/SPAC_Shiny

heaven-howard · 2026-03-31T19:25:03Z

Feature

Previously, users had to recompute plots each time they switched between features. Plots are now saved in a per-session cache, so when a user returns to a visualization that was already generated, it is retrieved from the cache immediately instead of being re-rendered.

How the cache works

A new VisualizationCache class (utils/cache_manager.py) is instantiated once per Shiny session and stored in shared['cache']. It is an LRU (Least Recently Used) cache backed by an OrderedDict, with a default capacity of 50 entries.

Each visualization is keyed by three things:

(dataset_version, viz_name, normalized_params)

dataset_version is an integer counter that increments every time the underlying dataset changes (a new file is loaded or a subset is applied). Because the counter is part of the key, all previous cache entries become unreachable automatically, and cache.invalidate() is called immediately to free memory.
viz_name is a string identifying the plot (e.g. 'boxplot_interactive', 'ripley_l').
normalized_params is a hashable, canonical representation of all the UI inputs that affect the plot output. The normalize_params() helper recursively converts dicts, lists, and sets into tuples so the key is stable. Multi-select inputs whose display order matters (features in boxplot, target cell labels in nearest neighbor) are stored as plain tuples, which preserves the user-selected ordering and keeps the app's previous behavior. Inputs where order is irrelevant (e.g. region label filters) use sorted tuples to avoid redundant cache entries.
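The normalization step can be sketched roughly as follows. This is an illustration only, not the code in utils/cache_manager.py: the function body, the sorting rules, and the example parameter names (`features`, `regions`) are assumptions made for the sketch.

```python
from typing import Any, Hashable


def normalize_params(value: Any) -> Hashable:
    """Recursively convert containers into hashable tuples (illustrative sketch)."""
    if isinstance(value, dict):
        # Sort by key so that dict insertion order does not change the cache key.
        items = sorted(value.items(), key=lambda kv: str(kv[0]))
        return tuple((k, normalize_params(v)) for k, v in items)
    if isinstance(value, (list, tuple)):
        # Sequence order is preserved; callers sort first when order is irrelevant.
        return tuple(normalize_params(v) for v in value)
    if isinstance(value, (set, frozenset)):
        # Sets are unordered, so sort them to get a stable representation.
        return tuple(sorted((normalize_params(v) for v in value), key=repr))
    return value


# Same inputs in a different dict order, with an order-irrelevant filter sorted by the caller.
a = normalize_params({"features": ["CD3", "CD8"], "regions": sorted(["B", "A"])})
b = normalize_params({"regions": sorted(["A", "B"]), "features": ["CD3", "CD8"]})
# Same features in a different user-selected order: intentionally a different key.
c = normalize_params({"features": ["CD8", "CD3"], "regions": ["A", "B"]})

same_key = a == b
order_sensitive = a != c
```

Under these assumptions, `a` and `b` map to the same cache entry, while `c` gets its own entry because feature order affects the plot.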

In every visualization server function, the pattern is:

params = { ...all inputs that affect the plot... }

def compute():
    # expensive work here
    return fig, df

fig, df = cache.get_or_compute('viz_name', version, params, compute)

compute is only called on a cache miss. On a hit, the previously computed (fig, df) pair is returned immediately and promoted to the most-recently-used position. When the cache exceeds its capacity (50 entries by default), the least recently used entry is evicted.
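The hit, miss, and eviction behavior described above can be illustrated with a minimal OrderedDict-based sketch. It is not the PR's VisualizationCache: the attribute names (`max_entries`, `_store`) are hypothetical, and the sketch assumes the params argument is already hashable, whereas the real class may normalize it internally.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class VisualizationCache:
    """Minimal LRU cache sketch (assumed structure, not the PR's implementation)."""

    def __init__(self, max_entries: int = 50) -> None:
        self.max_entries = max_entries
        self._store: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get_or_compute(self, viz_name: str, version: int,
                       params: Hashable, compute: Callable[[], Any]) -> Any:
        key = (version, viz_name, params)
        if key in self._store:
            self._store.move_to_end(key)  # hit: promote to most recently used
            return self._store[key]
        result = compute()  # miss: do the expensive work once
        self._store[key] = result
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict the least recently used entry
        return result

    def invalidate(self) -> None:
        self._store.clear()


calls = []


def make_compute(label: str) -> Callable[[], Any]:
    def compute():
        calls.append(label)  # record every real computation
        return f"fig-{label}", f"df-{label}"
    return compute


cache = VisualizationCache(max_entries=2)
cache.get_or_compute("boxplot", 1, ("CD3",), make_compute("a"))  # miss
cache.get_or_compute("boxplot", 1, ("CD8",), make_compute("b"))  # miss
cache.get_or_compute("boxplot", 1, ("CD3",), make_compute("a"))  # hit, promotes "a"
cache.get_or_compute("boxplot", 1, ("CD4",), make_compute("c"))  # miss, evicts "b"
cache.get_or_compute("boxplot", 1, ("CD8",), make_compute("b"))  # miss again
```

With a capacity of 2, the promoted entry survives the fourth call and the untouched one is evicted, so `calls` ends up as `["a", "b", "c", "b"]`.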

Cache invalidation

The cache is invalidated automatically whenever the underlying dataset changes. This covers two cases:

Loading a new file — the user uploads a .h5ad or .pickle file on the data input tab
Applying a subset — the user filters the dataset by annotation and label on the data input tab

Both actions update shared['adata_main'], which triggers the update_parts reactive effect in data_input_server.py. That effect increments dataset_version and calls cache.invalidate() to clear all entries immediately.

Because dataset_version is part of every cache key, any entries computed against the old dataset are unreachable even before invalidate() runs — the version bump alone is sufficient to treat them as stale. The explicit invalidate() call is an optimization that reclaims memory right away rather than waiting for LRU eviction to cycle them out.
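The distinction between the version bump and the explicit clear can be shown with a small sketch. A plain OrderedDict stands in for the cache here, and `cache_key` is a hypothetical helper; neither reflects the actual code in data_input_server.py.

```python
from collections import OrderedDict

# Hypothetical stand-in for the per-session shared state.
shared = {"dataset_version": 0, "cache": OrderedDict()}


def cache_key(state: dict, viz_name: str, params: tuple) -> tuple:
    # The current dataset version is always part of the key.
    return (state["dataset_version"], viz_name, params)


# Cache a result computed against the current dataset.
shared["cache"][cache_key(shared, "ripley_l", ("region_a",))] = "old figure"

# The dataset changes: bump the version only.
shared["dataset_version"] += 1
stale_reachable = cache_key(shared, "ripley_l", ("region_a",)) in shared["cache"]
entries_still_held = len(shared["cache"])  # the stale entry still occupies memory

# The explicit clear reclaims that memory immediately.
shared["cache"].clear()
entries_after_clear = len(shared["cache"])
```

After the bump alone, the same request no longer finds the old entry (`stale_reachable` is False) even though the entry is still stored; clearing is what actually frees it.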

Matplotlib Figure Serialization

Static matplotlib figures are serialized to PNG bytes before caching via fig_to_png_bytes() (utils/plot_utils.py), and reconstructed via png_bytes_to_figure() when served. This prevents layout mutations (margins, label sizes) from accumulating across repeated Shiny renders of the same figure object.
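One way such a round trip could look is sketched below. The function names match the helpers named above, but the bodies, the `dpi` parameter, and the choice to rebuild the figure as a single borderless image are assumptions, not the code in utils/plot_utils.py.

```python
import io

import matplotlib.image as mpimg
from matplotlib.figure import Figure


def fig_to_png_bytes(fig: Figure, dpi: int = 100) -> bytes:
    """Render a figure to PNG bytes so the cached value is immutable."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=dpi)
    return buf.getvalue()


def png_bytes_to_figure(png: bytes, dpi: int = 100) -> Figure:
    """Rebuild a fresh figure that displays the cached PNG at its original size."""
    img = mpimg.imread(io.BytesIO(png), format="png")
    height, width = img.shape[:2]
    fig = Figure(figsize=(width / dpi, height / dpi), dpi=dpi)
    ax = fig.add_axes([0, 0, 1, 1])  # axes fill the whole canvas
    ax.imshow(img)
    ax.axis("off")
    return fig


# Build a small figure without pyplot, serialize it, and restore it.
source = Figure(figsize=(4, 3), dpi=100)
source.subplots().plot([0, 1, 2], [0, 1, 4])
png = fig_to_png_bytes(source)
restored = png_bytes_to_figure(png)
```

Because each render starts from the same bytes and a new Figure object, layout adjustments applied during one render cannot carry over to the next.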

Files changed

| File | Change |
| --- | --- |
| `utils/cache_manager.py` | New: `VisualizationCache` and `normalize_params` |
| `utils/plot_utils.py` | New helpers: `fig_to_png_bytes`, `png_bytes_to_figure` |
| `server/data_input_server.py` | Increments `dataset_version` and calls `cache.invalidate()` on data load |
| `app.py` | Initializes `shared['cache']` and `shared['dataset_version']` per session |
| All visualization server modules | Refactored to use `cache.get_or_compute` pattern |

Testing

When a plot is re-queried, the Docker logs show no plot generation, indicating that the cached result was served. The logs were otherwise stable, with no noticeable bugs or errors while generating plots. A large test .pickle file (5M cells) was generated for testing. Boxplots that take roughly 10 seconds to compute and generate are re-rendered instantly from the cache. (Demo)

heaven-howard added 5 commits March 26, 2026 15:04

Add per-session LRU visualization caching system

d020218

fix reactive dependency bug

4931e7b

Consolidate plot params and fix figure growth on re-render

5d65369

revert sorted tuple format for boxplot features

32b4ce6

revert sorted tuple format for nearest neighbor features

dcdc4e3

heaven-howard marked this pull request as ready for review March 31, 2026 19:52

Implement Cache Logs

008cddc

heaven-howard wants to merge 6 commits into FNLCR-DMAP:dev from Saran-Nag:plot-caching
