Add per-session LRU visualization caching #76
Open
heaven-howard wants to merge 6 commits into FNLCR-DMAP:dev from
Feature
Previously, users had to recompute plots after switching between features. Plots are now saved in a cache. When a user returns to a specific visualization, it is retrieved from the cache instantly, with no re-rendering delay.
How the cache works
A new `VisualizationCache` class (`utils/cache_manager.py`) is instantiated once per Shiny session and stored in `shared['cache']`. It is an LRU (Least Recently Used) cache backed by an `OrderedDict`, with a default capacity of 50 entries.

Each visualization is keyed by three things:
- `dataset_version` is an integer counter that increments every time the user loads a new dataset. All previous cache entries become unreachable automatically on data change, and `cache.invalidate()` is called immediately to free memory.
- `viz_name` is a string identifying the plot (e.g. `'boxplot_interactive'`, `'ripley_l'`).
- `normalized_params` is a hashable, canonical representation of all the UI inputs that affect the plot output. The `normalize_params()` helper recursively converts dicts, lists, and sets into tuples so the key is stable. Multi-select inputs whose display order matters (features in boxplot, target cell labels in nearest neighbor) are stored as plain tuples, which preserves the user-selected ordering and keeps the previous app behavior. Inputs where order is irrelevant (e.g. region label filters) use sorted tuples to avoid redundant cache entries.

In every visualization server function, the pattern is:
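A minimal sketch of how the three-part key and the `get_or_compute` call described above might fit together. The method signature, the `max_size` parameter name, the `_entries` attribute, and the internals of `normalize_params` are assumptions made for illustration, not the code in `utils/cache_manager.py`; the feature and region values are placeholders.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


def normalize_params(value: Any) -> Hashable:
    """Recursively turn dicts, lists, and sets into tuples (hypothetical version)."""
    if isinstance(value, dict):
        # Sort by key so dict insertion order does not change the cache key.
        items = ((str(k), normalize_params(v)) for k, v in value.items())
        return tuple(sorted(items, key=lambda kv: kv[0]))
    if isinstance(value, (set, frozenset)):
        # Sets are unordered, so sort by repr to get a stable tuple.
        return tuple(sorted((normalize_params(v) for v in value), key=repr))
    if isinstance(value, (list, tuple)):
        # Sequences keep their order; callers pre-sort when order is irrelevant.
        return tuple(normalize_params(v) for v in value)
    return value


class VisualizationCache:
    """Per-session LRU cache sketch backed by an OrderedDict."""

    def __init__(self, max_size: int = 50) -> None:
        self.max_size = max_size
        self._entries: OrderedDict = OrderedDict()

    def get_or_compute(self, dataset_version: int, viz_name: str,
                       params: Any, compute: Callable[[], Any]) -> Any:
        key = (dataset_version, viz_name, normalize_params(params))
        if key in self._entries:
            # Cache hit: promote to most-recently-used and skip compute().
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: compute, store, then evict the least recently used entry.
        result = compute()
        self._entries[key] = result
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)
        return result

    def invalidate(self) -> None:
        """Drop all entries to free memory immediately."""
        self._entries.clear()


calls = []


def compute():
    calls.append(1)
    return ("fig", "df")  # placeholders standing in for real plot objects


cache = VisualizationCache()
dataset_version = 1
params = {
    "features": ("CD3", "CD8"),              # order matters: plain tuple
    "regions": tuple(sorted(["B", "A"])),    # order irrelevant: sorted tuple
}
cache.get_or_compute(dataset_version, "boxplot_interactive", params, compute)  # miss
cache.get_or_compute(dataset_version, "boxplot_interactive", params, compute)  # hit

dataset_version += 1   # new dataset: keys built with the old version no longer match
cache.invalidate()     # reclaim memory right away
cache.get_or_compute(dataset_version, "boxplot_interactive", params, compute)  # miss
```

In this sketch the version bump alone would already make the old entry unreachable, since a lookup with the new `dataset_version` builds a different key; `invalidate()` only frees the memory sooner.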
`compute` is only called on a cache miss. On a hit, the previously computed `(fig, df)` pair is returned immediately and promoted to the most-recently-used position. When the cache exceeds 50 entries, the least recently used entry is evicted.

Cache invalidation
The cache is invalidated automatically whenever the underlying dataset changes. This covers two cases:
- Loading a new `.h5ad` or `.pickle` file on the data input tab

Both actions update `shared['adata_main']`, which triggers the `update_parts` reactive effect in `data_input_server.py`. That effect increments `dataset_version` and calls `cache.invalidate()` to clear all entries immediately.

Because `dataset_version` is part of every cache key, any entries computed against the old dataset are unreachable even before `invalidate()` runs; the version bump alone is sufficient to treat them as stale. The explicit `invalidate()` call is an optimization that reclaims memory right away rather than waiting for LRU eviction to cycle them out.

Matplotlib Figure Serialization
Static matplotlib figures are serialized to PNG bytes before caching via `fig_to_png_bytes()` (`utils/plot_utils.py`), and reconstructed via `png_bytes_to_figure()` when served. This prevents layout mutations (margins, label sizes) from accumulating across repeated Shiny renders of the same figure object.

Files changed
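| File | Change |
| --- | --- |
| `utils/cache_manager.py` | `VisualizationCache` and `normalize_params` |
| `utils/plot_utils.py` | `fig_to_png_bytes`, `png_bytes_to_figure` |
| `server/data_input_server.py` | Increments `dataset_version` and calls `cache.invalidate()` on data load |
| `app.py` | `shared['cache']` and `shared['dataset_version']` per session |
| Visualization server functions | `cache.get_or_compute` pattern |

Testing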
When a plot is re-queried, the Docker logs show no new plot generation, which indicates a cache hit. The logs were otherwise stable, with no noticeable bugs or errors while generating plots. A large test `.pickle` file (5M cells) was generated for testing. Boxplots that take roughly 10 seconds to compute and generate on first request are re-rendered instantly afterwards. (Demo)