perf: bound latest-signal queries to a lookback window #282
Merged
getLatestQuery and getLastSeenQuery filtered only on subject, forcing ClickHouse to scan every historical partition for a vehicle (10-20M rows per call on prod-scale data). The signal table is ORDER BY (subject, timestamp, name) PARTITION BY toYYYYMM(timestamp), so a lower bound on timestamp enables partition pruning and primary-key range reads. Adds LATEST_SIGNALS_LOOKBACK_DAYS (default 45 in the sample config, 0 disables the filter). Drops the unused DEVICE_LAST_SEEN_BIN_HOURS field, which was previously threaded into the Service but never read.
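For illustration only, here is a minimal Python sketch of how a lookback setting could become a timestamp lower bound, with 0 leaving the query unbounded. The function names, the query shape, and the parameter style are hypothetical and are not taken from this PR; only the table and column names (signal, subject, timestamp, name) come from the description above. Treating negative values as "disabled" is also an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def lookback_lower_bound(lookback_days: int, now: datetime) -> Optional[datetime]:
    """Return the earliest timestamp to read, or None when the filter is disabled."""
    # 0 disables the filter; treating negative values the same way is an assumption.
    if lookback_days <= 0:
        return None
    return now - timedelta(days=lookback_days)


def build_latest_query(subject: str, lookback_days: int, now: datetime):
    """Build an illustrative latest-timestamp-per-signal query and its parameters."""
    conditions = ["subject = %(subject)s"]
    params = {"subject": subject}

    lower = lookback_lower_bound(lookback_days, now)
    if lower is not None:
        # The lower bound is what lets the engine skip old monthly partitions.
        conditions.append("timestamp >= %(min_ts)s")
        params["min_ts"] = lower

    # Hypothetical query shape, not the real getLatestQuery.
    sql = (
        "SELECT name, max(timestamp) AS ts FROM signal "
        "WHERE " + " AND ".join(conditions) + " GROUP BY name"
    )
    return sql, params


if __name__ == "__main__":
    sql, params = build_latest_query("vehicle-1", 45, datetime.now(timezone.utc))
    print(sql)
    print(params)
```

Keeping the bound as a bind parameter rather than formatting it into the SQL string avoids quoting issues; how the real Service passes it is not shown in this PR description.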
Summary
- `getLatestQuery` / `getLastSeenQuery` filtered only on `subject`. With `signal` ordered by `(subject, timestamp, name)` and partitioned by month, missing timestamp bounds meant every historical partition was scanned per call (10–20M avg read rows in prod).
- Adds `LATEST_SIGNALS_LOOKBACK_DAYS` (default `45` in `settings.sample.yaml`, `0` disables) and plumbs the lower bound into both queries.
- Drops the unused `DEVICE_LAST_SEEN_BIN_HOURS` / `lastSeenBucketHrs` field that was threaded into the Service but never read.

Why
The top 10 slowest prod queries were all latest-signal fetches, each reading 10–21M rows. With the timestamp bound, partition pruning kicks in and most active vehicles drop to at most three monthly partitions; expect a multi-x reduction in `Avg. Read rows` and `Avg. Mem Usage`.

Test plan
- `make lint` clean
- `make test` clean (excluding docker-gated CH integration tests)
- Compare `Avg. Read rows` on the same queries in the query stats page vs. the screenshot we started from.
- Tune `LATEST_SIGNALS_LOOKBACK_DAYS` down if a tighter window is safe.

🤖 Generated with Claude Code
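As a rough check on the partition count mentioned under Why, the sketch below lists the monthly partition ids (in the style of toYYYYMM) that a lookback window overlaps. It is a back-of-the-envelope helper, not code from this PR: the function name is hypothetical, and it assumes no rows are timestamped after `now`, since the query only adds a lower bound.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def monthly_partitions(now: datetime, lookback_days: int) -> List[int]:
    """List YYYYMM partition ids overlapping [now - lookback_days, now]."""
    start = now - timedelta(days=lookback_days)
    year, month = start.year, start.month
    partitions = []
    # Walk month by month from the window start up to the current month.
    while (year, month) <= (now.year, now.month):
        partitions.append(year * 100 + month)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return partitions


if __name__ == "__main__":
    # A 45-day window starting early in a month reaches back across three months.
    print(monthly_partitions(datetime(2024, 3, 1, tzinfo=timezone.utc), 45))
    # Late in a month, the same window only touches two.
    print(monthly_partitions(datetime(2024, 3, 31, tzinfo=timezone.utc), 45))
```

With the 45-day default the window touches two or three monthly partitions depending on the day of the month, which is one input to consider when tuning `LATEST_SIGNALS_LOOKBACK_DAYS`.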