Include the page's activeSdk in Algolia search results #3409
Additively register branch/record_batch/sdk for faceting (required by the active-SDK optionalFilters boost and the stale-record cleanup; self-sufficient on a fresh index). Enforce a ranking order with attribute/exact above proximity so title/heading matches outrank body-content matches — the fix for buried guides/quickstarts. The indexer is now the source of truth for these settings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
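As a rough illustration of why `branch` and `sdk` need to be registered for faceting, here is a minimal sketch of search parameters a client might send. The helper name `buildDocsSearchParams` and the exact filter layout are assumptions for illustration; the real client code lives in the paired PR and is not shown in this thread. `facetFilters` and `optionalFilters` are standard Algolia search parameters, and both require the attribute to be declared in `attributesForFaceting` (`filterOnly(...)` is enough).

```typescript
// Hypothetical helper, not the project's actual client code.
type DocsSearchParams = {
  query: string
  facetFilters: string[]
  optionalFilters: string[]
}

function buildDocsSearchParams(query: string, branch: string, activeSdk?: string): DocsSearchParams {
  return {
    query,
    // Hard filter: only records tagged with this branch are eligible at all.
    facetFilters: [`branch:${branch}`],
    // Soft filter: records for the active SDK are boosted, but other records still match.
    optionalFilters: activeSdk ? [`sdk:${activeSdk}`] : [],
  }
}

const exampleParams = buildDocsSearchParams('magic link', 'main', 'nextjs')
```

The split between a hard filter and an optional one is the point of the sketch: the branch scopes the result set, while the active SDK only nudges ranking.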
…+ curated) Acronym⇄expansion pairs are auto-derived from the _tooltips glossary (self-syncing as tooltips are added); a curated list covers product-rename/phrasing synonyms (magic link→email link, login→sign in, i18n→localization, etc.). Enforced every run via saveSynonyms(replaceExistingSynonyms), same model as ranking. Fixes broken queries: 'magic link'/'i18n'/'DKIM' went from garbage/zero results to relevant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Note that the indexer (scripts/update-algolia-records.ts) is the source of truth for the search index's faceting, ranking, and synonyms, and that tuning them in the Algolia dashboard gets reverted on the next index run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
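The synonym derivation described above could look roughly like the following sketch. The `GlossaryEntry` shape, the `objectID` naming scheme, and the choice of two-way `synonym` entries for the curated list are all assumptions; the thread does not show the actual `_tooltips` format or whether the curated pairs are one-way. Only the synonym object fields (`objectID`, `type`, `synonyms`) follow Algolia's documented shape.

```typescript
// Assumed glossary shape; the real _tooltips format is not shown in this thread.
type GlossaryEntry = { acronym: string; expansion: string }

// Regular (two-way) Algolia synonym object.
type SynonymHit = { objectID: string; type: 'synonym'; synonyms: string[] }

// Derive one acronym <-> expansion pair per glossary entry.
function deriveAcronymSynonyms(glossary: GlossaryEntry[]): SynonymHit[] {
  return glossary.map(({ acronym, expansion }) => ({
    objectID: `acronym-${acronym.toLowerCase()}`, // hypothetical ID scheme
    type: 'synonym' as const,
    synonyms: [acronym, expansion],
  }))
}

// Hand-maintained pairs for renames and phrasing (examples taken from the commit message).
const curatedSynonyms: SynonymHit[] = [
  { objectID: 'curated-magic-link', type: 'synonym', synonyms: ['magic link', 'email link'] },
  { objectID: 'curated-login', type: 'synonym', synonyms: ['login', 'sign in'] },
  { objectID: 'curated-i18n', type: 'synonym', synonyms: ['i18n', 'localization'] },
]

function buildSynonyms(glossary: GlossaryEntry[]): SynonymHit[] {
  return [...deriveAcronymSynonyms(glossary), ...curatedSynonyms]
}

const exampleSynonyms = buildSynonyms([{ acronym: 'DKIM', expansion: 'DomainKeys Identified Mail' }])
```

Because the derived half is recomputed from the glossary on every run, adding a tooltip would add a synonym without touching the curated list.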
Beyond the per-variant
What's in this PR specifically:
Paired with the client-side active-SDK boost in clerk/clerk#2661. The full data-backed writeup: why boost
await algolia.setSettings({
  indexName: ALGOLIA_INDEX_NAME,
  indexSettings: {
    attributesForFaceting: [
      ...attributesForFaceting,
      ...missingFacets.map((attribute) => `filterOnly(${attribute})`),
    ],
    ranking,
  },
})
Suggested change:
await algolia.setSettings({
  indexName: ALGOLIA_INDEX_NAME,
  indexSettings: {
    attributesForFaceting: [
      ...attributesForFaceting,
      ...missingFacets.map((attribute) => `filterOnly(${attribute})`),
    ],
    ranking,
  },
  forwardToReplicas: true,
})
@manovotny should the setSettings call also forward the change to replicas? Granted, I don't think we are using replicas, but I think we should be consistent.
Good Q — went back and forth on this, landed on: no, setSettings shouldn't forward.
We're codifying these settings (declare + overwrite every run), which is itself the mechanism for keeping indexes consistent. forwardToReplicas is the other model — "tune the primary, propagate it" — so under codification it's redundant: the script configures whatever index it targets directly.
It's also the riskier default here specifically because this setSettings bundles ranking/customRanking. A standard replica usually exists to hold a different sort, and forwardToReplicas: true would overwrite that on every index run. So if we ever add replicas, the right move is to declare them in the script (which can express per-replica settings), not blanket-forward.
The synonyms call does forward, and that's deliberate — synonyms should always be identical across replicas, so it's a safe no-op today and the correct behavior if a replica ever shows up. Added a comment in efe7c94 spelling out the asymmetry so it doesn't read like an oversight.
Two notes: there are no replicas on dev_docs/prod_docs today, so this is all forward-looking; and heads up the suggestion is anchored to the pre-refactor block (the read/merge/missingFacets code is gone as of d7d3955 / 9d61ccc), so it can't be committed directly.
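The asymmetry described in this reply can be sketched as two request builders. The request types below are local stand-ins modeled on the object-style parameters used elsewhere in this thread (`indexName`, `indexSettings`) plus `synonymHit`, `replaceExistingSynonyms`, and `forwardToReplicas`; treat the exact shapes as assumptions rather than the script's real code.

```typescript
// Local stand-in types, assumed for illustration.
type SetSettingsRequest = {
  indexName: string
  indexSettings: Record<string, unknown>
  forwardToReplicas?: boolean
}

type SaveSynonymsRequest = {
  indexName: string
  synonymHit: Array<Record<string, unknown>>
  replaceExistingSynonyms: boolean
  forwardToReplicas: boolean
}

// Settings: forwardToReplicas is deliberately left out, because the payload
// includes ranking/customRanking, which a replica may override for its own sort.
function buildSettingsRequest(indexName: string, indexSettings: Record<string, unknown>): SetSettingsRequest {
  return { indexName, indexSettings }
}

// Synonyms: always forwarded, since they should be identical on every replica.
function buildSynonymsRequest(indexName: string, synonymHit: Array<Record<string, unknown>>): SaveSynonymsRequest {
  return { indexName, synonymHit, replaceExistingSynonyms: true, forwardToReplicas: true }
}

const exampleSettingsRequest = buildSettingsRequest('dev_docs', { ranking: ['typo', 'geo', 'words'] })
const exampleSynonymsRequest = buildSynonymsRequest('dev_docs', [])
```

With no replicas on the index, both calls behave the same as they would without the flag; the difference only matters if a replica is added later.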
Overwrite attributesForFaceting and ranking on every index run instead of reading current settings, diffing, and additively merging facets. The indexer is the source of truth for these settings, so dashboard edits should revert on the next run -- but the additive facet merge was preserving the very drift it was meant to overwrite (ranking already overwrote; faceting now matches). setSettings is a top-level partial merge, so we declare only the keys we own and leave customRanking/searchableAttributes untouched. availableSDKs is intentionally not faceted: the client only retrieves it to render per-result SDK icons (Search.tsx SDKsIcon), never filters or counts on it, and retrieval is independent of faceting. branch/record_batch are filterOnly -- the stale-record cleanup uses facetFilters, never facet counts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
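To make the drift problem concrete, here is a small sketch comparing the two strategies on plain arrays. The function names and the `searchable(availableSDKs)` dashboard edit are hypothetical; the additive version mirrors the shape of the pre-refactor block quoted earlier in the thread.

```typescript
// Facets the indexer owns, as declared in the commit message.
const ownedFacets = ['branch', 'record_batch', 'sdk'].map((a) => `filterOnly(${a})`)

// Old model: keep whatever is on the index and append only what is missing.
function additiveMerge(current: string[], owned: string[]): string[] {
  const missing = owned.filter((facet) => !current.includes(facet))
  return [...current, ...missing]
}

// New model: ignore the current state and write exactly the declared list.
function declareAndOverwrite(_current: string[], owned: string[]): string[] {
  return [...owned]
}

// Hypothetical index state after someone edited facets in the dashboard.
const driftedFacets = ['filterOnly(branch)', 'searchable(availableSDKs)']

const mergedFacets = additiveMerge(driftedFacets, ownedFacets)
const declaredFacets = declareAndOverwrite(driftedFacets, ownedFacets)
```

In this sketch `mergedFacets` still contains the dashboard-only entry, while `declaredFacets` equals the declared list, which is the behavior the commit switches to.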
Follow-up: the indexer now codifies the docs index's relevance settings: it declares them and overwrites them on every run. d7d3955, 9d61ccc

Acting on review feedback: instead of read current settings → diff → additively merge each run, the indexer now declares the settings it owns and overwrites them every run, so any dashboard edit reverts on the next reindex. The additive facet merge actually worked against the goal: it preserved whatever was set in the dashboard rather than overwriting it, so indexes could (and did) drift apart.

const searchableAttributes = ['unordered(hierarchy.lvl0)', /* …lvl1–6 */ 'content', 'unordered(keywords)']
const attributesForFaceting = ['branch', 'record_batch', 'sdk'].map((a) => `filterOnly(${a})`)
const ranking = ['typo', 'geo', 'words', 'filters', 'attribute', 'exact', 'proximity', 'custom']
const customRanking = ['desc(weight.pageRank)', 'desc(weight.level)', 'asc(weight.position)']
await algolia.setSettings({ indexName: ALGOLIA_INDEX_NAME, indexSettings: { searchableAttributes, attributesForFaceting, ranking, customRanking } })

Why these four and not the whole settings object (i.e. we don't codify defaults, correct): The two added in

Why Effect on the next index run: adds

A "warn if the dashboard was edited" guard was considered and deferred: since the indexer rewrites these every run, drift self-heals on the next index, so an explicit alert is more than this needs right now.
Extend the declared settings to the remaining relevance levers so the indexer fully owns them and indexes can't drift apart:
- searchableAttributes: the corpus + attribute priority the `attribute` ranking criterion (promoted above proximity) rides on. Identical across dev/prod today, so this just locks in the value the ranking work was tested against.
- customRanking: the final tiebreaker, normalized to the weights the indexer actually writes (pageRank/level/position). Drops the dead desc(weight.popularity) entry lingering on some indexes -- it's never written to records, so it ranks nothing.
Still a scoped declaration, not a full settings snapshot: setSettings is a partial merge, so Algolia's server-managed defaults stay untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
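The "partial merge" behavior this commit relies on can be modeled locally as a shallow, top-level merge. This is only a mental model written for illustration, not Algolia's implementation; `applySetSettings` and the stored values (including `hitsPerPage: 20`) are hypothetical.

```typescript
type IndexSettings = Record<string, unknown>

// Local model of a top-level partial merge: keys present in the patch replace
// the stored value wholesale; keys absent from the patch are left untouched.
function applySetSettings(stored: IndexSettings, patch: IndexSettings): IndexSettings {
  return { ...stored, ...patch }
}

// Hypothetical index state with the dead popularity entry still present.
const storedSettings: IndexSettings = {
  customRanking: ['desc(weight.pageRank)', 'desc(weight.popularity)'],
  hitsPerPage: 20, // stands in for any setting the indexer does not declare
}

// The declared value from the commit message.
const declaredSettings: IndexSettings = {
  customRanking: ['desc(weight.pageRank)', 'desc(weight.level)', 'asc(weight.position)'],
}

const resultingSettings = applySetSettings(storedSettings, declaredSettings)
const resultingCustomRanking = resultingSettings.customRanking as string[]
```

Under this model the array is replaced rather than merged element by element, which is why declaring `customRanking` is enough to drop the stale entry while undeclared keys survive.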
…nyms do The asymmetry read like an oversight in review. It's deliberate: synonyms must be identical across replicas, but settings bundle ranking/customRanking, which a standard replica may override for an alternate sort -- forwarding would clobber it. Comment-only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adding the settings update to the script creates a bit of a split in how the script is understood. Before the settings update was added, the script scoped itself entirely to the runner's current branch, including the branch name in the records being added, which let the search client scope itself to that branch and know that it's only getting that set of results. But updating the index settings applies across all records in that index, not just the records scoped to the branch being updated. IMO this introduces a split in how you'd expect the script to run: either it doesn't care about the git branch and there is just a single 'dev_docs' index, or changes only affect the search items being updated, in which case you'd expect settings to change only for that branch too. This ultimately stems from a limitation of Algolia and a desire on my part not to spin up a whole new index for every search development branch. So if this is a trade-off we are happy with then I'm happy to move ahead, but I haven't seen this addressed, so I just want to get your opinion.
…s split
Nick flagged that the script is branch-scoped for records but index-wide for settings, which reads as a split. It's inherent (Algolia has no per-branch settings) and accepted: shared index + branch-scoped records is the cost-conscious middle ground; isolated settings experiments use a personal index.
- Script: scope note in the settings block explaining the split + the escape hatch.
- AGENTS.md: new "Search index (Algolia)" section capturing the indexer model: indexes, branch-scoping, codified settings (4 + synonyms, declare-and-overwrite), filterOnly faceting + the availableSDKs/retrieval gotcha, forwardToReplicas asymmetry, ranking↔searchableAttributes coupling, and local testing.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
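The record side of that split might be sketched as follows. Everything here is an assumption made for illustration: the thread does not show how the indexer tags records, what `record_batch` contains (a per-run identifier is assumed), or the exact cleanup filter. The only Algolia-specific detail used is the `attribute:-value` negation syntax for `facetFilters`.

```typescript
// Assumed minimal record shape.
type DocRecord = { objectID: string; [key: string]: unknown }

// Tag every record with the branch it was built from and the run that wrote it.
function tagRecords(records: DocRecord[], branch: string, recordBatch: string): DocRecord[] {
  return records.map((record) => ({ ...record, branch, record_batch: recordBatch }))
}

// Filters selecting stale records: same branch AND not written by the current run.
// Both attributes only need filterOnly(...) faceting, since no facet counts are used.
function staleRecordFilters(branch: string, recordBatch: string): string[] {
  return [`branch:${branch}`, `record_batch:-${recordBatch}`]
}

const taggedRecords = tagRecords([{ objectID: 'guides/quickstart' }], 'my-branch', 'run-42')
const cleanupFilters = staleRecordFilters('my-branch', 'run-42')
```

Records carry their own scope, so they can share an index safely; settings have no such field, which is why they apply index-wide.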
Totally fair catch: it's the sharpest point on the thread, and you're right that it was undocumented.

The way I've come to think about it: records and settings are genuinely different kinds of thing, so the two scopes are each correct rather than inconsistent. Records are data, rightly per-branch (the

Codifying also shrinks the downside you're pointing at. Before, a run could push arbitrary hand-tuned settings index-wide; now every run converges the index to the same declared values, and any change to them is a reviewed diff. So a content branch's indexer run just re-asserts canonical settings; nothing leaks. The seam only bites a search-relevance branch (rare, like this one) that's actively changing settings, and the escape hatch already exists and is exactly what I used here: point

Full per-branch isolation is the "pure" answer: if usage/cost weren't a factor I'd spin up an index per branch and call it done. But that's a lot of indexes (and money) for a rare need, so shared

So: accepted trade-off, agreed. And to your "haven't seen this addressed": it is now, in f85cd12, with a scope note in the settings block, plus a new "Search index (Algolia)" section in