include the pages activeSdk in algolia search results by NWylynko · Pull Request #3409 · clerk/clerk-docs

NWylynko · 2026-05-29T21:56:14Z

🔎 Previews:

https://clerk-git-nick-boost-active-sdk-search-results.clerkstage.dev/docs/pr/nick-include-guides-sdk-in-search

What does this solve? What changed?

Adds the guide's active SDK to the Algolia search results, so search can boost results that match the user's active SDK.

Deadline

Other resources

vercel · 2026-05-29T21:56:19Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
clerk-docs	Ready	Preview	Jun 2, 2026 7:55pm

Additively register branch/record_batch/sdk for faceting (required by the active-SDK optionalFilters boost and the stale-record cleanup; self-sufficient on a fresh index). Enforce a ranking order with attribute/exact above proximity so title/heading matches outrank body-content matches — the fix for buried guides/quickstarts. The indexer is now the source of truth for these settings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
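To make the reorder concrete, here is a small sketch that compares Algolia's documented default ranking order with the order this PR declares. The `outranks` helper is hypothetical and only exists to illustrate which criterion is evaluated first; it is not part of the indexer.

```typescript
// Algolia's documented default ranking order, shown for comparison.
const defaultRanking = ['typo', 'geo', 'words', 'filters', 'proximity', 'attribute', 'exact', 'custom']

// The order declared in this PR: attribute/exact promoted above proximity.
const declaredRanking = ['typo', 'geo', 'words', 'filters', 'attribute', 'exact', 'proximity', 'custom']

// Hypothetical helper: true when criterion `a` is evaluated before criterion `b`.
function outranks(ranking: string[], a: string, b: string): boolean {
  const ia = ranking.indexOf(a)
  const ib = ranking.indexOf(b)
  return ia !== -1 && ib !== -1 && ia < ib
}

console.log(outranks(defaultRanking, 'attribute', 'proximity')) // false
console.log(outranks(declaredRanking, 'attribute', 'proximity')) // true
```

Because criteria are applied in order as tie-breakers, moving `attribute` ahead of `proximity` is what lets a title or heading match win before word proximity in body content is considered.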

…+ curated) Acronym⇄expansion pairs are auto-derived from the _tooltips glossary (self-syncing as tooltips are added); a curated list covers product-rename/phrasing synonyms (magic link→email link, login→sign in, i18n→localization, etc.). Enforced every run via saveSynonyms(replaceExistingSynonyms), same model as ranking. Fixes broken queries: 'magic link'/'i18n'/'DKIM' went from garbage/zero results to relevant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
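A rough sketch of how the hybrid synonym set could be assembled. The `GlossaryEntry` shape, the `slug` helper, and the objectID scheme are assumptions for illustration; the real `_tooltips` format is not shown in this thread. All pairs are modeled as two-way synonyms here for simplicity, although the actual script may use one-way synonyms for the renames.

```typescript
// Shape of a two-way Algolia synonym record.
type SynonymRecord = { objectID: string; type: 'synonym'; synonyms: string[] }

// Hypothetical glossary entry; the real _tooltips format may differ.
type GlossaryEntry = { acronym: string; expansion: string }

// Hypothetical helper: derive a stable objectID fragment from a term.
function slug(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-|-$/g, '')
}

// Acronym/expansion pairs derived from the glossary.
function fromGlossary(entries: GlossaryEntry[]): SynonymRecord[] {
  return entries.map((e): SynonymRecord => ({
    objectID: `glossary-${slug(e.acronym)}`,
    type: 'synonym',
    synonyms: [e.acronym, e.expansion],
  }))
}

// Curated rename/phrasing pairs, taken from the commit message above.
const curatedPairs: [string, string][] = [
  ['magic link', 'email link'],
  ['login', 'sign in'],
  ['i18n', 'localization'],
]

function fromCurated(pairs: [string, string][]): SynonymRecord[] {
  return pairs.map(([a, b]): SynonymRecord => ({
    objectID: `curated-${slug(a)}`,
    type: 'synonym',
    synonyms: [a, b],
  }))
}

// The combined list is what would be sent as a full replace on each run.
const synonymRecords: SynonymRecord[] = [
  ...fromGlossary([{ acronym: 'DKIM', expansion: 'DomainKeys Identified Mail' }]),
  ...fromCurated(curatedPairs),
]

console.log(synonymRecords.map((s) => s.objectID))
```

Stable objectIDs keep the generated set deterministic, so two runs over the same glossary produce the same records.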

Note that the indexer (scripts/update-algolia-records.ts) is the source of truth for the search index's faceting, ranking, and synonyms, and that tuning them in the Algolia dashboard gets reverted on the next index run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

manovotny · 2026-05-30T22:28:14Z

Beyond the per-variant sdk field, this branch now also codifies the docs index's faceting, ranking, and synonyms in scripts/update-algolia-records.ts, so they're enforced on every index run instead of living as mutable dashboard state.

What's in this PR specifically:

sdk added to attributesForFaceting (filterOnly) — what makes the optionalFilters: sdk:<active> boost on the client side work at all.
Ranking reorder enforced in the indexer — attribute/exact above proximity, so title/heading matches beat body-content matches.
Hybrid synonyms — acronyms auto-derived from the _tooltips glossary + a curated phrasing list (fixes magic link, i18n, DKIM, login, …).
AGENTS.md: don't tune ranking/synonyms/faceting in the Algolia dashboard — they're codified here and revert on every reindex.

Paired with the client-side active-SDK boost in clerk/clerk#2661. The full data-backed writeup — why boost sdk over availableSDKs, the ranking reorder, the synonym set, and every rejected alternative with measurements — lives there: https://github.com/clerk/clerk/pull/2661#issuecomment-4585030268
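For readers who haven't opened the client PR, the boost might look roughly like the sketch below. `buildSearchParams` and `DocsSearchParams` are hypothetical names, and the real client code in clerk/clerk#2661 may be structured differently. The point is only that `optionalFilters` boosts matching records without excluding the rest, and that it relies on `sdk` being declared in `attributesForFaceting`.

```typescript
// Hypothetical shape for the subset of search parameters used here.
type DocsSearchParams = { query: string; optionalFilters?: string[] }

// Hypothetical helper: boost (but do not require) records whose `sdk`
// matches the reader's active SDK. With no active SDK, no boost is applied.
function buildSearchParams(query: string, activeSdk?: string): DocsSearchParams {
  if (!activeSdk) return { query }
  return { query, optionalFilters: [`sdk:${activeSdk}`] }
}

console.log(buildSearchParams('sign in', 'nextjs'))
console.log(buildSearchParams('sign in'))
```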

NWylynko · 2026-06-02T15:00:55Z

+    await algolia.setSettings({
+      indexName: ALGOLIA_INDEX_NAME,
+      indexSettings: {
+        attributesForFaceting: [
+          ...attributesForFaceting,
+          ...missingFacets.map((attribute) => `filterOnly(${attribute})`),
+        ],
+        ranking,
+      },
+    })


Suggested change

-    await algolia.setSettings({
-      indexName: ALGOLIA_INDEX_NAME,
-      indexSettings: {
-        attributesForFaceting: [
-          ...attributesForFaceting,
-          ...missingFacets.map((attribute) => `filterOnly(${attribute})`),
-        ],
-        ranking,
-      },
-    })
+    await algolia.setSettings({
+      indexName: ALGOLIA_INDEX_NAME,
+      indexSettings: {
+        attributesForFaceting: [
+          ...attributesForFaceting,
+          ...missingFacets.map((attribute) => `filterOnly(${attribute})`),
+        ],
+        ranking,
+      },
+      forwardToReplicas: true,
+    })

@manovotny should the setSettings call also forward the change to replicas? Granted, I don't think we're using replicas, but I think we should be consistent.

Good Q — went back and forth on this, landed on: no, setSettings shouldn't forward.

We're codifying these settings (declare + overwrite every run), which is itself the mechanism for keeping indexes consistent. forwardToReplicas is the other model — "tune the primary, propagate it" — so under codification it's redundant: the script configures whatever index it targets directly.

It's also the riskier default here specifically because this setSettings bundles ranking/customRanking. A standard replica usually exists to hold a different sort, and forwardToReplicas: true would overwrite that on every index run. So if we ever add replicas, the right move is to declare them in the script (which can express per-replica settings), not blanket-forward.

The synonyms call does forward, and that's deliberate — synonyms should always be identical across replicas, so it's a safe no-op today and the correct behavior if a replica ever shows up. Added a comment in efe7c94 spelling out the asymmetry so it doesn't read like an oversight.

Two notes: there are no replicas on dev_docs/prod_docs today, so this is all forward-looking; and heads up the suggestion is anchored to the pre-refactor block (the read/merge/missingFacets code is gone as of d7d3955 / 9d61ccc), so it can't be committed directly.
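The asymmetry can be pictured as two request payloads. This is a sketch, not the script itself: the parameter names follow the Algolia JavaScript client's `setSettings` and `saveSynonyms` request shapes as I understand them, the index name is a placeholder, and the single synonym is only there to make the object complete.

```typescript
// Placeholder value for illustration.
const ALGOLIA_INDEX_NAME = 'dev_docs'

// Settings bundle ranking, which a replica may legitimately override for an
// alternate sort, so forwardToReplicas is deliberately left out here.
const settingsRequest = {
  indexName: ALGOLIA_INDEX_NAME,
  indexSettings: {
    ranking: ['typo', 'geo', 'words', 'filters', 'attribute', 'exact', 'proximity', 'custom'],
  },
}

// Synonyms should be identical on every replica, so this call forwards.
// replaceExistingSynonyms makes each run a full replace.
const synonymsRequest = {
  indexName: ALGOLIA_INDEX_NAME,
  synonymHit: [{ objectID: 'curated-login', type: 'synonym', synonyms: ['login', 'sign in'] }],
  forwardToReplicas: true,
  replaceExistingSynonyms: true,
}

console.log('forwardToReplicas' in settingsRequest) // false
console.log(synonymsRequest.forwardToReplicas) // true
```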

Overwrite attributesForFaceting and ranking on every index run instead of reading current settings, diffing, and additively merging facets. The indexer is the source of truth for these settings, so dashboard edits should revert on the next run -- but the additive facet merge was preserving the very drift it was meant to overwrite (ranking already overwrote; faceting now matches). setSettings is a top-level partial merge, so we declare only the keys we own and leave customRanking/searchableAttributes untouched. availableSDKs is intentionally not faceted: the client only retrieves it to render per-result SDK icons (Search.tsx SDKsIcon), never filters or counts on it, and retrieval is independent of faceting. branch/record_batch are filterOnly -- the stale-record cleanup uses facetFilters, never facet counts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
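A small local model of why the additive merge preserved drift. The `drifted` starting state and the helper names are assumptions for illustration; the removed read/merge code is only partially visible in this thread.

```typescript
const declaredFacets = ['filterOnly(branch)', 'filterOnly(record_batch)', 'filterOnly(sdk)']

// Strip a modifier such as filterOnly(...) to get the bare attribute name.
function attributeName(facet: string): string {
  const match = facet.match(/^\w+\((.+)\)$/)
  return match?.[1] ?? facet
}

// Old model (sketch): keep whatever is on the index, add only missing attributes.
function additiveMerge(current: string[], declared: string[]): string[] {
  const present = new Set(current.map(attributeName))
  const missing = declared.filter((facet) => !present.has(attributeName(facet)))
  return [...current, ...missing]
}

// New model: the declared list replaces the current one wholesale.
function overwrite(_current: string[], declared: string[]): string[] {
  return [...declared]
}

// Assumed drifted state: a stray facet plus non-filterOnly branch/record_batch.
const drifted = ['availableSDKs', 'branch', 'record_batch']

console.log(additiveMerge(drifted, declaredFacets))
// ['availableSDKs', 'branch', 'record_batch', 'filterOnly(sdk)']
console.log(overwrite(drifted, declaredFacets))
// ['filterOnly(branch)', 'filterOnly(record_batch)', 'filterOnly(sdk)']
```

The additive result still carries the stray `availableSDKs` facet and the un-normalized `branch`/`record_batch` entries, which is the drift the overwrite removes.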

manovotny · 2026-06-02T15:44:04Z

Follow-up: the indexer now codifies the docs index's relevance settings — declares them and overwrites on every run. d7d3955, 9d61ccc

Acting on review feedback: instead of read current settings → diff → additively merge each run, the indexer now declares the settings it owns and overwrites them every run, so any dashboard edit reverts on the next reindex. The additive facet merge actually worked against the goal — it preserved whatever was set in the dashboard rather than overwriting it, so indexes could (and did) drift apart.

const searchableAttributes = ['unordered(hierarchy.lvl0)', /* …lvl1–6 */ 'content', 'unordered(keywords)']
const attributesForFaceting = ['branch', 'record_batch', 'sdk'].map((a) => `filterOnly(${a})`)
const ranking = ['typo', 'geo', 'words', 'filters', 'attribute', 'exact', 'proximity', 'custom']
const customRanking = ['desc(weight.pageRank)', 'desc(weight.level)', 'asc(weight.position)']
await algolia.setSettings({ indexName: ALGOLIA_INDEX_NAME, indexSettings: { searchableAttributes, attributesForFaceting, ranking, customRanking } })

Why these four and not the whole settings object (i.e. we don't codify defaults — correct): setSettings is a top-level partial merge, so we declare only the relevance levers we deliberately own and leave everything else (typo tolerance, pagination, highlighting, and Algolia's server-managed defaults) untouched. Snapshotting the full object would freeze those defaults at snapshot time and silently clobber any future default change or intentional tweak — diff noise and maintenance for zero benefit. Synonyms stay generated from _tooltips (already a full replace).
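The "top-level partial merge" behavior can be modeled locally as an object spread. This is only a mental model of the semantics described above, not Algolia's implementation, and the `hitsPerPage` value is made up for the example.

```typescript
type IndexSettingsModel = { [key: string]: unknown }

// Local model of a top-level partial merge: keys present in `patch` replace
// the stored value wholesale; keys absent from `patch` are left alone.
function applyPartial(current: IndexSettingsModel, patch: IndexSettingsModel): IndexSettingsModel {
  return { ...current, ...patch }
}

const onIndex: IndexSettingsModel = {
  hitsPerPage: 20, // illustrative value, not taken from the PR
  customRanking: ['desc(weight.pageRank)', 'desc(weight.popularity)'],
}

const declaredPatch: IndexSettingsModel = {
  customRanking: ['desc(weight.pageRank)', 'desc(weight.level)', 'asc(weight.position)'],
}

const afterRun = applyPartial(onIndex, declaredPatch)
console.log(afterRun)
```

Here `hitsPerPage` survives because the patch never mentions it, while `customRanking` is replaced as a whole rather than merged element by element.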

The two added in 9d61ccc8f:

searchableAttributes — the corpus + attribute priority the attribute ranking criterion rides on (it's what makes a heading match beat a body match, so the reorder depends on it). Identical across dev/prod today, so this locks in the value the ranking work was tested against rather than freezing an unknown state.
customRanking — the final tiebreaker, normalized to the weights the indexer actually writes (pageRank/level/position). Drops the dead desc(weight.popularity) entry lingering on dev_docs — it's never written to records, so it ranked nothing.
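As an illustration of what the normalized `customRanking` expresses, the comparator below orders records by `pageRank` descending, then `level` descending, then `position` ascending. It assumes the weights are numeric and uses made-up sample records; in Algolia this ordering only applies as the last tie-breaker among records that are equal on every earlier ranking criterion.

```typescript
type Weight = { pageRank: number; level: number; position: number }
type SampleHit = { title: string; weight: Weight }

// Mirrors desc(weight.pageRank), desc(weight.level), asc(weight.position).
function compareByCustomRanking(a: SampleHit, b: SampleHit): number {
  return (
    b.weight.pageRank - a.weight.pageRank ||
    b.weight.level - a.weight.level ||
    a.weight.position - b.weight.position
  )
}

// Made-up records for illustration only.
const sampleHits: SampleHit[] = [
  { title: 'body paragraph', weight: { pageRank: 0, level: 30, position: 12 } },
  { title: 'page heading', weight: { pageRank: 0, level: 90, position: 0 } },
  { title: 'boosted page', weight: { pageRank: 5, level: 30, position: 40 } },
]

const ordered = [...sampleHits].sort(compareByCustomRanking).map((h) => h.title)
console.log(ordered) // ['boosted page', 'page heading', 'body paragraph']
```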

Why availableSDKs is intentionally not faceted — it powers the per-result SDK icons, but the client only retrieves it (Search.tsx → SDKsIcon); it never filters or counts on it, and retrieval is independent of faceting. So the availableSDKs facet that existed on dev_docs was inert — dropping it doesn't touch the icons. branch/record_batch are filterOnly because the stale-record cleanup filters via facetFilters, never facet counts.
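A hedged sketch of the kind of filter the stale-record cleanup could use. The actual cleanup code is not shown in this thread, so `staleRecordFilters` and the batch identifier format are hypothetical. It relies on two documented `facetFilters` behaviors: entries in the outer array are ANDed, and a `-` prefix on the value negates the match.

```typescript
// Hypothetical helper: select this branch's records that do NOT belong to
// the batch written by the current run.
function staleRecordFilters(branch: string, currentBatch: string): string[] {
  return [`branch:${branch}`, `record_batch:-${currentBatch}`]
}

console.log(staleRecordFilters('main', 'batch-42'))
// ['branch:main', 'record_batch:-batch-42']
```

Filtering like this needs the attributes to be declared for faceting, but never needs facet counts, which is why `filterOnly` is sufficient.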

Effect on the next index run: adds sdk faceting everywhere, normalizes branch/record_batch to filterOnly, drops the stray availableSDKs facet and the dead customRanking entry on dev_docs, and locks searchableAttributes/ranking to the tested values — leaving prod/dev/test with an identical, code-defined config.

A "warn if the dashboard was edited" guard was considered and deferred: since the indexer rewrites these every run, drift self-heals on the next index, so an explicit alert is more than this needs right now.

Extend the declared settings to the remaining relevance levers so the indexer fully owns them and indexes can't drift apart: - searchableAttributes: the corpus + attribute priority the `attribute` ranking criterion (promoted above proximity) rides on. Identical across dev/prod today, so this just locks in the value the ranking work was tested against. - customRanking: the final tiebreaker, normalized to the weights the indexer actually writes (pageRank/level/position). Drops the dead desc(weight.popularity) entry lingering on some indexes -- it's never written to records, so it ranks nothing. Still a scoped declaration, not a full settings snapshot: setSettings is a partial merge, so Algolia's server-managed defaults stay untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nyms do The asymmetry read like an oversight in review. It's deliberate: synonyms must be identical across replicas, but settings bundle ranking/customRanking, which a standard replica may override for an alternate sort -- forwarding would clobber it. Comment-only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

NWylynko · 2026-06-02T18:43:13Z

@manovotny

Adding the settings update to the script creates a bit of a split in how the script is understood. Before the settings updates were added, the script scoped itself entirely to the runner's current branch: it included the branch name in the records being added, which let the search client scope itself to that branch and know it was only getting that set of results.

But updating the index settings applies across all records in that index, not just the records scoped to the branch being updated. IMO this introduces a split in how you'd expect the script to run: either it doesn't care about the git branch and there's just a single 'dev_docs' index, or changes only affect the search items being updated, in which case you'd expect settings to change only for that branch too.

This ultimately stems from a limitation of Algolia and a desire on my part not to spin up a whole new index for every search development branch. So if this is a trade-off we're happy with, then I'm happy to move ahead, but I haven't seen this addressed, so I just want to get your opinion.

…s split Nick flagged that the script is branch-scoped for records but index-wide for settings, which reads as a split. It's inherent (Algolia has no per-branch settings) and accepted: shared index + branch-scoped records is the cost- conscious middle ground; isolated settings experiments use a personal index. - Script: scope note in the settings block explaining the split + the escape hatch. - AGENTS.md: new "Search index (Algolia)" section capturing the indexer model — indexes, branch-scoping, codified settings (4 + synonyms, declare-and-overwrite), filterOnly faceting + the availableSDKs/retrieval gotcha, forwardToReplicas asymmetry, ranking↔searchableAttributes coupling, and local testing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

manovotny · 2026-06-02T19:53:54Z

Totally fair catch — it's the sharpest point on the thread, and you're right that it was undocumented.

The way I've come to think about it: records and settings are genuinely different kinds of thing, so the two scopes are each correct rather than inconsistent. Records are data — rightly per-branch (the branch: tag + client filter is what lets us share one index). Settings/synonyms are the index's relevance contract — a property of the index, not of a branch's content. Algolia has no per-branch settings, and conceptually there shouldn't be: "how this index ranks" isn't branch-specific.

Codifying also shrinks the downside you're pointing at. Before, a run could push arbitrary hand-tuned settings index-wide; now every run converges the index to the same declared values, and any change to them is a reviewed diff. So a content branch's indexer run just re-asserts canonical settings — nothing leaks. The seam only bites a search-relevance branch (rare — like this one) that's actively changing settings, and the escape hatch already exists and is exactly what I used here: point ALGOLIA_INDEX_NAME at a personal throwaway index. You pay for an extra index only while experimenting with settings, not per branch.

Full per-branch isolation is the "pure" answer — if usage/cost weren't a factor I'd spin up an index per branch and call it done. But that's a lot of indexes (and money) for a rare need, so shared dev_docs + branch-scoped records + a personal index for settings experiments is the middle ground I'd rather steward toward.
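The middle ground can be sketched in a few lines. Both helpers are hypothetical: the `dev_docs` fallback and the way the environment is passed in are assumptions, and the real indexer may resolve its index name and tag records differently.

```typescript
// Hypothetical helper: use the shared dev index unless ALGOLIA_INDEX_NAME
// points at a personal throwaway index for settings experiments.
function resolveIndexName(env: { [key: string]: string | undefined }): string {
  return env['ALGOLIA_INDEX_NAME'] ?? 'dev_docs'
}

// Hypothetical helper: tag each record with its branch so the search client
// can scope results to one branch inside a shared index.
function tagWithBranch<T extends object>(records: T[], branch: string): (T & { branch: string })[] {
  return records.map((record) => ({ ...record, branch }))
}

console.log(resolveIndexName({})) // 'dev_docs'
console.log(resolveIndexName({ ALGOLIA_INDEX_NAME: 'my_scratch_docs' })) // 'my_scratch_docs'
console.log(tagWithBranch([{ objectID: 'quickstart-1' }], 'nick/include-guides-sdk-in-search'))
```

Records stay branch-scoped in either case; only the index that receives the settings changes when the override is set.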

So — accepted trade-off, agreed. And to your "haven't seen this addressed": it is now, in f85cd12 — a scope note in the settings block, plus a new "Search index (Algolia)" section in AGENTS.md spelling out the split, the codified settings, and the personal-index escape hatch.

include the pages activeSdk in algolia search results

8082244

NWylynko requested a review from a team as a code owner May 29, 2026 21:56

NWylynko requested a review from manovotny May 29, 2026 22:04

manovotny and others added 3 commits May 30, 2026 16:22

vercel Bot deployed to Preview May 30, 2026 22:30 View deployment

manovotny approved these changes Jun 2, 2026

NWylynko commented Jun 2, 2026

vercel Bot deployed to Preview June 2, 2026 15:46 View deployment

vercel Bot deployed to Preview June 2, 2026 16:09 View deployment

vercel Bot deployed to Preview June 2, 2026 17:22 View deployment

vercel Bot deployed to Preview June 2, 2026 19:55 View deployment

NWylynko wants to merge 8 commits into main from nick/include-guides-sdk-in-search


2 participants
