
fix(intelligence): label citation gains/regressions with the project's own domain #628

Merged
arberx merged 1 commit into AINYC:main from evolv3ai:fix/intelligence-competitor-citation-misattribution
May 25, 2026

Conversation

@evolv3ai
Contributor

Summary

A project citation gain/regression is correctly gated on citationState === 'cited' (the project's own domain appearing in the sources) — that part is fine. But the citationUrl carried into the resulting insight was citedDomains[0]:

// intelligence-service.ts buildRunData (before)
citationUrl: domains[0] ?? undefined,   // domains = the FULL citedDomains set

citedDomains is the full co-cited set in provider order: project domains, tracked competitors, and third-party references intermingled. When a competitor or third party sorts ahead of the project's own domain, the insight's recommendation.target points at the wrong site, so a genuine regression on the project's page surfaces as "audit winntile.com".

The detection is right; only the labeled URL is wrong.
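
To make the failure mode concrete, here is a minimal sketch of the mismatch. The variable names and the simplified gate are assumptions for illustration; the real gate is the citationState check described above, not a direct array lookup.

```typescript
// Hypothetical run result: provider order happens to put a competitor first.
const citedDomains: readonly string[] = ["winntile.com", "example.com"];
const projectDomain = "example.com";

// Simplified stand-in for the gate: the project's own domain is among the sources.
const projectIsCited = citedDomains.some((d) => d === projectDomain);

// The old labeling took the first entry of the full co-cited set.
const labeledUrl: string | undefined = citedDomains[0];

// projectIsCited is true, yet labeledUrl is "winntile.com":
// the detection is correct, the label points at the competitor.
```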

Fix

buildRunData now resolves Snapshot.citationUrl through a new helper:

// citation-utils.ts
export function pickProjectCitedDomain(
  citedDomains: readonly string[],
  projectDomains: string[],
): string | undefined

It returns the first cited domain that matches the project's canonical or owned domains (reusing the existing domainMatches semantics), or undefined when the citation was established via a grounding-source match with no project domain in citedDomains — better an empty target than a competitor's. Project domains are loaded once per run in buildRunData via effectiveDomains.
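
As a rough illustration of the behavior described above, the helper could be sketched as follows. This is not the merged implementation: domainMatchesSketch is a hypothetical stand-in for the existing domainMatches, and its rules (case-insensitive comparison, ignoring a leading "www.", treating subdomains as matches) are assumptions.

```typescript
// Hypothetical normalization; the real domainMatches semantics may differ.
function normalizeDomain(domain: string): string {
  return domain.trim().toLowerCase().replace(/^www\./, "");
}

// Assumed matching rule: exact host match, or a subdomain of a project domain.
function domainMatchesSketch(cited: string, projectDomain: string): boolean {
  const c = normalizeDomain(cited);
  const p = normalizeDomain(projectDomain);
  if (p === "") return false;
  return c === p || c.endsWith("." + p);
}

// Returns the first cited domain owned by the project, preserving provider
// order, or undefined when no cited domain belongs to the project.
function pickProjectCitedDomainSketch(
  citedDomains: readonly string[],
  projectDomains: readonly string[],
): string | undefined {
  return citedDomains.find((cited) =>
    projectDomains.some((own) => domainMatchesSketch(cited, own)),
  );
}

// Competitor sorted first: the project's domain is still the one returned.
const picked = pickProjectCitedDomainSketch(
  ["winntile.com", "example.com"],
  ["example.com"],
); // "example.com"

// Grounding-only citation: no project domain in the list, so no target.
const groundingOnly = pickProjectCitedDomainSketch(
  ["winntile.com"],
  ["example.com"],
); // undefined
```

Returning undefined instead of falling back to the first entry keeps the recommendation target empty when nothing in the cited set belongs to the project.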

The transition gate (cited: r.citationState === CitationStates.cited) is untouched.

Tests

Three new intelligence-service.test.ts cases:

  • regression with citedDomains: ['winntile.com', 'example.com'] → previousCitationUrl and the persisted recommendation.target are example.com, not the competitor sorted first.
  • gain with the same shape → citationUrl is example.com.
  • grounding-only (citationState: 'cited', citedDomains: ['winntile.com'], no project domain) → previousCitationUrl is undefined.

All existing intelligence-service tests use citedDomains: ['example.com'] (project first), so they're unaffected.

Test plan

  • pnpm --filter @ainyc/canonry run typecheck
  • npx vitest run packages/canonry/test/intelligence-service.test.ts — 31 passed (28 existing + 3 new)
  • Full pnpm run test — only 2 unrelated pre-existing failures remain (packages/db/test/schedules-migration.test.ts, which shells out to a sqlite3 CLI binary absent from the sandbox; verified identical on clean main)
  • eslint clean on both changed source files (0 new warnings)
  • Version bump 4.55.2 → 4.55.3 (root + @ainyc/canonry)

Context

Found while triaging a real project whose dashboard kept surfacing audit <competitor>.com recommendations for its own citation losses. Companion to #399 (the short-displayName matcher fix) — same investigation, different subsystem.

🤖 Generated with Claude Code

…s own domain

A project citation gain/regression is correctly gated on
`citationState === 'cited'` (project-owned domain in the sources), but
the `citationUrl` carried into the insight was `citedDomains[0]` — the
first entry of the FULL co-cited set, in provider order. When a
competitor or third-party sorts ahead of the project's own domain, the
insight's `recommendation.target` points at the wrong site, e.g. a real
regression on the project's page surfaced as "audit winntile.com".

Fix: `buildRunData` now resolves `Snapshot.citationUrl` via a new
`pickProjectCitedDomain(citedDomains, projectDomains)` helper, which
returns the first cited domain that matches the project's canonical or
owned domains, or `undefined` when the citation was established via a
grounding-source match with no project domain in `citedDomains` (better
an empty target than a competitor's). Project domains are loaded once per
run in `buildRunData` via `effectiveDomains`.

The detection gate is unchanged — this only corrects the labeled URL.

Tests: three new intelligence-service cases — regression and gain label
the project domain (not a co-cited competitor sorted first), and a
grounding-only citation leaves the URL undefined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@arberx arberx merged commit 46f0b91 into AINYC:main May 25, 2026
8 checks passed