Hebbian weight vs. query-local relevance tradeoff (collapse v2) #23

@ClaudioDrews

Summary

The collapse module's Hebbian cross-source corroboration (amplify_gain) was applied as a blind multiplier: facts that appeared in multiple sources received a flat 15% boost per corroborating source, regardless of how relevant they were to the current query. When the query was about Qdrant configuration, this rewarded globally important structural facts (e.g., "Honcho abandoned as memory platform") over query-locally relevant ones (e.g., a domain-specific fact about Qdrant usage).

Root cause

In score_all() (icarus/collapse.py:162):

boost = min(corro * amplify_gain, amplify_cap)

amplify_gain (0.15) was applied without any interaction with the candidate's base salience. A fact with Corro:1 and base 0.640 received the same +15% boost as a fact with Corro:1 and base 0.900, even though the lower-base fact is less query-relevant and should not receive full amplification.

Fix (committed in #24)

boost = min(corro * amplify_gain * bases[i], amplify_cap)

The base salience — which already encodes query-token overlap — attenuates the Hebbian boost. A globally-important fact with low query relevance receives proportionally less amplification.
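The difference between the two formulas can be sketched in isolation. This is a minimal illustration, not the code in icarus/collapse.py: the function names `boost_v1` and `boost_v2` are hypothetical, and the `amplify_cap` value of 0.5 is an assumption, since the issue does not state the real cap.

```python
def boost_v1(corro: int, amplify_gain: float = 0.15, amplify_cap: float = 0.5) -> float:
    """v1: flat boost per corroborating source, independent of base salience.

    amplify_cap=0.5 is an assumed value for illustration only.
    """
    return min(corro * amplify_gain, amplify_cap)


def boost_v2(corro: int, base: float, amplify_gain: float = 0.15,
             amplify_cap: float = 0.5) -> float:
    """v2: the same boost, attenuated by the candidate's base salience."""
    return min(corro * amplify_gain * base, amplify_cap)


# The two candidates from the root-cause example: same corroboration, different base.
for base in (0.640, 0.900):
    print(f"base={base:.3f}  v1={boost_v1(1):.3f}  v2={boost_v2(1, base):.3f}")
```

Under v1 both candidates get the same boost; under v2 the lower-base candidate gets a smaller one, which is the intended attenuation.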

Effect on real data (query: "como configurar o Qdrant para memory-os")

| Candidate | Base | Corro | Boost (v1) | Boost (v2) | Salience (v2) |
|---|---|---|---|---|---|
| Memory OS origin (fabric) | 0.521 | 3 | 0.450 | 0.235 | 0.644 |
| Honcho abandoned (facts) | 0.556 | 1 | 0.150 | 0.083 | 0.603 |
| Qdrant usage fact (facts) | 0.546 | 0 | 0.000 | 0.000 | 0.546 |

Result: the query-relevant Qdrant fact now survives (it was pruned in v1). The Honcho fact still survives but dropped from rank 3 to rank 5.
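The boost columns in the table above can be recomputed from the Base and Corro columns. The sketch below does that, with two assumptions that are not confirmed by the issue: the cap value (0.5 here; the table only implies it is at least 0.45), and the final salience formula `base * (1 + boost)`, which is inferred because it matches the Salience (v2) column to within rounding. The real score_all() may combine the terms differently.

```python
AMPLIFY_GAIN = 0.15  # default stated in the issue
AMPLIFY_CAP = 0.5    # assumption: real value not given (row 1 implies >= 0.45)

# (candidate, base, corro, table boost v1, table boost v2, table salience v2)
ROWS = [
    ("Memory OS origin (fabric)", 0.521, 3, 0.450, 0.235, 0.644),
    ("Honcho abandoned (facts)", 0.556, 1, 0.150, 0.083, 0.603),
    ("Qdrant usage fact (facts)", 0.546, 0, 0.000, 0.000, 0.546),
]


def recompute(base: float, corro: int) -> tuple[float, float, float]:
    """Return (boost_v1, boost_v2, salience_v2) for one candidate."""
    v1 = min(corro * AMPLIFY_GAIN, AMPLIFY_CAP)
    v2 = min(corro * AMPLIFY_GAIN * base, AMPLIFY_CAP)
    # Inferred, not confirmed: boost applied multiplicatively to the base.
    salience = base * (1.0 + v2)
    return v1, v2, salience


for name, base, corro, _, _, _ in ROWS:
    v1, v2, salience = recompute(base, corro)
    print(f"{name:<28} v1={v1:.3f}  v2={v2:.3f}  salience={salience:.3f}")
```

The recomputed values differ from the table by about 0.001 in places, presumably because the Base column is itself rounded.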

Tuning

ICARUS_COLLAPSE_AMPLIFY_GAIN (env var, default 0.15) is the knob for this tradeoff:

  • Current (0.15): with attenuation, effective boost is ~0.08 per corroboration for average-base (~0.55) candidates, ~0.12 for high-base (~0.80) candidates.
  • If corroboration feels undervalued in production: raise to 0.20. This brings effective boosts back toward the flat pre-attenuation 0.15 (~0.11 per corroboration at average base, ~0.16 at high base).
  • If structural facts still crowd out query-relevant ones: lower to 0.10 or raise ICARUS_COLLAPSE_OVERLAP_WEIGHT (query overlap weight, default 0.55).
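The effective-boost figures in the bullets above are just gain × base. The sketch below tabulates them for the three gain values discussed. The environment variable name and its default come from this issue; the helper names `read_gain` and `effective_boost`, and the way the variable is parsed, are assumptions for illustration and may not match how icarus reads its configuration.

```python
import os
from typing import Mapping


def read_gain(env: Mapping[str, str] = os.environ) -> float:
    """Read the gain from the environment, falling back to the documented default."""
    return float(env.get("ICARUS_COLLAPSE_AMPLIFY_GAIN", "0.15"))


def effective_boost(gain: float, base: float) -> float:
    """Per-corroboration boost under v2 (before the cap is applied)."""
    return gain * base


AVERAGE_BASE = 0.55  # "average-base" candidate from the issue
HIGH_BASE = 0.80     # "high-base" candidate from the issue

for gain in (0.10, 0.15, 0.20):
    avg = effective_boost(gain, AVERAGE_BASE)
    high = effective_boost(gain, HIGH_BASE)
    print(f"gain={gain:.2f}  average-base={avg:.3f}  high-base={high:.3f}")
```

Passing a plain dict, e.g. `read_gain({"ICARUS_COLLAPSE_AMPLIFY_GAIN": "0.20"})`, makes it easy to try a setting without touching the real environment.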

Monitor via ICARUS_COLLAPSE_DEBUG=1 — logs the full salience-ranked pool with base, corroboration, and final salience for every candidate.

Related

  • Collapse module: icarus/collapse.py
  • Integration: icarus/hooks.py, _apply_collapse()
  • Tests: _test_collapse.py (all passing with v2)
