Summary
The collapse module's Hebbian cross-source corroboration (amplify_gain) was applied as a blind multiplier — facts that appeared in multiple sources received a flat 15% boost per corroborating source, regardless of how relevant they were to the current query. This rewarded globally-important structural facts (e.g., "Honcho abandoned as memory platform") over query-locally-relevant facts (e.g., a domain-specific fact about Qdrant usage) when the query was about Qdrant configuration.
Root cause
In score_all() (icarus/collapse.py:162):
boost = min(corro * amplify_gain, amplify_cap)
amplify_gain (0.15) was applied without interaction with the candidate's base salience. A fact with Corro:1 and base 0.640 received the same +15% boost as a fact with Corro:1 and base 0.900 — but the lower-base fact is less query-relevant and shouldn't receive full amplification.
Fix (committed in #24)
boost = min(corro * amplify_gain * bases[i], amplify_cap)
The base salience — which already encodes query-token overlap — attenuates the Hebbian boost. A globally-important fact with low query relevance receives proportionally less amplification.
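The difference between the two formulas can be sketched in isolation. This is a minimal illustration, not the code in icarus/collapse.py: the function names `boost_v1` and `boost_v2` are hypothetical, and the `CAP` value of 0.5 is an assumption, since the actual `amplify_cap` default is not stated here. The bases 0.640 and 0.900 are the ones from the root-cause example.

```python
def boost_v1(corro: int, amplify_gain: float, amplify_cap: float) -> float:
    """Pre-fix boost: depends only on the corroboration count."""
    return min(corro * amplify_gain, amplify_cap)


def boost_v2(corro: int, base: float, amplify_gain: float, amplify_cap: float) -> float:
    """Post-fix boost: additionally scaled by the candidate's base salience."""
    return min(corro * amplify_gain * base, amplify_cap)


GAIN = 0.15  # default amplify_gain from this description
CAP = 0.5    # hypothetical cap; the real amplify_cap default is not given here

for base in (0.640, 0.900):
    v1 = boost_v1(1, GAIN, CAP)
    v2 = boost_v2(1, base, GAIN, CAP)
    print(f"base={base:.3f}  v1 boost={v1:.3f}  v2 boost={v2:.3f}")
```

With one corroborating source, v1 gives both candidates 0.150, while v2 gives roughly 0.096 to the 0.640 candidate and 0.135 to the 0.900 candidate.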
Effect on real data (query: "como configurar o Qdrant para memory-os", Portuguese for "how to configure Qdrant for memory-os")
| Candidate | Base | Corro | Boost (v1) | Boost (v2) | Salience (v2) |
|---|---|---|---|---|---|
| Memory OS origin (fabric) | 0.521 | 3 | 0.450 | 0.235 | 0.644 |
| Honcho abandoned (facts) | 0.556 | 1 | 0.150 | 0.083 | 0.603 |
| Qdrant usage fact (facts) | 0.546 | 0 | 0.000 | 0.000 | 0.546 |
Result: the query-relevant Qdrant fact now survives (was pruned in v1). Honcho still survives but dropped from rank 3 to rank 5.
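The reported v2 saliences can be approximately reproduced from the base and corroboration columns. The sketch below assumes that final salience is `base * (1 + boost)`. That combination rule is inferred from the numbers in the table and is not confirmed by the source; `salience_v2` is a hypothetical helper and the cap of 0.5 is an assumed value. The results agree with the table to within about 0.001, which would be consistent with the displayed bases being rounded.

```python
GAIN = 0.15  # default amplify_gain from this description
CAP = 0.5    # hypothetical cap; the real amplify_cap default is not given here

# (candidate, base, corroboration count, reported v2 salience)
ROWS = [
    ("Memory OS origin (fabric)", 0.521, 3, 0.644),
    ("Honcho abandoned (facts)", 0.556, 1, 0.603),
    ("Qdrant usage fact (facts)", 0.546, 0, 0.546),
]


def salience_v2(base: float, corro: int, gain: float = GAIN, cap: float = CAP) -> float:
    """Base-attenuated boost, combined multiplicatively (assumed rule)."""
    boost = min(corro * gain * base, cap)
    return base * (1.0 + boost)


for name, base, corro, reported in ROWS:
    computed = salience_v2(base, corro)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```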
Tuning
ICARUS_COLLAPSE_AMPLIFY_GAIN (env var, default 0.15) is the knob for this tradeoff:
- Current (0.15): with attenuation, effective boost is ~0.08 per corroboration for average-base (~0.55) candidates, ~0.12 for high-base (~0.80) candidates.
- If corroboration feels undervalued in production: raise to 0.20. This brings effective boosts back toward the pre-attenuation level of 0.15 per corroboration (~0.11 at average base, ~0.16 at high base).
- If structural facts still crowd out query-relevant ones: lower to 0.10 or raise ICARUS_COLLAPSE_OVERLAP_WEIGHT (query overlap weight, default 0.55).
Monitor via ICARUS_COLLAPSE_DEBUG=1 — logs the full salience-ranked pool with base, corroboration, and final salience for every candidate.
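The per-corroboration figures quoted for each gain setting follow from `gain * base`. The sketch below tabulates them for the three gain values discussed above. `effective_boost` and `read_gain` are hypothetical helpers; how the module actually parses ICARUS_COLLAPSE_AMPLIFY_GAIN is an assumption here, and only the variable name and its 0.15 default come from this description.

```python
import os
from typing import Mapping


def read_gain(env: Mapping[str, str] = os.environ) -> float:
    """Read the gain from the environment, falling back to the 0.15 default."""
    return float(env.get("ICARUS_COLLAPSE_AMPLIFY_GAIN", "0.15"))


def effective_boost(gain: float, base: float) -> float:
    """Boost contributed by one corroborating source under the v2 formula."""
    return gain * base


print(f"configured gain: {read_gain():.2f}")
for gain in (0.10, 0.15, 0.20):
    avg = effective_boost(gain, 0.55)   # average-base candidate
    high = effective_boost(gain, 0.80)  # high-base candidate
    print(f"gain={gain:.2f}  average base: {avg:.3f}  high base: {high:.3f}")
```

Passing a plain dict to `read_gain` makes it possible to try a setting without touching the process environment, for example `read_gain({"ICARUS_COLLAPSE_AMPLIFY_GAIN": "0.20"})`.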
Related
- Collapse module: icarus/collapse.py
- Integration: icarus/hooks.py → _apply_collapse()
- Tests: _test_collapse.py (all passing with v2)