Hebbian weight vs. query-local relevance tradeoff (collapse v2)

## Summary

The collapse module's Hebbian cross-source corroboration (`amplify_gain`) was applied as a blind multiplier: facts that appeared in multiple sources received a flat 15% boost per corroborating source, regardless of how relevant they were to the current query. When the query was about Qdrant configuration, this rewarded **globally-important structural facts** (e.g., "Honcho abandoned as memory platform") over **query-locally-relevant facts** (e.g., a domain-specific fact about Qdrant usage).

## Root cause

In `score_all()` (`icarus/collapse.py:162`):

```python
boost = min(corro * amplify_gain, amplify_cap)
```

`amplify_gain` (0.15) was applied without any interaction with the candidate's base salience. A fact with one corroboration and base 0.640 received the same +15% boost as a fact with one corroboration and base 0.900, even though the lower-base fact is less query-relevant and shouldn't receive full amplification.
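To make the problem concrete, here is a minimal standalone sketch of the v1 formula. The function name `boost_v1` and the cap value of 0.5 are assumptions for illustration; the issue does not state the real `amplify_cap`.

```python
AMPLIFY_GAIN = 0.15  # default quoted in this issue
AMPLIFY_CAP = 0.5    # hypothetical cap; the real value is not stated here


def boost_v1(corro: int,
             amplify_gain: float = AMPLIFY_GAIN,
             amplify_cap: float = AMPLIFY_CAP) -> float:
    """v1 boost: depends only on the corroboration count, never on the base."""
    return min(corro * amplify_gain, amplify_cap)


# Two candidates with the same corroboration count but different base salience.
candidates = [("low-base fact", 0.640, 1), ("high-base fact", 0.900, 1)]
for name, base, corro in candidates:
    # `base` is deliberately unused: v1 ignores it, so both boosts are identical.
    print(f"{name}: base={base:.3f} boost={boost_v1(corro):.3f}")
```

Both candidates print the same boost, which is the behavior the issue describes as blind to query relevance.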

## Fix (committed in #24)

```python
boost = min(corro * amplify_gain * bases[i], amplify_cap)
```

The base salience — which already encodes query-token overlap — attenuates the Hebbian boost. A globally-important fact with low query relevance receives proportionally less amplification.
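The attenuation can be sketched in isolation as follows. This is an illustrative helper, not the code in `score_all()`: the name `boost_v2` and the cap of 0.5 are assumptions, and only the formula itself comes from the fix above.

```python
def boost_v2(base: float, corro: int,
             amplify_gain: float = 0.15,
             amplify_cap: float = 0.5) -> float:
    """v2 boost: the corroboration term is scaled by the base salience.

    amplify_cap=0.5 is a hypothetical value; the issue does not state it.
    """
    return min(corro * amplify_gain * base, amplify_cap)


# Same corroboration count, different query relevance (base salience).
low = boost_v2(base=0.640, corro=1)   # about 0.096
high = boost_v2(base=0.900, corro=1)  # about 0.135
print(f"low-base boost={low:.3f}, high-base boost={high:.3f}")
```

With the base in the product, the lower-base candidate now earns a smaller boost than the higher-base one for the same corroboration count.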

## Effect on real data (query: "como configurar o Qdrant para memory-os", i.e. "how to configure Qdrant for memory-os")

| Candidate | Base | Corro | Boost (v1) | Boost (v2) | Salience (v2) |
|---|---|---|---|---|---|
| Memory OS origin (fabric) | 0.521 | 3 | 0.450 | 0.235 | 0.644 |
| Honcho abandoned (facts) | 0.556 | 1 | 0.150 | 0.083 | 0.603 |
| Qdrant usage fact (facts) | 0.546 | 0 | 0.000 | 0.000 | 0.546 |

Result: the query-relevant Qdrant fact now survives (it was pruned in v1). The Honcho fact still survives but dropped from rank 3 to rank 5.
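The boost columns can be recomputed from the base and corroboration columns. The sketch below does that, with two assumptions labeled in the code: a hypothetical cap of 0.5, and a final salience of `base * (1 + boost)`, which is inferred from the table values rather than stated in the issue. Because the bases shown are rounded to three decimals, the recomputed figures agree with the table only to within about 0.001.

```python
AMPLIFY_GAIN = 0.15
AMPLIFY_CAP = 0.5  # hypothetical; must be at least 0.45 to match the v1 column

# (candidate, base salience, corroboration count) as shown in the table
rows = [
    ("Memory OS origin (fabric)", 0.521, 3),
    ("Honcho abandoned (facts)", 0.556, 1),
    ("Qdrant usage fact (facts)", 0.546, 0),
]


def recompute(base: float, corro: int) -> tuple[float, float, float]:
    """Return (boost_v1, boost_v2, salience_v2) for one candidate."""
    boost_v1 = min(corro * AMPLIFY_GAIN, AMPLIFY_CAP)
    boost_v2 = min(corro * AMPLIFY_GAIN * base, AMPLIFY_CAP)
    # Assumption: the boost is applied multiplicatively to the base.
    salience_v2 = base * (1.0 + boost_v2)
    return boost_v1, boost_v2, salience_v2


for name, base, corro in rows:
    b1, b2, s2 = recompute(base, corro)
    print(f"{name:28s} v1={b1:.3f} v2={b2:.3f} salience={s2:.3f}")
```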

## Tuning

**`ICARUS_COLLAPSE_AMPLIFY_GAIN`** (env var, default 0.15) is the knob for this tradeoff:

- **Current (0.15)**: with attenuation, effective boost is ~0.08 per corroboration for average-base (~0.55) candidates, ~0.12 for high-base (~0.80) candidates.
- **If corroboration feels undervalued in production**: raise to 0.20. This brings effective boosts back to roughly the pre-attenuation level of 0.15 per corroboration (~0.11 at average base, ~0.16 at high base).
- **If structural facts still crowd out query-relevant ones**: lower to 0.10 or raise `ICARUS_COLLAPSE_OVERLAP_WEIGHT` (query overlap weight, default 0.55).
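The per-corroboration figures in the list above follow from multiplying the gain by the base. A small sketch, using a hypothetical helper name `effective_boost` and ignoring the cap, tabulates the three gain settings discussed:

```python
def effective_boost(amplify_gain: float, base: float) -> float:
    """Boost contributed by a single corroboration under v2 (cap ignored)."""
    return amplify_gain * base


AVERAGE_BASE = 0.55  # "average-base" candidate from the list above
HIGH_BASE = 0.80     # "high-base" candidate from the list above

for gain in (0.10, 0.15, 0.20):
    avg = effective_boost(gain, AVERAGE_BASE)
    high = effective_boost(gain, HIGH_BASE)
    print(f"gain={gain:.2f}  average-base={avg:.3f}  high-base={high:.3f}")
```

At a gain of 0.20 the two values fall on either side of the flat 0.15 that v1 applied, which is why that setting approximates pre-attenuation behavior.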

Monitor via `ICARUS_COLLAPSE_DEBUG=1`, which logs the full salience-ranked pool with base, corroboration, and final salience for every candidate.
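For experimenting with these knobs outside the module, the environment variables could be read along the following lines. This is only a sketch: `read_collapse_settings` is a hypothetical helper, and how `icarus/collapse.py` actually parses these variables (including what values it accepts for the debug flag) is not shown in this issue. Only the variable names and defaults come from the text above.

```python
import os
from typing import Mapping, Optional


def read_collapse_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the collapse tuning knobs, falling back to the documented defaults.

    Hypothetical helper: the real parsing in icarus/collapse.py may differ.
    """
    env = os.environ if env is None else env
    return {
        "amplify_gain": float(env.get("ICARUS_COLLAPSE_AMPLIFY_GAIN", "0.15")),
        "overlap_weight": float(env.get("ICARUS_COLLAPSE_OVERLAP_WEIGHT", "0.55")),
        # Assumption: debug logging is enabled only by the literal value "1".
        "debug": env.get("ICARUS_COLLAPSE_DEBUG", "0") == "1",
    }


# Example: simulate raising the gain and turning on debug output.
settings = read_collapse_settings({
    "ICARUS_COLLAPSE_AMPLIFY_GAIN": "0.20",
    "ICARUS_COLLAPSE_DEBUG": "1",
})
print(settings)
```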

## Related

- Collapse module: `icarus/collapse.py`
- Integration: `icarus/hooks.py` → `_apply_collapse()`
- Tests: `_test_collapse.py` (all passing with v2)
