TQ+ calibration: silent identity-freeze on small first add; fit from a cumulative warm-up sample

## Summary

TQ+ per-coordinate calibration is fitted from the **first add batch only** and then frozen for the life of the index. If that first batch has fewer than `TQPLUS_MIN_SAMPLES` (1000) vectors, the fit falls back to **identity** (`shift=0, scale=1`). That identity is stored as a non-empty calibration and frozen like any real fit, so every subsequent add (even millions of vectors) silently reuses it and real TQ+ is never fitted. The index permanently loses the TQ+ recall gain, with no error or warning.

The empty-first-add case was already fixed (the `n == 0` early-return in `add`); the 1–999 case was not.

## Repro

```python
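import numpy as np
from turbovec import TurboQuantIndex  # import path and float32 ndarray input are assumptions

idx = TurboQuantIndex(dim=128, bit_width=4)
idx.add(np.random.rand(500, 128).astype(np.float32))        # 500 < 1000 -> identity calibration, frozen
idx.add(np.random.rand(1_000_000, 128).astype(np.float32))  # never re-fits; whole index runs on identity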
```

## Root cause

- `encode::compute_tqplus_calibration` returns identity when `n < TQPLUS_MIN_SAMPLES` (`turbovec/src/encode.rs:149`).
- `TurboQuantIndex::add` fits + freezes calibration on the first add only (`turbovec/src/lib.rs:298-307`); subsequent adds reuse the frozen value.
- The index discards the original vectors after encoding (keeps only quantized codes + scales), and every encoded vector must share **one** frozen calibration (one coordinate system). So calibration cannot be re-fitted "late" — the already-encoded vectors can't be re-encoded.
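
The interaction of these three points can be reproduced in a toy model. The sketch below is illustrative only: `ToyIndex` and `fit_calibration` are hypothetical stand-ins, not the code in `encode.rs` or `lib.rs`, and the mean/std fit is an assumed placeholder for whatever TQ+ actually fits per coordinate.

```python
TQPLUS_MIN_SAMPLES = 1000  # threshold named in this issue


def fit_calibration(batch):
    """Toy per-coordinate fit: identity below the threshold, (mean, std) otherwise."""
    n = len(batch)
    dim = len(batch[0])
    if n < TQPLUS_MIN_SAMPLES:
        return [(0.0, 1.0)] * dim  # identity: shift=0, scale=1
    calibration = []
    for j in range(dim):
        column = [v[j] for v in batch]
        mean = sum(column) / n
        std = (sum((x - mean) ** 2 for x in column) / n) ** 0.5
        calibration.append((mean, std if std > 0 else 1.0))
    return calibration


class ToyIndex:
    def __init__(self):
        self.calibration = None  # None means "not fitted yet"
        self.count = 0

    def add(self, batch):
        if not batch:
            return  # mirrors the existing n == 0 early return
        if self.calibration is None:
            # Fitted once and frozen, even when the result is identity.
            self.calibration = fit_calibration(batch)
        self.count += len(batch)
```

Adding 500 vectors and then 5000 more to a `ToyIndex` leaves `calibration` at identity, which is the failure described above.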

## Why the trivial fix is insufficient

The minimal fix ("don't freeze identity; fit on the first batch that is >= 1000") handles "small seed then big load" but still fails for drip-fed small batches (every add < 1000 -> never calibrates), and only ever reflects one batch's distribution.
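
A small sketch of that rule makes the drip-feed failure concrete. `first_calibrating_batch` is a hypothetical helper that models only the decision "which add triggers the fit", not any real turbovec code.

```python
TQPLUS_MIN_SAMPLES = 1000  # threshold named in this issue


def first_calibrating_batch(batch_sizes, threshold=TQPLUS_MIN_SAMPLES):
    """Index of the first add that would trigger a fit under the minimal fix, or None."""
    for i, n in enumerate(batch_sizes):
        if n >= threshold:
            return i
    return None


# Small seed, then a big load: the second add calibrates.
seed_then_load = first_calibrating_batch([500, 1_000_000])

# 200 adds of 999 vectors each: 199,800 vectors ingested, no add ever qualifies.
drip_fed = first_calibrating_batch([999] * 200)
```

Here `seed_then_load` is `1` and `drip_fed` is `None`, even though the drip-fed index holds far more than the threshold in total.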

## Proper fix: dual-mode warm-up

Because originals are discarded and a single coordinate system is required, doing this properly means a dual-mode index:

- **Warm-up (below the sample threshold):** buffer the raw vectors, don't quantize yet, serve search by exact brute-force over that small buffer (<= threshold vectors — cheap, and higher quality than quantized at small scale).
- **At the threshold:** fit calibration from the full warm-up set, encode the whole buffer at once, freeze, drop the raw buffer.
- **Steady state:** stream-quantize with the frozen calibration, as today.

This eliminates the silent identity-freeze and calibrates from a proper cumulative sample, while keeping vectors searchable throughout.
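
The three phases can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not a proposed implementation: `WarmupIndex` and its methods are hypothetical names, the mean/std fit stands in for the real TQ+ fit, and `_encode` merely normalizes instead of quantizing.

```python
class WarmupIndex:
    """Toy dual-mode index: raw buffer during warm-up, frozen calibration after."""

    def __init__(self, dim, threshold=1000):
        self.dim = dim
        self.threshold = threshold
        self.buffer = []         # raw vectors, kept only during warm-up
        self.calibration = None  # list of (shift, scale); None while warming up
        self.codes = []          # stand-in for quantized storage

    @property
    def warmed_up(self):
        return self.calibration is not None

    def add(self, batch):
        if self.warmed_up:
            # Steady state: encode with the frozen calibration.
            self.codes.extend(self._encode(v) for v in batch)
            return
        self.buffer.extend(list(v) for v in batch)
        if len(self.buffer) >= self.threshold:
            # Threshold reached: fit once from the cumulative sample,
            # encode everything buffered so far, then drop the raw vectors.
            self.calibration = self._fit(self.buffer)
            self.codes = [self._encode(v) for v in self.buffer]
            self.buffer = []

    def _fit(self, vectors):
        n = len(vectors)
        calibration = []
        for j in range(self.dim):
            column = [v[j] for v in vectors]
            mean = sum(column) / n
            std = (sum((x - mean) ** 2 for x in column) / n) ** 0.5
            calibration.append((mean, std if std > 0 else 1.0))
        return calibration

    def _encode(self, v):
        # Placeholder for real quantization: shift and scale only.
        return tuple((x - s) / sc for x, (s, sc) in zip(v, self.calibration))

    def _decode(self, code):
        return [c * sc + s for c, (s, sc) in zip(code, self.calibration)]

    def search(self, query, k):
        # Warm-up: exact brute force over the raw buffer.
        # Steady state: brute force over decoded codes, standing in for the quantized scan.
        if self.warmed_up:
            candidates = [self._decode(c) for c in self.codes]
        else:
            candidates = self.buffer
        dists = [sum((a - b) ** 2 for a, b in zip(v, query)) for v in candidates]
        return sorted(range(len(dists)), key=dists.__getitem__)[:k]
```

In this sketch the add that crosses the threshold is included in the fit, so the calibration always reflects at least `threshold` vectors regardless of how the adds were batched.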

### Open design decisions

1. Search during warm-up: brute-force exact over the buffer (recommended) vs return nothing until warmed up.
2. Persistence: persist the raw buffer + mode (new `.tv`/`.tvim` format version) vs force a flush-with-current-data on save.
3. Keep `TQPLUS_MIN_SAMPLES = 1000`?
4. Memory: buffer costs up to `threshold * dim * 4` bytes raw (~6 MB at dim=1536, threshold 1000).
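
The memory figure in item 4 is plain arithmetic, assuming 4-byte `float32` components. `raw_buffer_bytes` is a hypothetical helper name used only for this check.

```python
def raw_buffer_bytes(threshold, dim, bytes_per_component=4):
    """Upper bound on the raw warm-up buffer size, assuming float32 components."""
    return threshold * dim * bytes_per_component


size = raw_buffer_bytes(threshold=1000, dim=1536)
print(size, "bytes =", size / 1e6, "MB")  # 6144000 bytes = 6.144 MB
```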

## Prior art

Qdrant's TurboQuant implementation makes the same base-vs-calibration split and solves the sampling with a **streaming** estimator (P²/P-Square over a reservoir maintained with Vitter's Algorithm R), because it quantizes during a build/optimize phase when vectors are present. turbovec is pure online ingest, hence the warm-up-buffer approach above. See https://qdrant.tech/articles/turboquant-quantization/
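
For reference, Algorithm R itself is short. The following is the generic textbook form, not Qdrant's code, and `reservoir_sample` is a hypothetical name; it keeps a uniform random sample of `k` items from a stream of unknown length.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Vitter's Algorithm R: uniform sample of k items from a one-pass stream."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)   # inclusive on both ends
            if j < k:
                reservoir[j] = item  # replace with probability k / (i + 1)
    return reservoir
```

Such a sampler bounds memory at `k` items, but it only helps when calibration can be fitted before encoding, which is the constraint that leads to the warm-up buffer here.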

## Positioning note (separate)

The base TurboQuant is genuinely data-oblivious / training-free; TQ+ is a lightweight data-dependent calibration. The "no training" claim is fine when scoped to the base algorithm (as Qdrant scopes it), but README/docs should be precise that TQ+ does a data-dependent calibration step. Worth a docs precision pass, tracked separately from the code change.

