
altor-vec

Client-side vector search. Rust + WASM. 54KB. Sub-millisecond.


Try Live Demo


Zero server. Zero API keys. Zero per-query cost. Your users' data never leaves their browser.

altor-vec is an HNSW vector similarity search engine written in Rust that compiles to 54KB of WebAssembly. Search 10,000 vectors in under 1ms — entirely client-side.

Why altor-vec?

You're paying Algolia $0.50 per 1,000 searches and sending your users' queries to a third party.

With altor-vec, search runs in the browser. $0 per query. Forever.

|                  | altor-vec | Algolia         | Voy      | Orama       |
| ---------------- | --------- | --------------- | -------- | ----------- |
| Runs client-side | Yes       | No              | Yes      | Yes         |
| Binary size      | 54KB gz   | N/A             | 75KB gz  | ~2KB*       |
| Algorithm        | HNSW      | BM25            | k-d tree | Brute-force |
| p95 latency      | 0.6ms     | ~50ms (network) | ~2ms     | ~5ms        |
| Per-query cost   | $0        | $0.50/1K        | $0       | Free tier   |

*Orama's 2KB is keyword search only; vector search adds significant size.

Get started in 30 seconds

```bash
npm install altor-vec
```

```js
import init, { WasmSearchEngine } from 'altor-vec';

await init();

// Load a pre-built index
const resp = await fetch('/index.bin');
const engine = new WasmSearchEngine(new Uint8Array(await resp.arrayBuffer()));

// Search — returns in <1ms
const results = JSON.parse(engine.search(queryEmbedding, 5));
// => [[nodeId, distance], ...]
```

That's it. No server to deploy. No API key to manage. No billing to worry about.
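If you don't have a pre-built index, you can build one at runtime from your own embeddings. `from_vectors` (see the API table below) takes the vectors as a single flat `Float32Array`; a minimal sketch of the flattening step, assuming row-major layout (the `flattenEmbeddings` helper is ours, not part of the library):

```javascript
// Hypothetical helper: pack an array of equal-length embeddings into one
// flat Float32Array (vector i occupies slots [i*dims, (i+1)*dims)).
function flattenEmbeddings(vectors) {
  const dims = vectors[0].length;
  const flat = new Float32Array(vectors.length * dims);
  vectors.forEach((v, i) => flat.set(v, i * dims));
  return { flat, dims };
}

// const { flat, dims } = flattenEmbeddings(embeddings);
// const engine = WasmSearchEngine.from_vectors(flat, dims, 16, 200, 50);
```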

Benchmarks

Latency (10K vectors, 384d)

| Environment | p95    |
| ----------- | ------ |
| Chrome      | 0.60ms |
| Node.js     | 0.50ms |
| Native Rust | 0.26ms |

Size

| Asset            | Size  |
| ---------------- | ----- |
| .wasm gzipped    | 54KB  |
| .wasm raw        | 117KB |
| Index (10K/384d) | 17MB  |
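A back-of-envelope check on the index size (our rough estimate, not the exact serialization format): the raw f32 vector data alone accounts for most of the 17MB, with the remainder presumably the HNSW graph structure.

```javascript
// Raw f32 vector payload for 10K vectors at 384 dimensions.
const n = 10000, dims = 384, bytesPerF32 = 4;
const rawMB = (n * dims * bytesPerF32) / 1e6; // 15.36 MB of the ~17MB index
```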

Use with a Web Worker (recommended for production)

Keep the main thread free — especially important on mobile:

```js
// worker.js
import init, { WasmSearchEngine } from 'altor-vec';

let engine;
self.onmessage = async (e) => {
  if (e.data.type === 'init') {
    await init();
    const resp = await fetch(e.data.indexUrl);
    engine = new WasmSearchEngine(new Uint8Array(await resp.arrayBuffer()));
    postMessage({ type: 'ready', count: engine.len() });
  }
  if (e.data.type === 'search') {
    const results = JSON.parse(engine.search(new Float32Array(e.data.query), e.data.topK));
    postMessage({ type: 'results', results });
  }
};
```

```js
// main.js — UI stays buttery smooth
const worker = new Worker('worker.js', { type: 'module' });
worker.postMessage({ type: 'init', indexUrl: '/index.bin' });
```
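The message protocol above is easy to wrap in a promise so call sites read like a normal async function. A minimal sketch (the `searchViaWorker` helper is ours, not part of the altor-vec API; a production version would also handle errors and correlate concurrent requests):

```javascript
// Hypothetical wrapper: one in-flight search per call, resolved on the
// next 'results' message from the worker.
function searchViaWorker(worker, query, topK = 5) {
  return new Promise((resolve) => {
    const onMessage = (e) => {
      if (e.data.type === 'results') {
        worker.removeEventListener('message', onMessage);
        resolve(e.data.results);
      }
    };
    worker.addEventListener('message', onMessage);
    worker.postMessage({ type: 'search', query: Array.from(query), topK });
  });
}

// const results = await searchViaWorker(worker, queryEmbedding, 5);
```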

API

| Method | Description |
| ------ | ----------- |
| `new WasmSearchEngine(bytes)` | Load a serialized index |
| `.from_vectors(flat, dims, m, ef_construction, ef_search)` | Build index from vectors |
| `.search(query, topK)` | Search → JSON `[[id, dist], ...]` |
| `.add_vectors(flat, dims)` | Add vectors to existing index |
| `.to_bytes()` | Serialize index |
| `.len()` | Vector count |
| `.free()` | Free WASM memory |

Parameters:

| Param | Default | What it does |
| ----- | ------- | ------------ |
| `m` | 16 | Connections per node. Higher = better recall, more RAM |
| `ef_construction` | 200 | Build-time beam width. Higher = better index, slower build |
| `ef_search` | 50 | Search-time beam width. Higher = better recall, slower search |

Works with any embedding model

| Model | Dims | Where it runs |
| ----- | ---- | ------------- |
| all-MiniLM-L6-v2 | 384 | Browser (Transformers.js) |
| text-embedding-3-small | 1536 | OpenAI API |
| embed-english-v3 | 1024 | Cohere API |

Fully client-side with Transformers.js — no API calls at all:

```js
import { pipeline } from '@huggingface/transformers';

const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const output = await embed('your query', { pooling: 'mean', normalize: true });
const results = JSON.parse(engine.search(new Float32Array(output.data), 5));
```

How it works

altor-vec uses HNSW (Hierarchical Navigable Small World) — the same algorithm behind Pinecone, Qdrant, and pgvector. HNSW builds a multi-layer graph where each node is a vector and edges connect nearby neighbors. Upper layers act as express lanes for coarse navigation; the bottom layer contains all vectors for fine-grained search. A query enters at the top and greedily descends to find the nearest neighbors in O(log n) time.
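The "express lanes" come from randomized layer assignment at insert time. A sketch of the standard rule from the HNSW paper (illustrative, not necessarily this crate's exact code): each node's top layer is drawn from an exponential distribution, so layer 0 holds every node and each higher layer holds exponentially fewer.

```javascript
// Standard HNSW layer selection: layer = floor(-ln(u) * mL), mL = 1/ln(M).
// With M = 16, roughly 15 of 16 nodes stay on layer 0; the rest form the
// sparse upper layers used for coarse navigation.
function randomLayer(M, rand = Math.random) {
  const mL = 1 / Math.log(M);
  return Math.floor(-Math.log(rand()) * mL);
}
```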

All vectors are L2-normalized at insert time, so dot product distance equals cosine similarity — no extra computation at search time.
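The identity is easy to verify: for unit-length vectors, cosine similarity and dot product are the same number. A quick worked example in plain JavaScript:

```javascript
// For L2-normalized vectors, dot product equals cosine similarity.
function dot(a, b) { return a.reduce((s, x, i) => s + x * b[i], 0); }
function normalize(a) {
  const n = Math.sqrt(dot(a, a));
  return a.map((x) => x / n);
}

const a = [3, 4], b = [4, 3];
const cosine = dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b))); // 0.96
const viaDot = dot(normalize(a), normalize(b)); // 0.96, same value
```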

Architecture

```
src/
├── lib.rs              # Public API re-exports
├── distance.rs         # Dot product, normalization (auto-vectorizes with SIMD)
└── hnsw/
    ├── mod.rs           # HnswIndex: API + serialization
    ├── graph.rs         # Layered graph structure
    ├── search.rs        # Greedy beam search
    └── construction.rs  # HNSW insert + random layer selection
```

```
wasm/
└── src/lib.rs          # WasmSearchEngine (wasm-bindgen wrapper)
```

Build from source

```bash
cargo test                # run tests
cargo bench               # run benchmarks
cd wasm && wasm-pack build --target web --release  # build WASM
```

Contributing

We welcome contributions! See CONTRIBUTING.md for build instructions, code style, and PR process.

License

MIT


Built by altor-lab
npm · Issues · Contact


Need managed semantic search? Embedding pipeline, index building, CDN delivery?
anshul@altorlab.dev
