Merged
4 changes: 2 additions & 2 deletions docs/api-reference/rest/openapi.yml
@@ -1489,7 +1489,7 @@ paths:
operationId: CreateTableIndex
description: |
Create an index on a table column for faster search operations.
- Supports vector indexes (IVF_FLAT, IVF_HNSW_SQ, IVF_PQ, etc.) and scalar indexes (BTREE, BITMAP, FTS, etc.).
+ Supports vector indexes (IVF_FLAT, IVF_HNSW_FLAT, IVF_HNSW_SQ, IVF_PQ, etc.) and scalar indexes (BTREE, BITMAP, FTS, etc.).
Index creation is handled asynchronously.
Use the `ListTableIndices` and `DescribeTableIndexStats` operations to monitor index creation progress.
requestBody:
@@ -2928,7 +2928,7 @@ components:
description: Name of the column to create index on
index_type:
type: string
- description: Type of index to create (e.g., BTREE, BITMAP, LABEL_LIST, IVF_FLAT, IVF_PQ, IVF_HNSW_SQ, FTS)
+ description: Type of index to create (e.g., BTREE, BITMAP, LABEL_LIST, IVF_FLAT, IVF_HNSW_FLAT, IVF_PQ, IVF_HNSW_SQ, FTS)
name:
type: string
nullable: true
2 changes: 1 addition & 1 deletion docs/geneva/udfs/batch-udtfs.mdx
@@ -158,7 +158,7 @@ class EdgeDetection:

### Index-based partitioning (`partition_by_indexed_column`)

- Instead of partitioning by a materialized column, the framework reads partition assignments directly from an existing **IVF vector index** (IVF_FLAT, IVF_PQ, IVF_HNSW_SQ, etc.). This avoids materializing a `partition_id` column and keeps partitions synchronized with the index.
+ Instead of partitioning by a materialized column, the framework reads partition assignments directly from an existing **IVF vector index** (IVF_FLAT, IVF_PQ, IVF_HNSW_FLAT, IVF_HNSW_SQ, etc.). This avoids materializing a `partition_id` column and keeps partitions synchronized with the index.

```python
from geneva.partitioning import create_ivf_flat_index
2 changes: 1 addition & 1 deletion docs/indexing/index.mdx
@@ -29,7 +29,7 @@ LanceDB provides a comprehensive suite of indexing strategies for different data
| Index | Use Case | Description |
| :--------- | :------- | :---------- |
| `IVF` (Vector) | Large-scale vector search with configurable accuracy/speed trade-offs. Supports binary vectors with hamming distance. | Inverted File Index—a partition-based approximate nearest neighbor algorithm that groups similar vectors into partitions for efficient search.<br />Distance metrics: `l2` `cosine` `dot` `hamming`<br />Quantizations: `None/Flat` `PQ` `SQ` `RQ`|
- | `IVF_HNSW` (Vector) | Large-scale vector search requiring both high recall and efficient partitioning. Combines the scalability of IVF with the search quality of HNSW. | Hybrid index combining IVF partitioning with HNSW graphs built within each partition. Provides improved search quality over pure IVF while maintaining scalability.<br />Distance metrics: `l2` `cosine` `dot`<br />Quantizations: `SQ`, `PQ`|
+ | `IVF_HNSW` (Vector) | Large-scale vector search requiring both high recall and efficient partitioning. Combines the scalability of IVF with the search quality of HNSW. | Hybrid index combining IVF partitioning with HNSW graphs built within each partition. Provides improved search quality over pure IVF while maintaining scalability.<br />Distance metrics: `l2` `cosine` `dot`<br />Quantizations: `None/Flat` `SQ` `PQ`|
| `FTS` (Full-text search) | String columns (e.g., title, description, content) requiring keyword-based search with BM25 ranking. | Full-text search index using BM25 ranking algorithm. Tokenizes text with configurable tokenization, stemming, stop word removal, and language-specific processing. |
| `BTree` (Scalar) | Numeric, temporal, and string columns with mostly distinct values. Best for highly selective queries on columns with many unique values. | Sorted index storing sorted copies of scalar columns with block headers in a btree cache. Header entries map to blocks of rows (4096 rows per block) for efficient disk reads. |
| `Bitmap` (Scalar) | Low-cardinality columns with few thousand or fewer distinct values. Accelerates equality and range filters. | Stores a bitmap for each distinct value in the column, with one bit per row indicating presence. Memory-efficient for low-cardinality data. |
21 changes: 13 additions & 8 deletions docs/indexing/vector-index.mdx
@@ -23,7 +23,7 @@ You can create and manage multiple vector indexes on any Lance dataset. LanceDB
<Info>
**IVF + HNSW**

- In LanceDB, HNSW is not exposed as a top-level vector index. Instead, it's available as a sub-index inside IVF partitions. What this means in practice is that vectors are first partitioned by IVF, then each selected partition is searched using an HNSW graph (with quantization via `IVF_HNSW_PQ` / `IVF_HNSW_SQ`). This combines IVF's scalability with HNSW's higher-recall ANN search within partitions.
+ In LanceDB, HNSW is not exposed as a top-level vector index. Instead, it's available as a sub-index inside IVF partitions. What this means in practice is that vectors are first partitioned by IVF, then each selected partition is searched using an HNSW graph. LanceDB supports the unquantized variant `IVF_HNSW_FLAT`, along with quantized variants such as `IVF_HNSW_PQ` and `IVF_HNSW_SQ`. This combines IVF's scalability with HNSW's higher-recall ANN search within partitions.
</Info>

### Manual Indexing
@@ -54,12 +54,13 @@ Use this table as a quick starting point for choosing the right index type and q

| If your top priority is... | Use this index | Why | Typical compressed size vs. raw vectors |
| :--- | :--- | :--- | :--- |
+ | Highest recall / no quantization | `IVF_HNSW_FLAT` | Uses raw vectors inside the IVF+HNSW structure, avoiding quantization loss. | Around raw vector size plus HNSW graph overhead |
| Best recall/latency trade-off | `IVF_HNSW_SQ` | Combines IVF partitioning with HNSW graph search for strong quality at low latency. | Typically a little larger than `1/4` of raw size |
| Maximum compression | `IVF_RQ` | RaBitQ-style quantization with very strong compression. | Around `1/32` of raw size |
| Higher accuracy at small dimensions (`dimension <= 256`) | `IVF_PQ` | On small-dimensional vectors, `IVF_PQ` often provides higher accuracy with similar performance compared to `IVF_RQ`. | Usually `1/64` to `1/16` of raw size (depends on `num_sub_vectors`) |

<Warning>
- If your vector search frequently includes metadata filters (`where(...)`), prefer `IVF_RQ` or `IVF_PQ`. In filtered workloads, `IVF_HNSW_SQ` latency can fluctuate significantly.
+ If your vector search frequently includes metadata filters (`where(...)`), prefer `IVF_RQ` or `IVF_PQ`. In filtered workloads, HNSW-backed IVF indexes such as `IVF_HNSW_FLAT` and `IVF_HNSW_SQ` can show higher latency variance.
</Warning>

Compression ratios are practical rules of thumb and can vary with vector distribution, metric, and configuration.
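
To make the rules of thumb in the table above concrete, here is a minimal sketch that turns the quoted ratios into byte estimates. It is not part of the PR or of LanceDB: the function name `estimate_vector_bytes` is hypothetical, raw vectors are assumed to be float32 (4 bytes per dimension), `IVF_PQ` is assumed to use 8-bit codes (one byte per sub-vector), and HNSW graph overhead is ignored.

```python
def estimate_vector_bytes(num_rows, dimension, index_type, num_sub_vectors=None):
    """Rough size of the stored vectors for an index type (illustrative only)."""
    # Assumption: raw vectors are float32, i.e. 4 bytes per dimension.
    raw_bytes = num_rows * dimension * 4

    if index_type == "IVF_HNSW_FLAT":
        # No quantization: about the raw size (graph overhead not modeled).
        return raw_bytes
    if index_type == "IVF_HNSW_SQ":
        # Rule of thumb from the table: roughly 1/4 of raw size.
        return raw_bytes // 4
    if index_type == "IVF_RQ":
        # Rule of thumb from the table: roughly 1/32 of raw size.
        return raw_bytes // 32
    if index_type == "IVF_PQ":
        # Assumption: one byte per sub-vector (8-bit PQ codes).
        if num_sub_vectors is None:
            num_sub_vectors = dimension // 8
        return num_rows * num_sub_vectors
    raise ValueError(f"unsupported index type: {index_type}")
```

Under these assumptions, `num_sub_vectors = dimension // 16` gives about `1/64` of raw size and `dimension // 4` gives about `1/16`, which matches the range quoted for `IVF_PQ`.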
@@ -69,7 +70,7 @@ For small dimensions, choose `IVF_PQ` for accuracy, not for guaranteed higher co

Start with these values, then tune for your workload:

- - `IVF_HNSW_SQ`
+ - HNSW-backed IVF indexes (`IVF_HNSW_FLAT`, `IVF_HNSW_SQ`, `IVF_HNSW_PQ`)
- `num_partitions`: start at `num_rows // 1,048,576` (rounded to an integer)
- Lower `num_partitions` can reduce search latency, but index build may become slower because partitions are larger.
- `ef_construction`: start at `150`; increase for better recall, decrease for faster indexing.
@@ -91,13 +92,14 @@ Make sure you have enough data in your table (at least a few thousand rows) for
Sometimes you need to configure the index beyond default parameters:

- Index Types:
+ - `IVF_HNSW_FLAT`: highest recall, with no vector quantization
- `IVF_HNSW_SQ`: best recall/latency trade-off
- `IVF_RQ`: best compression for large, high-dimensional datasets
- `IVF_PQ`: often higher accuracy than `IVF_RQ` for small dimensions (`<= 256`) at similar query performance
- `metrics`: default is `l2`, other available are `cosine` or `dot`
- When using `cosine` similarity, distances range from 0 (identical vectors) to 2 (maximally dissimilar)
- `num_partitions`: use index-specific starting points from the section above:
- - `IVF_HNSW_SQ`: `num_rows // 1,048,576`
+ - HNSW-backed IVF indexes (`IVF_HNSW_FLAT`, `IVF_HNSW_SQ`, `IVF_HNSW_PQ`): `num_rows // 1,048,576`
- `IVF_RQ` and `IVF_PQ`: `num_rows // 4096`
- `num_sub_vectors`: applies to `IVF_PQ`; start with `dimension // 8`. Larger values often improve recall but can slow search.
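
The starting points listed above reduce to a few integer divisions. The sketch below shows one way to compute them. It is illustrative only: `starting_index_params` is a hypothetical helper rather than a LanceDB function, and clamping each value to at least 1 for small tables is an added assumption.

```python
HNSW_BACKED = {"IVF_HNSW_FLAT", "IVF_HNSW_SQ", "IVF_HNSW_PQ"}


def starting_index_params(num_rows, dimension, index_type):
    """Return suggested starting parameters as a plain dict (illustrative only)."""
    if index_type in HNSW_BACKED:
        params = {
            # Documented starting point: num_rows // 1,048,576.
            # Assumption: never go below one partition.
            "num_partitions": max(1, num_rows // 1_048_576),
            # Documented starting point for graph construction.
            "ef_construction": 150,
        }
    elif index_type in {"IVF_RQ", "IVF_PQ"}:
        # Documented starting point: num_rows // 4096.
        params = {"num_partitions": max(1, num_rows // 4096)}
    else:
        raise ValueError(f"unsupported index type: {index_type}")

    if index_type == "IVF_PQ":
        # Documented starting point: dimension // 8.
        params["num_sub_vectors"] = max(1, dimension // 8)
    return params
```

For a table of 10 million 768-dimensional vectors, this yields 9 partitions for the HNSW-backed variants and 2441 partitions with 96 sub-vectors for `IVF_PQ`. These are starting values to tune from, not recommendations for a specific workload.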

@@ -122,7 +124,7 @@ Connect to LanceDB and open the table you want to index.

### 2. Construct an IVF Index

- Create an `IVF_PQ` index with `cosine` similarity. Specify `vector_column_name` if you use multiple vector columns or non-default names. You can switch `index_type` to `IVF_RQ` or `IVF_HNSW_SQ` depending on your recall/latency/compression target.
+ Create an `IVF_PQ` index with `cosine` similarity. Specify `vector_column_name` if you use multiple vector columns or non-default names. You can switch `index_type` to `IVF_RQ`, `IVF_HNSW_SQ`, or `IVF_HNSW_FLAT` depending on your recall/latency/compression target.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
@@ -146,9 +148,9 @@ The previous query uses:

- `limit`: number of results to return
- `nprobes`: number of IVF partitions to scan. LanceDB auto-tunes this by default.
- - `ef`: primarily relevant for `IVF_HNSW_SQ`; start around `1.5 * k` (where `k=limit`) and increase up to `10 * k` for higher recall.
+ - `ef`: primarily relevant for HNSW-backed IVF indexes such as `IVF_HNSW_FLAT` and `IVF_HNSW_SQ`; start around `1.5 * k` (where `k=limit`) and increase up to `10 * k` for higher recall.
- `nprobes` by index type:
- - `IVF_HNSW_SQ`: usually keep auto-tuned `nprobes`, then tune `ef` first. For filtered search (`where(...)`), expect higher latency variance.
+ - `IVF_HNSW_FLAT` and `IVF_HNSW_SQ`: usually keep auto-tuned `nprobes`, then tune `ef` first. For filtered search (`where(...)`), expect higher latency variance.
- `IVF_RQ`: keep auto-tuned `nprobes`; increase only when recall is insufficient.
- `IVF_PQ`: keep auto-tuned `nprobes`; increase when recall is insufficient. Often preferred over `IVF_RQ` when `dimension <= 256`.
- `refine_factor`: reads additional candidates and reranks in memory
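
The `ef` guidance in the list above (start near `1.5 * k`, raise toward `10 * k`) can be expressed as a small helper. This is a sketch rather than anything LanceDB provides: `ef_range` is a hypothetical name, and rounding the starting value up to an integer is an assumption.

```python
import math


def ef_range(limit):
    """Return (starting ef, upper ef) for a query returning `limit` results."""
    # Start around 1.5 * k; rounding up is an assumption, not documented behavior.
    start = math.ceil(1.5 * limit)
    # Increase up to 10 * k when recall is insufficient.
    upper = 10 * limit
    return start, upper
```

For `limit=10` this gives a starting `ef` of 15 and an upper bound of 100. A tuning loop would raise `ef` within that range until recall stops improving.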
@@ -158,14 +160,17 @@ The previous query uses:

### Index Configuration

- There are three key parameters to set when constructing an HNSW index:
+ There are four key parameters to set when constructing an HNSW index:

+ - `index_type`: choose `IVF_HNSW_SQ` for a strong recall/latency/size trade-off, or `IVF_HNSW_FLAT` when you want the IVF+HNSW structure without vector quantization.
- `metric`: The default is `l2` euclidean distance metric. Other available are `dot` and `cosine`.
- `m`: The number of neighbors to select for each vector in the HNSW graph.
- `ef_construction`: The number of candidates to evaluate during the construction of the HNSW graph.

### 1. Construct an HNSW Index

+ The snippet below uses `IVF_HNSW_SQ`. If you want the unquantized variant, change `index_type` to `IVF_HNSW_FLAT`.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
{VectorIndexBuildHnsw}
3 changes: 1 addition & 2 deletions docs/search/optimize-queries.mdx
@@ -355,7 +355,7 @@ RemoteTake:

| Data Type | Recommended Index | Use Case |
| ----------- | ------------------ | ---------------------------------------- |
- | Vector | IVF_PQ/IVF_HNSW_SQ | Approximate nearest neighbor search |
+ | Vector | IVF_PQ/IVF_HNSW_SQ/IVF_HNSW_FLAT | Approximate nearest neighbor search |
| Scalar | B-Tree | Range queries and sorting |
| Categorical | Bitmap | Multi-value filters and set operations |
| `List` | Label_list | Multi-label classification and filtering |
@@ -386,4 +386,3 @@ For vector search performance:

- Create ANN index on your vector column(s) as described in the [index guide](/indexing/vector-index/)
- If you often filter by metadata, create [scalar indices](/indexing/scalar-index/) on those columns

5 changes: 3 additions & 2 deletions docs/search/vector-search.mdx
@@ -36,6 +36,7 @@ For indexed search, supported distance metrics vary by index type:
| `IVF_PQ` | `["l2", "cosine", "dot"]` |
| `IVF_SQ` | `["l2", "cosine", "dot"]` |
| `IVF_RQ` | `["l2", "cosine", "dot"]` |
+ | `IVF_HNSW_FLAT` | `["l2", "cosine", "dot"]` |
| `IVF_HNSW_PQ` | `["l2", "cosine", "dot"]` |
| `IVF_HNSW_SQ` | `["l2", "cosine", "dot"]` |

@@ -72,7 +73,7 @@ The trade-off is that the results are not guaranteed to be the true nearest neig
Use ANN search for large-scale applications where speed matters more than perfect recall. LanceDB uses approximate nearest neighbor algorithms to deliver fast results without examining every vector in your dataset.

<Warning>
- When a vector index is used, `_distance` is not always the true distance between full vectors. In ANN mode without refinement, LanceDB computes `_distance` using compressed vectors for speed.
+ When a vector index is used, `_distance` is not always the true distance between full vectors. On quantized ANN indexes, LanceDB may compute `_distance` from the compressed representation for speed. Use `refine_factor` when you want reranking on full vectors.
</Warning>

### Exact vs Approximate Distances
@@ -85,7 +86,7 @@ The table below summarizes the behavior of `_distance` in search results based o
| Query mode | Neighbor quality | `_distance` in results |
| :--- | :--- | :--- |
| No index or `.bypass_vector_index()` | Exact kNN (100% recall) | True distance on full vectors |
- | Indexed ANN, no `refine_factor` | Approximate neighbors | Approximate distance on compressed/quantized vectors |
+ | Indexed ANN, no `refine_factor` | Approximate neighbors | Distance on the index representation: exact for flat indexes, approximate for quantized indexes |
| Indexed ANN + `refine_factor(1)` | Approximate neighbors (same candidate set) | Distances recomputed on full vectors for reranked candidates |
| Indexed ANN + `refine_factor(>1)` | Better recall than no refine (usually) | Distances recomputed on full vectors for reranked candidates |
