Support for Supplying and Persisting Row IDs During Index Build (Instead of JVector-Generated Node IDs)

We would like JVector to support passing row IDs from the source table during index creation, and to store these row IDs directly inside the index.
This removes the need for JVector to generate its own node IDs and eliminates the external RowID ↔ NodeID mapping file.

**Problem / Motivation**

Currently:
- JVector automatically assigns node IDs internally when adding vectors.
- Presto/Iceberg uses row IDs as the authoritative identifier for retrieving data.
- Because JVector does not accept row IDs, we must maintain an external mapping file linking NodeID ↔ RowID.
- This adds extra complexity in:
    - index creation
    - storing and syncing mapping files
    - converting node IDs back to row IDs during ANN search
    - maintaining consistency when rebuilding or updating indexes
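
For context, here is a minimal sketch of the workaround described above. It is plain Java and does not call JVector. `NodeToRowIdMapping` and its methods are hypothetical names, and it assumes node IDs are dense ordinals `0..n-1` assigned in the order vectors were added at build time.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not part of JVector): models the external NodeID <-> RowID mapping file.
final class NodeToRowIdMapping {
    // Index = node ID (assumed to be the build-time ordinal), value = table row ID.
    private final long[] rowIdByNode;

    NodeToRowIdMapping(long[] rowIdByNode) {
        this.rowIdByNode = rowIdByNode.clone();
    }

    // Serialize the mapping so it can be stored next to the index file.
    // It has to be rewritten whenever the index is rebuilt.
    byte[] toBytes() {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + rowIdByNode.length * Long.BYTES);
        buffer.putInt(rowIdByNode.length);
        for (long rowId : rowIdByNode) {
            buffer.putLong(rowId);
        }
        return buffer.array();
    }

    // Load a previously stored mapping.
    static NodeToRowIdMapping fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long[] rowIds = new long[buffer.getInt()];
        for (int i = 0; i < rowIds.length; i++) {
            rowIds[i] = buffer.getLong();
        }
        return new NodeToRowIdMapping(rowIds);
    }

    // Translate node IDs returned by an ANN search back to row IDs.
    long[] toRowIds(int[] nodeIds) {
        long[] result = new long[nodeIds.length];
        for (int i = 0; i < nodeIds.length; i++) {
            int node = nodeIds[i];
            if (node < 0 || node >= rowIdByNode.length) {
                // A stale mapping file (for example after an index rebuild) surfaces here.
                throw new IllegalStateException("Mapping out of sync with index: node " + node);
            }
            result[i] = rowIdByNode[node];
        }
        return result;
    }
}
```

Every search result has to pass through `toRowIds`, and the serialized mapping has to be versioned together with the index. That bookkeeping is what this request aims to remove.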

If JVector could store row IDs directly, we would no longer need node IDs at all.

**Requested Functionality**
We would like JVector to support:

1. **Passing row IDs when building the index**
     Example:
     `ImmutableGraphIndex index = builder.build(ravv, row_id);`
2. **Storing row IDs inside the index** instead of generating node IDs.
3. **Returning row IDs directly in ANN search results**
     Example:
     `for (SearchResult.NodeScore ns : result.getNodes()) { row_id = ns.row_id; }`
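
To make the requested contract concrete, the sketch below models it end to end in plain Java. It is only an illustration of the behavior we are asking for: `RowIdAwareIndex` and `RowScore` are hypothetical names that do not exist in JVector, and a brute-force dot-product scan stands in for the real graph search.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical result type: a search hit identified by row ID rather than node ID.
final class RowScore {
    final long rowId;
    final float score;

    RowScore(long rowId, float score) {
        this.rowId = rowId;
        this.score = score;
    }
}

// Hypothetical stand-in for an index that accepts and stores caller-supplied row IDs.
// The brute-force scan is a placeholder for ANN graph search, not JVector's algorithm.
final class RowIdAwareIndex {
    private final float[][] vectors;
    private final long[] rowIds;

    private RowIdAwareIndex(float[][] vectors, long[] rowIds) {
        this.vectors = vectors;
        this.rowIds = rowIds;
    }

    // Build step: rowIds[i] is the table row ID of vectors[i].
    static RowIdAwareIndex build(float[][] vectors, long[] rowIds) {
        if (vectors.length != rowIds.length) {
            throw new IllegalArgumentException("Expected exactly one row ID per vector");
        }
        return new RowIdAwareIndex(vectors.clone(), rowIds.clone());
    }

    // Search step: hits carry row IDs, so no external mapping lookup is needed.
    List<RowScore> search(float[] query, int topK) {
        List<RowScore> hits = new ArrayList<>();
        for (int i = 0; i < vectors.length; i++) {
            hits.add(new RowScore(rowIds[i], dot(query, vectors[i])));
        }
        // Highest similarity first.
        hits.sort(Comparator.comparingDouble((RowScore h) -> h.score).reversed());
        int limit = Math.max(0, Math.min(topK, hits.size()));
        return new ArrayList<>(hits.subList(0, limit));
    }

    private static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```

With this shape, a caller such as a Presto connector could pass the returned `rowId` values straight to a table lookup.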

**Benefits**
- Removes the need for node IDs entirely.
- Eliminates external mapping files and synchronization overhead.
- Simplifies Presto + Iceberg integration significantly.
- Makes search results directly usable for table lookups.
- Reduces error potential and improves performance by avoiding mapping indirection.

**Additional Notes**

We are integrating JVector with Presto for vector search.
Native support for row IDs inside the index will make the integration simpler, cleaner, and more reliable.

We are happy to provide sample workflows or further details as needed.
  



Support for Supplying and Persisting Row IDs During Index Build (Instead of JVector-Generated Node IDs) #575