Add batch vector insertion API to C ABI (FFI) #9

@tanvincible

Description:

The current C ABI exposes only single-vector insertion:

uint64_t chassis_add(ChassisIndex*, const float* vector, size_t len);

Inserting vectors one at a time from Python, Node, or C++ costs one FFI crossing per vector, and that per-call overhead dominates at high ingestion rates. A batch insertion interface is needed to amortize it.

Proposed API:

size_t chassis_add_batch(
    ChassisIndex* index,
    const float* vectors,   // contiguous array
    size_t count,           // number of vectors
    size_t dim,             // dimension per vector
    uint64_t* out_ids       // output buffer of size >= count
);
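To illustrate how a binding layer might consume this signature, here is a minimal Rust-side sketch. The names `AddBatchFn`, `add_batch`, and `fake_add_batch` are hypothetical and not part of the proposal; `fake_add_batch` is only a stand-in that hands out sequential IDs so the example runs without the real library.

```rust
/// Opaque handle mirroring `ChassisIndex*` on the C side.
#[repr(C)]
pub struct ChassisIndex {
    _private: [u8; 0],
}

/// Rust view of the proposed C signature (type alias name is an assumption).
pub type AddBatchFn = unsafe extern "C" fn(
    index: *mut ChassisIndex,
    vectors: *const f32,
    count: usize,
    dim: usize,
    out_ids: *mut u64,
) -> usize;

/// Hypothetical thin wrapper: one FFI crossing for the whole batch.
/// Returns the IDs of the vectors that were inserted; a result shorter
/// than `vectors.len() / dim` means the batch stopped at an error.
///
/// # Safety
/// `index` must be a handle that `f` accepts.
pub unsafe fn add_batch(
    f: AddBatchFn,
    index: *mut ChassisIndex,
    vectors: &[f32],
    dim: usize,
) -> Result<Vec<u64>, &'static str> {
    if dim == 0 || vectors.len() % dim != 0 {
        return Err("buffer length is not a multiple of dim");
    }
    let count = vectors.len() / dim;
    let mut ids = vec![0u64; count];
    // `vectors` holds count * dim floats and `ids` holds count slots.
    let inserted = unsafe { f(index, vectors.as_ptr(), count, dim, ids.as_mut_ptr()) };
    ids.truncate(inserted.min(count));
    Ok(ids)
}

/// Stand-in for the real function (assumption): assigns sequential IDs
/// and never fails, so the wrapper can be exercised in isolation.
pub unsafe extern "C" fn fake_add_batch(
    _index: *mut ChassisIndex,
    _vectors: *const f32,
    count: usize,
    _dim: usize,
    out_ids: *mut u64,
) -> usize {
    for i in 0..count {
        unsafe { *out_ids.add(i) = i as u64 };
    }
    count
}
```

Truncating the ID vector to the returned count keeps callers from reading slots that were never written after a partial failure.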

Semantics:

  • Inserts count vectors of dimension dim

  • Vectors are laid out as:

    [v0_0, v0_1, ..., v0_dim-1,
     v1_0, v1_1, ..., v1_dim-1,
     ...]
    
  • Returns the number of vectors successfully inserted

  • On partial failure:

    • Stops at the first error
    • Sets the last error message
    • Returns the number of vectors inserted before the error
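The layout above means element j of vector i sits at offset i * dim + j in the flat buffer. A small sketch of how a caller could build and index such a buffer follows; the helper names `flatten` and `vector_at` are illustrative and not part of the proposed API.

```rust
/// Flattens rows into the contiguous layout described above:
/// all of v0, then all of v1, and so on.
/// Returns None if any row's length differs from `dim`.
pub fn flatten(rows: &[Vec<f32>], dim: usize) -> Option<Vec<f32>> {
    let mut buf = Vec::with_capacity(rows.len() * dim);
    for row in rows {
        if row.len() != dim {
            return None;
        }
        buf.extend_from_slice(row);
    }
    Some(buf)
}

/// Returns vector `i` from a flattened buffer, i.e. the elements in
/// [i * dim, (i + 1) * dim), or None if that range is out of bounds.
pub fn vector_at(buf: &[f32], dim: usize, i: usize) -> Option<&[f32]> {
    let start = i.checked_mul(dim)?;
    let end = start.checked_add(dim)?;
    buf.get(start..end)
}
```

Checked arithmetic in `vector_at` avoids overflow when `i * dim` is computed from untrusted inputs, which is the same concern the FFI function has for `count * dim`.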

Why This Matters:

  • Enables efficient Python bindings
  • Enables bulk ingestion pipelines
  • Reduces FFI crossings from one per vector to one per batch (10–100x fewer for batches of 10–100 vectors)
  • Aligns with existing internal batch-friendly storage and graph logic

Implementation Sketch:

  • Validate pointers and dimensions

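  • Loop over slices, writing each returned ID to out_ids[i] and stopping at the first error:

    for i in 0..count {
        let slice = &vectors[i * dim..(i + 1) * dim];
        match index.add(slice) {
            Ok(id) => out_ids[i] = id,  // record the ID for vector i
            Err(_) => return i,         // set the last error message, then stop
        }
    }
    count  // every vector was inserted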
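The steps above could be fleshed out roughly as follows. This is a sketch under stated assumptions, not the project's implementation: `ChassisIndex` here is a stand-in that only stores vectors, its `add` method and the rule that non-finite values are rejected are invented to demonstrate partial failure, and storing the error message is left as a comment. A real export would also need the `no_mangle` attribute, omitted here because its spelling depends on the Rust edition.

```rust
use std::slice;

/// Stand-in for the real index type (assumption).
pub struct ChassisIndex {
    dim: usize,
    vectors: Vec<Vec<f32>>,
}

impl ChassisIndex {
    pub fn new(dim: usize) -> Self {
        ChassisIndex { dim, vectors: Vec::new() }
    }

    /// Hypothetical single-vector insert returning the new ID.
    pub fn add(&mut self, v: &[f32]) -> Result<u64, String> {
        if v.len() != self.dim {
            return Err(format!("expected dimension {}, got {}", self.dim, v.len()));
        }
        // Invented failure rule, used only to show partial failure.
        if v.iter().any(|x| !x.is_finite()) {
            return Err("vector contains a non-finite value".to_string());
        }
        self.vectors.push(v.to_vec());
        Ok((self.vectors.len() - 1) as u64)
    }
}

/// # Safety
/// `index` must point to a valid ChassisIndex, `vectors` to `count * dim`
/// floats, and `out_ids` to at least `count` writable u64 slots.
pub unsafe extern "C" fn chassis_add_batch(
    index: *mut ChassisIndex,
    vectors: *const f32,
    count: usize,
    dim: usize,
    out_ids: *mut u64,
) -> usize {
    // Validate pointers and dimensions before touching any memory.
    if index.is_null() || vectors.is_null() || out_ids.is_null() || dim == 0 {
        return 0;
    }
    // Reject batches whose total length would overflow usize.
    let total = match count.checked_mul(dim) {
        Some(t) => t,
        None => return 0,
    };
    let index = unsafe { &mut *index };
    let data = unsafe { slice::from_raw_parts(vectors, total) };
    let ids = unsafe { slice::from_raw_parts_mut(out_ids, count) };
    for (i, v) in data.chunks_exact(dim).enumerate() {
        match index.add(v) {
            Ok(id) => ids[i] = id,
            // Real code would store `_msg` as the last error message here.
            Err(_msg) => return i,
        }
    }
    count
}
```

Using `chunks_exact(dim)` instead of manual index arithmetic keeps every slice in bounds by construction, and the early return leaves `out_ids` untouched past the last successful insert.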

Acceptance Criteria:

  • New C ABI function implemented
  • Header generated via cbindgen
  • FFI tests covering batch insert
  • No performance regression for single insert path
