Add transformers.js support for offline local embeddings #390

Open
sharshenov wants to merge 2 commits into arabold:main from sharshenov:feat/transformers-js-embeddings


@sharshenov sharshenov commented Apr 19, 2026

Add Transformers.js support for completely local, offline embedding inference on CPU-only servers without requiring Ollama or external APIs.

This PR adds native Transformers.js support, enabling:

  • Zero dependencies: No Ollama, Docker, or external services needed
  • CPU-only inference: Works on any server without GPU
  • Offline operation: Models cache locally after first download
  • Full privacy: All embeddings generated locally, no data leaves your server

@sharshenov sharshenov force-pushed the feat/transformers-js-embeddings branch 2 times, most recently from 9733ded to 71c0aab Compare April 20, 2026 08:55
@sharshenov sharshenov marked this pull request as ready for review April 20, 2026 08:56
@sharshenov sharshenov force-pushed the feat/transformers-js-embeddings branch 2 times, most recently from d2065ce to d08bf80 Compare April 24, 2026 09:29
@arabold arabold requested a review from Copilot April 26, 2026 22:00

Copilot AI left a comment

Pull request overview

Adds a new local/offline embedding provider powered by Transformers.js so the server can generate embeddings without external services (e.g., Ollama or hosted APIs), targeting CPU-only deployments.

Changes:

  • Introduces TransformersJSEmbeddings (Transformers.js + ONNX) with lazy model initialization and dimension auto-detection.
  • Wires a new transformers: provider into embedding config/factory and DocumentStore initialization.
  • Adds dependency + docs for configuration (including TRANSFORMERS_DEVICE) and caching guidance.
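
The lazy initialization and dimension auto-detection listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `LazyEmbedder`, `EmbedFn`, `loadModel`, and the probe string are hypothetical names, and the model loader is injected so the sketch does not depend on the actual `@huggingface/transformers` API.

```typescript
// Hypothetical type for illustration; the PR's real class is TransformersJSEmbeddings.
type EmbedFn = (text: string) => Promise<number[]>;

class LazyEmbedder {
  private embedPromise: Promise<EmbedFn> | null = null;
  private dimension: number | null = null;
  // `loadModel` stands in for whatever actually loads the model (assumption).
  private readonly loadModel: () => Promise<EmbedFn>;

  constructor(loadModel: () => Promise<EmbedFn>) {
    this.loadModel = loadModel;
  }

  // Load the model on first use only; concurrent callers share one load.
  private getEmbed(): Promise<EmbedFn> {
    let pending = this.embedPromise;
    if (pending === null) {
      pending = this.loadModel();
      this.embedPromise = pending;
    }
    return pending;
  }

  async embedQuery(text: string): Promise<number[]> {
    const embed = await this.getEmbed();
    return embed(text);
  }

  // Detect the dimension once by embedding a short probe string, then cache it.
  async getVectorDimension(): Promise<number> {
    let dim = this.dimension;
    if (dim === null) {
      const vector = await this.embedQuery("dimension probe");
      dim = vector.length;
      this.dimension = dim;
    }
    return dim;
  }
}
```

Nothing is loaded at construction time, so a server that never uses this provider would not pay the model download or startup cost.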

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `src/store/embeddings/TransformersJSEmbeddings.ts` | New embeddings implementation using `@huggingface/transformers` pipeline |
| `src/store/embeddings/TransformersJSEmbeddings.test.ts` | Unit tests with a mocked transformers pipeline |
| `src/store/embeddings/EmbeddingFactory.ts` | Adds `transformers` provider and instantiation logic |
| `src/store/embeddings/EmbeddingConfig.ts` | Extends supported provider union to include `transformers` |
| `src/store/DocumentStore.ts` | Uses provider-specific dimension detection for Transformers.js |
| `package.json` | Adds `@huggingface/transformers` runtime dependency |
| `package-lock.json` | Locks new HF/ONNX runtime dependency tree |
| `docs/guides/embedding-models.md` | Documents new provider usage and environment variables |
| `README.md` | Notes Docker cache persistence for Transformers.js models |

Comment thread src/store/embeddings/EmbeddingFactory.ts Outdated
Comment thread src/store/embeddings/TransformersJSEmbeddings.ts
Comment thread src/store/embeddings/TransformersJSEmbeddings.ts
Comment thread src/store/DocumentStore.ts Outdated
Comment thread src/store/embeddings/TransformersJSEmbeddings.test.ts
Comment thread package.json Outdated
"@alpinejs/collapse": "^3.15.8",
"@fastify/formbody": "^8.0.2",
"@fastify/static": "^8.3.0",
"@huggingface/transformers": "^4.1.0",
Copilot AI Apr 26, 2026

Adding @huggingface/transformers as a required dependency pulls in native, install-scripted transitive deps (notably onnxruntime-node and sharp). This can cause install failures or much slower installs for users who never enable the transformers provider.

Consider making Transformers.js support optional via lazy/dynamic import plus optionalDependencies/feature flagging, or splitting it into a separate installable extra so the default install path stays lightweight.
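
One way the lazy/dynamic import idea could look is sketched below. It is only an illustration of the suggestion, not code from this PR: `createOptionalLoader` is a hypothetical helper, and the importer is passed in so the sketch runs without the package installed. In practice the importer might be `() => import("@huggingface/transformers")`.

```typescript
// Load an optional module on demand and cache the result.
// `importer` is injected (assumption) so the sketch stays independent of any real package.
function createOptionalLoader<T>(
  packageName: string,
  importer: () => Promise<T>,
): () => Promise<T> {
  let cached: Promise<T> | null = null;

  return () => {
    if (cached !== null) {
      return cached;
    }
    const pending = importer().catch((cause: unknown) => {
      cached = null; // allow a retry once the package has been installed
      const detail = cause instanceof Error ? cause.message : String(cause);
      throw new Error(
        `Optional dependency "${packageName}" could not be loaded: ${detail}`,
      );
    });
    cached = pending;
    return pending;
  };
}
```

With this shape, the import cost and any install-time failure surface only when the provider is actually selected, and the error message names the missing package.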

Author

If this dependency is optional, distribution via Docker gets more complicated. Should two separate Docker images be shipped then?

this.isInitialized = true;
}

return this.encoder!;
Owner

This violates our TypeScript/Biome guidelines: non-null assertions are forbidden in this repo, and npm run lint flags this line. Please return a narrowed local value instead, e.g. assign/validate the encoder before returning it.
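
A minimal sketch of the narrowing approach suggested here, assuming a nullable `encoder` field. `Encoder`, `EncoderHolder`, and `createEncoder` are hypothetical names for illustration, and the sketch drops the separate `isInitialized` flag for brevity.

```typescript
// Hypothetical encoder type; the PR's real encoder comes from Transformers.js.
type Encoder = (text: string) => Promise<number[]>;

class EncoderHolder {
  private encoder: Encoder | null = null;
  private readonly createEncoder: () => Promise<Encoder>;

  constructor(createEncoder: () => Promise<Encoder>) {
    this.createEncoder = createEncoder;
  }

  async getEncoder(): Promise<Encoder> {
    // Copy the field into a local so the compiler can narrow its type.
    let encoder = this.encoder;
    if (encoder === null) {
      encoder = await this.createEncoder();
      this.encoder = encoder;
    }
    // `encoder` is typed as Encoder here, so no `!` assertion is needed.
    return encoder;
  }
}
```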

Comment thread docs/guides/embedding-models.md Outdated
```

**Pre-configured models:**
- `Xenova/bge-small-en-v1.5` (384d, recommended default)
Owner

Low priority: these local models are 384d/768d, while the default database vector dimension remains 1536. It would be helpful to mention that users can set embeddings.vectorDimension / DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION to match the selected local model and avoid padding/storage overhead.
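
To make the overhead concrete, here is a rough sketch of what padding a 384d vector into a 1536d column costs. It assumes zero-padding and 4-byte floats, neither of which is confirmed in this thread; `padVector` is a hypothetical helper, not the project's implementation.

```typescript
// Zero-pad a vector to a fixed storage dimension (assumed behavior, for illustration only).
function padVector(vector: number[], targetDimension: number): number[] {
  if (vector.length > targetDimension) {
    throw new Error(
      `Vector has ${vector.length} dimensions, exceeding ${targetDimension}`,
    );
  }
  const padding = new Array<number>(targetDimension - vector.length).fill(0);
  return [...vector, ...padding];
}

const nativeVector = new Array<number>(384).fill(0.1);
const paddedVector = padVector(nativeVector, 1536);

// Assuming 4-byte floats: 384 * 4 = 1536 bytes native, 1536 * 4 = 6144 bytes padded.
const bytesPerFloat = 4;
console.log(nativeVector.length * bytesPerFloat, paddedVector.length * bytesPerFloat);
```

Under those assumptions, each stored vector would take four times the space it needs, which is the overhead that matching `embeddings.vectorDimension` to the model avoids.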

Author

Good spot! I can try to detect the model's dimensions early and auto-configure the server, while still allowing the setting to be overridden (as it currently is).

Owner

arabold commented Apr 26, 2026

Thanks for the contribution! This is a great addition overall and fits really well with the local/offline story of the project.

Besides the inline comments already left, there are a few smaller gaps that would be helpful to address in the next update:

  • Add tests for the new factory/provider wiring, especially transformers:* model creation and areCredentialsAvailable("transformers").
  • Add coverage for the model-name normalization behavior so arbitrary Hugging Face repo IDs keep working.
  • Avoid any in the new test mock if possible, to stay aligned with our TypeScript guidelines.
  • If feasible, add a small DocumentStore test around the Transformers.js dimension-detection path.

Thank you for putting this together.

@sharshenov sharshenov force-pushed the feat/transformers-js-embeddings branch 4 times, most recently from 8868afb to ff2fa85 Compare May 17, 2026 17:14
"sentence-transformers/static-similarity-mrl-multilingual-v1": 1024,
"manu/sentence_croissant_alpha_v0.3": 2048,
- "BAAI/bge-small-en-v1.5": 512,
+ "BAAI/bge-small-en-v1.5": 384,
Author

@sharshenov sharshenov force-pushed the feat/transformers-js-embeddings branch from ff2fa85 to 8c48f15 Compare May 17, 2026 17:28
@sharshenov sharshenov requested a review from arabold May 17, 2026 17:28
@sharshenov sharshenov marked this pull request as draft May 17, 2026 18:47
Author

sharshenov commented May 17, 2026

I just found that the model is cached in the wrong directory in Docker, as if `TRANSFORMERS_CACHE` is not respected. I'll investigate and fix it later.

Update: fixed.

@sharshenov sharshenov force-pushed the feat/transformers-js-embeddings branch from 8c48f15 to a94ffab Compare May 17, 2026 20:43
@sharshenov sharshenov force-pushed the feat/transformers-js-embeddings branch from 8620ac9 to eea0a76 Compare May 18, 2026 10:05
@sharshenov sharshenov marked this pull request as ready for review May 18, 2026 10:07
@arabold arabold requested a review from Copilot May 20, 2026 13:07

Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 18 changed files in this pull request and generated 12 comments.

Comment thread src/store/embeddings/TransformersJSEmbeddings.ts Outdated
Comment on lines +14 to 15
import { TransformersJSEmbeddings } from "./TransformersJSEmbeddings";

Comment on lines 13 to 18
ModelConfigurationError,
UnsupportedProviderError,
} from "./embeddings/EmbeddingFactory";
import { FixedDimensionEmbeddings } from "./embeddings/FixedDimensionEmbeddings";
import { TransformersJSEmbeddings } from "./embeddings/TransformersJSEmbeddings";
import {
Comment on lines +639 to +649
// For Transformers.js, use the built-in dimension detection with timeout
const dimensionPromise = this.embeddings.getVectorDimension();
let timeoutId: NodeJS.Timeout | undefined;
const timeoutPromise = new Promise<never>((_, reject) => {
  timeoutId = setTimeout(() => {
    reject(
      new Error(
        `Embedding service connection timed out after ${this.embeddingInitTimeoutMs / 1000} seconds`,
      ),
    );
  }, this.embeddingInitTimeoutMs);
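
The excerpt above is cut off before the race itself. A minimal sketch of the general timeout pattern it appears to follow is given below. `withTimeout` is a hypothetical helper, and this is not necessarily how the PR completes the code.

```typescript
// Race a promise against a timer and always clear the timer afterwards.
async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number,
  message: string,
): Promise<T> {
  let timeoutId: ReturnType<typeof setTimeout> | undefined;
  const timeoutPromise = new Promise<never>((_, reject) => {
    timeoutId = setTimeout(() => reject(new Error(message)), timeoutMs);
  });
  try {
    // Whichever settles first wins; the timer side can only reject.
    return await Promise.race([promise, timeoutPromise]);
  } finally {
    // Without this, a pending timer can keep the process alive after success.
    clearTimeout(timeoutId);
  }
}
```

Clearing the timer in `finally` covers both outcomes. Note that the losing promise is not cancelled, so a slow model load would still continue in the background after a timeout.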
Comment thread src/utils/config.ts
Comment on lines +492 to +516
// Smart default for vectorDimension based on embedding provider
// Only applies when DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION is not explicitly set
// and the model is a transformers provider (avoids 1536d padding for 384d/768d models)
if (
  !process.env.DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION &&
  getAtPath(mergedInput, ["embeddings", "vectorDimension"]) ===
    DEFAULT_CONFIG.embeddings.vectorDimension
) {
  const modelSpec = getAtPath(mergedInput, ["app", "embeddingModel"]);
  if (typeof modelSpec === "string" && modelSpec) {
    const normalizedSpec = normalizeEnvValue(modelSpec);
    const colonIndex = normalizedSpec.indexOf(":");
    const provider =
      colonIndex === -1 ? "openai" : normalizedSpec.substring(0, colonIndex);

    if (provider === "transformers") {
      const model =
        colonIndex === -1 ? normalizedSpec : normalizedSpec.substring(colonIndex + 1);
      const knownDims = EmbeddingConfig.getKnownModelDimensions(model);
      if (knownDims !== null) {
        setAtPath(mergedInput, ["embeddings", "vectorDimension"], knownDims);
      }
    }
  }
}
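
The decision logic in the diff above can be isolated into a small pure function, which may make it easier to test. The sketch below is a simplified stand-in: `parseModelSpec`, `resolveVectorDimension`, and the `KNOWN_DIMENSIONS` map are hypothetical names, and the map holds only the one model dimension mentioned in this thread.

```typescript
// Known native dimensions (illustrative subset; the real list lives in EmbeddingConfig).
const KNOWN_DIMENSIONS = new Map<string, number>([
  ["BAAI/bge-small-en-v1.5", 384],
]);

interface ModelSpec {
  provider: string;
  model: string;
}

// Split "provider:model"; a spec without a colon falls back to "openai", as in the diff.
function parseModelSpec(spec: string): ModelSpec {
  const colonIndex = spec.indexOf(":");
  if (colonIndex === -1) {
    return { provider: "openai", model: spec };
  }
  return {
    provider: spec.substring(0, colonIndex),
    model: spec.substring(colonIndex + 1),
  };
}

// An explicit setting always wins; otherwise use the known dimension for
// transformers models, and the global default for everything else.
function resolveVectorDimension(
  spec: string,
  explicitDimension: number | undefined,
  defaultDimension: number,
): number {
  if (explicitDimension !== undefined) {
    return explicitDimension;
  }
  const { provider, model } = parseModelSpec(spec);
  if (provider === "transformers") {
    const known = KNOWN_DIMENSIONS.get(model);
    if (known !== undefined) {
      return known;
    }
  }
  return defaultDimension;
}
```

For example, `resolveVectorDimension("transformers:BAAI/bge-small-en-v1.5", undefined, 1536)` would pick the model's native 384, while an unknown transformers model or any other provider keeps the default.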
Comment on lines +27 to +38
vi.mock("./embeddings/TransformersJSEmbeddings", async () => {
  const actual = await vi.importActual<
    typeof import("./embeddings/TransformersJSEmbeddings")
  >("./embeddings/TransformersJSEmbeddings");
  return {
    TransformersJSEmbeddings: class extends (actual as any).TransformersJSEmbeddings {
      async getVectorDimension() {
        return mockGetVectorDimension();
      }
      async embedQuery(text: string) {
        return mockEmbedQuery(text);
      }
  | `requestTimeoutMs` | `30000` | Timeout for each embedding API request (ms). |
  | `initTimeoutMs` | `30000` | Timeout for the initial test embedding during model initialization (ms). |
- | `vectorDimension` | `1536` | Dimension of the vector space. Must be a positive integer (minimum 1). Override with `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION`. Changing this value triggers a model change confirmation on next startup. |
+ | `vectorDimension` | `1536` (or model-specific for `transformers:` models) | Dimension of the vector space. Must be a positive integer (minimum 1). For Transformers.js models, the server auto-detects the model's native dimension (e.g., 384 for `bge-small-en-v1.5`). Override with `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION`. Changing this value triggers a model change confirmation on next startup. |
  ### Vector Dimension Override

- The vector dimension defaults to the model's native dimension (e.g., 1536 for `text-embedding-3-small`). You can override it with `embeddings.vectorDimension` in the config file or `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION` as an environment variable. The value must be a positive integer (minimum 1).
+ The vector dimension defaults to the model's native dimension (e.g., 1536 for `text-embedding-3-small`, 384 for `BAAI/bge-small-en-v1.5`). For Transformers.js models not from the list of known models, it is auto-detected via inference. For other providers, it is discovered by embedding a test string. You can override it with `embeddings.vectorDimension` in the config file or `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION` as an environment variable. The value must be a positive integer (minimum 1).
  Embeddings stored as BLOB in documents table:

- - 1536-dimensional vectors by default (configurable via `embeddings.vectorDimension`)
+ - 1536-dimensional vectors by default or model-specific for `transformers:` models, e.g., 384 for `BAAI/bge-small-en-v1.5`, (configurable via `embeddings.vectorDimension`)
Comment thread README.md
Comment on lines +121 to +123
**Note:** When using Transformers.js embeddings, models are cached inside the container in /models. Mount the volume to persist the models cache:
```bash
-v docs-mcp-models:/models
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3 participants