Add transformers.js support for offline local embeddings by sharshenov · Pull Request #390 · arabold/docs-mcp-server

sharshenov · 2026-04-19T21:49:55Z

Add Transformers.js support for completely local, offline embedding inference on CPU-only servers without requiring Ollama or external APIs.

This PR adds native Transformers.js support, enabling:

- Zero dependencies: No Ollama, Docker, or external services needed
- CPU-only inference: Works on any server without GPU
- Offline operation: Models cache locally after first download
- Full privacy: All embeddings generated locally, no data leaves your server

Copilot

Pull request overview

Adds a new local/offline embedding provider powered by Transformers.js so the server can generate embeddings without external services (e.g., Ollama or hosted APIs), targeting CPU-only deployments.

Changes:

- Introduces TransformersJSEmbeddings (Transformers.js + ONNX) with lazy model initialization and dimension auto-detection.
- Wires a new transformers: provider into embedding config/factory and DocumentStore initialization.
- Adds dependency + docs for configuration (including TRANSFORMERS_DEVICE) and caching guidance.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file

| File | Description |
| --- | --- |
| src/store/embeddings/TransformersJSEmbeddings.ts | New embeddings implementation using `@huggingface/transformers` pipeline |
| src/store/embeddings/TransformersJSEmbeddings.test.ts | Unit tests with a mocked transformers pipeline |
| src/store/embeddings/EmbeddingFactory.ts | Adds `transformers` provider and instantiation logic |
| src/store/embeddings/EmbeddingConfig.ts | Extends supported provider union to include `transformers` |
| src/store/DocumentStore.ts | Uses provider-specific dimension detection for Transformers.js |
| package.json | Adds `@huggingface/transformers` runtime dependency |
| package-lock.json | Locks new HF/ONNX runtime dependency tree |
| docs/guides/embedding-models.md | Documents new provider usage and environment variables |
| README.md | Notes Docker cache persistence for Transformers.js models |


Copilot · 2026-04-26T22:06:10Z

    "@alpinejs/collapse": "^3.15.8",
    "@fastify/formbody": "^8.0.2",
    "@fastify/static": "^8.3.0",
+    "@huggingface/transformers": "^4.1.0",


Adding @huggingface/transformers as a required dependency pulls in native, install-scripted transitive deps (notably onnxruntime-node and sharp). This can cause install failures or much slower installs for users who never enable the transformers provider.

Consider making Transformers.js support optional via lazy/dynamic import plus optionalDependencies/feature flagging, or splitting it into a separate installable extra so the default install path stays lightweight.

If this dependency is made optional, distribution via Docker gets more complicated. Should two separate Docker images be shipped then?
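To make the lazy/dynamic import suggestion concrete, here is a minimal sketch. The helper name `loadOptionalModule` is hypothetical and not part of this PR; it only illustrates how a provider could defer loading a package listed under `optionalDependencies` until it is actually selected.

```typescript
// Hypothetical helper (not from the PR): try to load an optional package at
// runtime and resolve to null when it cannot be loaded.
async function loadOptionalModule(
  name: string,
): Promise<Record<string, unknown> | null> {
  try {
    // Dynamic import defers resolution until the provider is actually used,
    // so users of other providers never touch the native dependency tree.
    return await import(name);
  } catch {
    // Caveat of this sketch: any load failure (missing package, failed native
    // build, error during module init) is treated as "not available".
    return null;
  }
}

// Assumed usage inside a provider factory: fail with an actionable message
// only when the transformers provider is requested.
async function requireTransformers(): Promise<Record<string, unknown>> {
  const mod = await loadOptionalModule("@huggingface/transformers");
  if (mod === null) {
    throw new Error(
      "The transformers provider requires @huggingface/transformers to be installed",
    );
  }
  return mod;
}
```

A single Docker image could still install the optional package at build time, so this approach does not by itself require shipping two images.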

arabold · 2026-04-26T22:10:22Z

+      this.isInitialized = true;
+    }
+
+    return this.encoder!;


This violates our TypeScript/Biome guidelines: non-null assertions are forbidden in this repo, and npm run lint flags this line. Please return a narrowed local value instead, e.g. assign/validate the encoder before returning it.
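One way to follow this suggestion is sketched below. The `Encoder` type and `LazyEncoderHolder` class are hypothetical stand-ins rather than the PR's code; the point is only that copying the field into a local lets TypeScript narrow it, so no `!` is needed.

```typescript
// Hypothetical stand-in for the pipeline function returned by Transformers.js.
type Encoder = (text: string) => Promise<number[]>;

class LazyEncoderHolder {
  private encoder: Encoder | undefined;
  // Assumed injected loader; the real class would build the pipeline itself.
  private readonly loadEncoder: () => Promise<Encoder>;

  constructor(loadEncoder: () => Promise<Encoder>) {
    this.loadEncoder = loadEncoder;
  }

  async getEncoder(): Promise<Encoder> {
    // Copy the field into a local so the compiler can track its type.
    let encoder = this.encoder;
    if (encoder === undefined) {
      encoder = await this.loadEncoder();
      this.encoder = encoder;
    }
    // `encoder` is narrowed to Encoder here, so no non-null assertion is used.
    return encoder;
  }
}
```

This sketch does not guard against two concurrent first calls both triggering a load; caching the in-flight promise instead of the resolved value would cover that case.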

arabold · 2026-04-26T22:10:22Z

+```
+
+**Pre-configured models:**
+- `Xenova/bge-small-en-v1.5` (384d, recommended default)


Low priority: these local models are 384d/768d, while the default database vector dimension remains 1536. It would be helpful to mention that users can set embeddings.vectorDimension / DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION to match the selected local model and avoid padding/storage overhead.

Good spot! I can try to detect the model's dimension early and auto-configure the server, while still allowing the setting to be overridden (as it is currently).
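The overhead mentioned above can be illustrated with a small sketch. It assumes zero-padding up to the configured dimension and 4-byte float storage; both are assumptions for illustration, and the function names are hypothetical.

```typescript
// Assumption: shorter vectors are zero-padded up to the configured dimension.
function padToDimension(vector: number[], targetDim: number): number[] {
  if (vector.length > targetDim) {
    throw new Error(
      `Vector has ${vector.length} dimensions, more than the target ${targetDim}`,
    );
  }
  const padding = new Array<number>(targetDim - vector.length).fill(0);
  return [...vector, ...padding];
}

// Assumption: each component is stored as a 32-bit float (4 bytes).
function float32Bytes(dimension: number): number {
  return dimension * Float32Array.BYTES_PER_ELEMENT;
}

const native = new Array<number>(384).fill(0.5);
const padded = padToDimension(native, 1536);

// Under these assumptions a padded vector takes four times the space of the
// native one (1536 * 4 bytes versus 384 * 4 bytes), and 1152 of its 1536
// components are zeros.
const overheadFactor = float32Bytes(padded.length) / float32Bytes(native.length);
```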

arabold · 2026-04-26T22:12:52Z

Thanks for the contribution! This is a great addition overall and fits really well with the local/offline story of the project.

Besides the inline comments already left, there are a few smaller gaps that would be helpful to address in the next update:

Add tests for the new factory/provider wiring, especially transformers:* model creation and areCredentialsAvailable("transformers").
Add coverage for the model-name normalization behavior so arbitrary Hugging Face repo IDs keep working.
Avoid any in the new test mock if possible, to stay aligned with our TypeScript guidelines.
If feasible, add a small DocumentStore test around the Transformers.js dimension-detection path.

Thank you for putting this together.
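As a sketch of how a test double can avoid `any`, the fake below implements a small interface instead of extending the real class through a cast. `EmbeddingsLike` and `FakeEmbeddings` are hypothetical names; only the two method names come from the PR's test mock.

```typescript
// Hypothetical minimal interface covering the two methods the mock overrides.
interface EmbeddingsLike {
  getVectorDimension(): Promise<number>;
  embedQuery(text: string): Promise<number[]>;
}

// Typed fake: no `any`, and it records calls so tests can assert on them.
class FakeEmbeddings implements EmbeddingsLike {
  readonly queries: string[] = [];
  private readonly dimension: number;

  constructor(dimension: number) {
    this.dimension = dimension;
  }

  async getVectorDimension(): Promise<number> {
    return this.dimension;
  }

  async embedQuery(text: string): Promise<number[]> {
    this.queries.push(text);
    // Return a zero vector of the configured size; values are irrelevant here.
    return new Array<number>(this.dimension).fill(0);
  }
}
```

Code under test would then depend on `EmbeddingsLike` rather than the concrete class, which is a design choice the PR may or may not want to adopt.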

sharshenov · 2026-05-17T17:15:02Z

    "sentence-transformers/static-similarity-mrl-multilingual-v1": 1024,
    "manu/sentence_croissant_alpha_v0.3": 2048,
-    "BAAI/bge-small-en-v1.5": 512,
+    "BAAI/bge-small-en-v1.5": 384,


https://huggingface.co/BAAI/bge-small-en-v1.5

sharshenov · 2026-05-17T18:48:50Z

I just found that the model is cached in the wrong directory in Docker, as if TRANSFORMERS_CACHE is not respected. I'll investigate and fix it later.

UPD: fixed
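The thread does not show the actual fix. As a hedged illustration of the kind of wiring involved, the sketch below resolves the cache directory from the environment; `resolveModelCacheDir` is a hypothetical helper, and the `/models` fallback mirrors the README note in this PR.

```typescript
// Default taken from the README note in this PR ("/models" in the container).
const DEFAULT_MODEL_CACHE_DIR = "/models";

// Hypothetical helper: prefer TRANSFORMERS_CACHE when it is set and non-empty.
// The environment is passed in so the function is easy to test.
function resolveModelCacheDir(
  envVars: Record<string, string | undefined>,
): string {
  const configured = envVars["TRANSFORMERS_CACHE"]?.trim();
  return configured ? configured : DEFAULT_MODEL_CACHE_DIR;
}
```

A caller could assign the result to the `env.cacheDir` setting exported by `@huggingface/transformers` before creating the pipeline; whether the PR does this is an assumption, not something the thread confirms.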


Copilot

Pull request overview

Copilot reviewed 16 out of 18 changed files in this pull request and generated 12 comments.

+import { TransformersJSEmbeddings } from "./TransformersJSEmbeddings";



  ModelConfigurationError,
  UnsupportedProviderError,
 } from "./embeddings/EmbeddingFactory";
 import { FixedDimensionEmbeddings } from "./embeddings/FixedDimensionEmbeddings";
+import { TransformersJSEmbeddings } from "./embeddings/TransformersJSEmbeddings";
 import {


+        // For Transformers.js, use the built-in dimension detection with timeout
+        const dimensionPromise = this.embeddings.getVectorDimension();
+        let timeoutId: NodeJS.Timeout | undefined;
+        const timeoutPromise = new Promise<never>((_, reject) => {
+          timeoutId = setTimeout(() => {
+            reject(
+              new Error(
+                `Embedding service connection timed out after ${this.embeddingInitTimeoutMs / 1000} seconds`,
+              ),
+            );
+          }, this.embeddingInitTimeoutMs);
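The hunk above is cut off before the race itself. A generic version of the same timeout pattern might look like the sketch below; `withTimeout` is a hypothetical helper, not a function from the PR.

```typescript
// Hypothetical helper: reject if `work` does not settle within `timeoutMs`.
async function withTimeout<T>(
  work: Promise<T>,
  timeoutMs: number,
  message: string,
): Promise<T> {
  let timeoutId: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timeoutId = setTimeout(() => reject(new Error(message)), timeoutMs);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so a fast result does not leave a pending
    // timeout keeping the process alive.
    clearTimeout(timeoutId);
  }
}
```

Note that the race only stops waiting: the underlying work (here, model loading) keeps running after a timeout unless it is cancelled separately.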


+  // Smart default for vectorDimension based on embedding provider
+  // Only applies when DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION is not explicitly set
+  // and the model is a transformers provider (avoids 1536d padding for 384d/768d models)
+  if (
+    !process.env.DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION &&
+    getAtPath(mergedInput, ["embeddings", "vectorDimension"]) ===
+      DEFAULT_CONFIG.embeddings.vectorDimension
+  ) {
+    const modelSpec = getAtPath(mergedInput, ["app", "embeddingModel"]);
+    if (typeof modelSpec === "string" && modelSpec) {
+      const normalizedSpec = normalizeEnvValue(modelSpec);
+      const colonIndex = normalizedSpec.indexOf(":");
+      const provider =
+        colonIndex === -1 ? "openai" : normalizedSpec.substring(0, colonIndex);
+
+      if (provider === "transformers") {
+        const model =
+          colonIndex === -1 ? normalizedSpec : normalizedSpec.substring(colonIndex + 1);
+        const knownDims = EmbeddingConfig.getKnownModelDimensions(model);
+        if (knownDims !== null) {
+          setAtPath(mergedInput, ["embeddings", "vectorDimension"], knownDims);
+        }
+      }
+    }
+  }
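The parsing and lookup in this hunk can be reduced to a small standalone sketch. `parseModelSpec`, `resolveVectorDimension`, and the one-entry `KNOWN_DIMS` table are hypothetical simplifications; the real code reads the known dimensions from `EmbeddingConfig` and checks the environment variable and config default rather than an `explicit` argument.

```typescript
interface ParsedModelSpec {
  provider: string;
  model: string;
}

// Split "provider:model"; a spec without a colon is treated as an OpenAI
// model, matching the default used in the hunk above.
function parseModelSpec(spec: string): ParsedModelSpec {
  const colonIndex = spec.indexOf(":");
  if (colonIndex === -1) {
    return { provider: "openai", model: spec };
  }
  return {
    provider: spec.substring(0, colonIndex),
    model: spec.substring(colonIndex + 1),
  };
}

// Illustrative table only; the 384 value is the one corrected in this PR.
const KNOWN_DIMS: Record<string, number> = {
  "BAAI/bge-small-en-v1.5": 384,
};

// An explicit setting always wins; otherwise only transformers models with a
// known dimension replace the fallback.
function resolveVectorDimension(
  spec: string,
  explicit: number | undefined,
  fallback: number,
): number {
  if (explicit !== undefined) return explicit;
  const { provider, model } = parseModelSpec(spec);
  if (provider !== "transformers") return fallback;
  return KNOWN_DIMS[model] ?? fallback;
}
```

Only the first colon separates provider from model, so model names that themselves contain a colon are kept intact.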


+vi.mock("./embeddings/TransformersJSEmbeddings", async () => {
+  const actual = await vi.importActual<
+    typeof import("./embeddings/TransformersJSEmbeddings")
+  >("./embeddings/TransformersJSEmbeddings");
+  return {
+    TransformersJSEmbeddings: class extends (actual as any).TransformersJSEmbeddings {
+      async getVectorDimension() {
+        return mockGetVectorDimension();
+      }
+      async embedQuery(text: string) {
+        return mockEmbedQuery(text);
+      }


 | `requestTimeoutMs` | `30000` | Timeout for each embedding API request (ms). |
 | `initTimeoutMs` | `30000` | Timeout for the initial test embedding during model initialization (ms). |
-| `vectorDimension` | `1536` | Dimension of the vector space. Must be a positive integer (minimum 1). Override with `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION`. Changing this value triggers a model change confirmation on next startup. |
+| `vectorDimension` | `1536` (or model-specific for `transformers:` models) | Dimension of the vector space. Must be a positive integer (minimum 1). For Transformers.js models, the server auto-detects the model's native dimension (e.g., 384 for `bge-small-en-v1.5`). Override with `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION`. Changing this value triggers a model change confirmation on next startup. |


 ### Vector Dimension Override

-The vector dimension defaults to the model's native dimension (e.g., 1536 for `text-embedding-3-small`). You can override it with `embeddings.vectorDimension` in the config file or `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION` as an environment variable. The value must be a positive integer (minimum 1).
+The vector dimension defaults to the model's native dimension (e.g., 1536 for `text-embedding-3-small`, 384 for `BAAI/bge-small-en-v1.5`). For Transformers.js models not from the list of known models, it is auto-detected via inference. For other providers, it is discovered by embedding a test string. You can override it with `embeddings.vectorDimension` in the config file or `DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION` as an environment variable. The value must be a positive integer (minimum 1).


 Embeddings stored as BLOB in documents table:

- 1536-dimensional vectors by default (configurable via `embeddings.vectorDimension`)
+- 1536-dimensional vectors by default or model-specific for `transformers:` models, e.g., 384 for `BAAI/bge-small-en-v1.5`, (configurable via `embeddings.vectorDimension`)


+**Note:** When using Transformers.js embeddings, models are cached inside the container in /models. Mount the volume to persist the models cache:
+```bash
+-v docs-mcp-models:/models



sharshenov force-pushed the feat/transformers-js-embeddings branch 2 times, most recently from 9733ded to 71c0aab Compare April 20, 2026 08:55

sharshenov marked this pull request as ready for review April 20, 2026 08:56

sharshenov force-pushed the feat/transformers-js-embeddings branch 2 times, most recently from d2065ce to d08bf80 Compare April 24, 2026 09:29

arabold requested a review from Copilot April 26, 2026 22:00

Copilot started reviewing on behalf of arabold April 26, 2026 22:00

Copilot AI reviewed Apr 26, 2026


arabold reviewed Apr 26, 2026


sharshenov force-pushed the feat/transformers-js-embeddings branch 4 times, most recently from 8868afb to ff2fa85 Compare May 17, 2026 17:14

sharshenov commented May 17, 2026


sharshenov force-pushed the feat/transformers-js-embeddings branch from ff2fa85 to 8c48f15 Compare May 17, 2026 17:28

sharshenov requested a review from arabold May 17, 2026 17:28

sharshenov marked this pull request as draft May 17, 2026 18:47

sharshenov force-pushed the feat/transformers-js-embeddings branch from 8c48f15 to a94ffab Compare May 17, 2026 20:43

feat(embeddings): add transformers.js support for offline local embed…

eea0a76

…dings

sharshenov force-pushed the feat/transformers-js-embeddings branch from 8620ac9 to eea0a76 Compare May 18, 2026 10:05

sharshenov marked this pull request as ready for review May 18, 2026 10:07

arabold requested a review from Copilot May 20, 2026 13:07

Copilot started reviewing on behalf of arabold May 20, 2026 13:08

Copilot AI reviewed May 20, 2026


Potential fix for pull request finding

d9acf72

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>


3 participants
