Feat/support multimodal embedding #29115

JohnJyong · 2025-12-04T03:25:39Z

Summary

Why Build a Multimodal Knowledge Base

As enterprises increasingly rely on internal knowledge systems, the need to retrieve information from large volumes of heterogeneous files continues to grow. These materials often span multiple formats, including text, images, documents, video, and audio. Typical examples are product photos, illustrated manuals, and reports that mix text and graphics.

Traditional embedding models can vectorize some types of data, but each is usually limited to a single modality. This limitation forces organizations into one of two suboptimal options:
• Building complex cross-modality pipelines, embedding each modality separately and then manually fusing results;
• Restricting applications to a single modality, leaving most of the data’s value untapped.

Furthermore, for content that naturally combines modalities, such as documents containing both text and images, traditional models struggle to capture the relationships between modalities, so their understanding of the content stays incomplete.

For these reasons, multimodal embeddings have become essential for enterprises seeking to enhance data comprehension, unify data processing workflows, and overcome the constraints of single-modality systems.
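To make the contrast with per-modality pipelines concrete, here is a minimal sketch of retrieval over a single shared vector space. It is not code from this PR: the `Item` class, the `search` helper, the item IDs, and the three-dimensional toy vectors are all hypothetical stand-ins for what a real multimodal embedding model and vector store would provide.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    modality: str          # "text" or "image"
    vector: list[float]    # toy embedding; a real model would produce this


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector: list[float], items: list[Item], top_k: int = 2) -> list[Item]:
    """Rank every item against the query, regardless of modality."""
    ranked = sorted(items, key=lambda it: cosine(query_vector, it.vector), reverse=True)
    return ranked[:top_k]


# Hypothetical index: text chunks and images live side by side
# because they are assumed to share one embedding space.
INDEX = [
    Item("manual-page-3", "text", [0.9, 0.1, 0.0]),
    Item("product-photo-7", "image", [0.8, 0.2, 0.1]),
    Item("quarterly-report", "text", [0.0, 0.1, 0.9]),
]

results = search([1.0, 0.0, 0.0], INDEX)
RESULT_IDS = [it.item_id for it in results]
RESULT_MODALITIES = {it.modality for it in results}
```

Because all items are compared with the same similarity function, one query can return a text chunk and an image together, with no separate per-modality ranking to fuse afterwards.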

⸻

Supported Multimodal Embedding & Rerank Models (First Release)

AWS Bedrock
1. nova-2-multimodal-embeddings-v1:0

Google Vertex AI
1. multimodalembedding@001

Jina
1. jina-embeddings-v4
2. jina-clip-v1
3. jina-clip-v2
4. jina-reranker-m0 (rerank model)

Tongyi (Alibaba Cloud)
1. multimodal-embedding-v1

Screenshots

PR for plugin: langgenius/dify-plugin-daemon#503
PR for SDK: langgenius/dify-plugin-sdks#237

Checklist

This change requires a documentation update, included: Dify Document
I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
I've updated the documentation accordingly.
I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

crazywoola

See comments

JohnJyong added 30 commits October 28, 2025 17:00

support multimodal embedding

5c0d50d

support multimodal embedding

33ac54f

support multimodal embedding

070d826

multimodal embedding

6a95b23

multimodal Embedding

3b22d44

multimodal Embedding

98b25f2

multimodal Embedding

8e8d926

multimodal Embedding

779bed3

multimodal Embedding LLM node

aa94c68

multimodal Embedding LLM node

37e8050

Merge branch 'main' into feat/support-multimodal-embedding

8d29b83

# Conflicts: # api/models/dataset.py

migration

734378f

multimodal embedding

14be2a8

Merge branch 'main' into feat/support-multimodal-embedding

3f61fc7

# Conflicts: # api/tests/test_containers_integration_tests/tasks/test_add_document_to_index_task.py

multimodal embedding

9bbb332

multimodal embedding json schema

0456bae

multimodal embedding json schema

9350c22

multimodal embedding update segment

a3876c3

multimodal embedding update segment

3495358

multimodal embedding update segment

457cdf3

multimodal embedding update segment

f21b081

multimodal embedding update segment

462a3b3

multimodal embedding update segment

608a374

multimodal embedding update segment

2052cbc

multimodal embedding update segment

f51bfc5

multimodal embedding update segment

abd604f

multimodal embedding update segment

0583cc1

multimodal embedding update segment

997b9e3

multimodal embedding update segment

530120a

multimodal embedding update segment

5a6b8bb

JohnJyong and others added 11 commits December 4, 2025 14:30

lint fix

1d4858a

lint fix

27a6e5b

lint fix

02efe29

[autofix.ci] apply automated fixes

7e0947d

[autofix.ci] apply automated fixes (attempt 2/3)

d8c9e3a

lint fix

cc9df0f

Merge remote-tracking branch 'origin/feat/support-multimodal-embedding' into feat/support-multimodal-embedding

068da9f

[autofix.ci] apply automated fixes

d19d2c4

lint fix

1df232e

add env

0937816

add env

9860971

JohnJyong requested a review from crazywoola as a code owner December 5, 2025 04:16

JohnJyong and others added 8 commits December 5, 2025 14:04

add env

9056122

[autofix.ci] apply automated fixes

9844532

add env

718e08d

Merge remote-tracking branch 'origin/feat/support-multimodal-embedding' into feat/support-multimodal-embedding

3f5d08c

[autofix.ci] apply automated fixes

8744a07

add env

13d7534

Merge remote-tracking branch 'origin/feat/support-multimodal-embedding' into feat/support-multimodal-embedding

0b6146d

fix mypy

697f2d4

crazywoola reviewed Dec 8, 2025

docker/.env.example (resolved)

crazywoola reviewed Dec 8, 2025

api/.env.example (resolved)

crazywoola reviewed Dec 8, 2025

JohnJyong and others added 7 commits December 8, 2025 16:08

Merge branch 'main' into feat/support-multimodal-embedding

33544eb

# Conflicts: # api/controllers/console/datasets/datasets.py # api/controllers/console/datasets/datasets_segments.py # api/controllers/console/datasets/hit_testing_base.py

fix mypy

f75bf87

Update .env.example

107933c

Update .env.example

6e5749d

Update docker-compose.yaml

fde0b10

fix mypy

77c0768

fix mypy

02206dd

JohnJyong commented Dec 4, 2025 (edited by QuantumGhost)
