IBX-9846: Describe Embeddings search API #3029

Open
dabrt wants to merge 4 commits into 5.0 from IBX-9846

@dabrt dabrt commented Jan 30, 2026

| Question | Answer |
|---|---|
| JIRA Ticket | IBX-9846 |
| Versions | 4.6, 5.0 |
| Edition | All |

Describe Embeddings search API

Checklist

  • Text renders correctly
  • Text has been checked with vale
  • Description metadata is up to date
  • PHP code samples have been fixed with PHP CS fixer
  • Added link to this PR in relevant JIRA ticket or code PR

github-actions bot commented Jan 30, 2026


## Validation

- [`Ibexa\Contracts\Core\Repository\Values\Content\QueryValidatorInterface`](/api/php_api/php_api_reference/classes/Ibexa-Contracts-Core-Repository-Values-Content-QueryValidatorInterface.html): Validates embedding queries and configurations are validated before reaching the search engine

Contributor commented:

It does not validate configuration. Maybe: "Validates embedding queries before they reach the search engine"?


Embeddings are numerical representations that capture the meaning of text, images, or other content.
Embeddings are generated by AI by converting words or documents into lists of numbers, instead of treating them as plain text.
Such lists, aka. vectors, can then be compared to find content with similar meaning.

Contributor commented:

Suggested change:

    - Such lists, aka. vectors, can then be compared to find content with similar meaning.
    + Such lists, a.k.a. vectors, can then be compared to find content with similar meaning.
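
To make "compared to find content with similar meaning" more concrete, here is a minimal sketch in plain PHP. It assumes cosine similarity as the comparison measure (the PR does not state which measure the search engine actually uses), and the three-dimensional vectors are made up for illustration. `cosineSimilarity()` is a hypothetical helper, not part of the Ibexa API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: cosine similarity between two equal-length vectors.
 * Values close to 1.0 mean the vectors point in a similar direction.
 *
 * @param float[] $a
 * @param float[] $b
 */
function cosineSimilarity(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA === 0.0 || $normB === 0.0) {
        throw new InvalidArgumentException('Vectors must not be all zeros.');
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Made-up embeddings; real ones typically have hundreds or thousands of dimensions.
$query = [0.9, 0.1, 0.0];
$similar = [0.8, 0.2, 0.1];
$unrelated = [0.0, 0.1, 0.9];

// The vector that points in nearly the same direction as the query scores higher.
var_dump(cosineSimilarity($query, $similar) > cosineSimilarity($query, $unrelated)); // bool(true)
```

In a real setup the vectors would come from an embedding model and the comparison would happen inside the search engine, so this only illustrates the idea behind the sentence under review.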


!!! note "Taxonomy suggestions"

Embedding queries have been introduced primarily to support the [Taxonomy suggestions](taxonomy.md#taxonomy-suggestions) feature but you use them in other scenarios.

Contributor commented:

Suggested change:

    - Embedding queries have been introduced primarily to support the [Taxonomy suggestions](taxonomy.md#taxonomy-suggestions) feature but you use them in other scenarios.
    + Embedding queries have been introduced primarily to support the [Taxonomy suggestions](taxonomy.md#taxonomy-suggestions) feature but you can use them in other scenarios.


Searching with embeddings can be combined with traditional search criteria and filters, which allows the semantic search to be constrained by content type, location, permissions, or other search criteria.

Contributor commented:

A little bit misleading: `EmbeddingQuery` doesn't support query criteria, only a filter. Maybe: "can be combined with filters (Criterion) ..."

An embedding query is represented by the `Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQuery` value object.
The object encapsulates the vector to search for, along with configuration such as the embedding model and similarity threshold.

Contributor commented:

`EmbeddingQuery` has no model or threshold fields. It only has embedding, filter, limit/offset, aggregations, and performCount.


``` php
use Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQueryBuilder;
use Ibexa\Contracts\Core\Repository\Values\Content\Embedding;
```

Contributor commented:

There is no such class as `Ibexa\Contracts\Core\Repository\Values\Content\Embedding`. Maybe you were thinking of `Ibexa\Contracts\Core\Repository\Values\Content\Query\Embedding`?

However, `Embedding` is abstract and expects `float[]`, so this snippet won't work.

Suggested change:

    - use Ibexa\Contracts\Core\Repository\Values\Content\Embedding;
    + use Ibexa\Contracts\Taxonomy\Search\Query\Value\TaxonomyEmbedding;

use Ibexa\Contracts\Core\Repository\Values\Content\Query\Aggregation;

// Create an embedding object that represents the search input
$embedding = new Embedding('Find content similar to this text');

Contributor commented:

This class accepts an array of floats, not a string:

    /**
     * @param float[] $value
     */
    public function __construct(array $value)
    {
        $this->value = $value;
    }

``` php
use Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQueryBuilder;
use Ibexa\Contracts\Core\Repository\Values\Content\Embedding;
use Ibexa\Contracts\Core\Repository\Values\Content\Query\Aggregation;
```

Contributor commented:

`Aggregation` is an interface and cannot be instantiated.

->setLimit(10) // maximum number of results
->setOffset(0) // result offset for pagination
->setPerformCount(true) // optionally count total matching items
->setAggregations([

Contributor commented:

We could remove aggregations from the snippet (to focus on embeddings), or use a concrete aggregation:

      ->setAggregations([
          new ContentTypeTermAggregation('count_by_type'),
      ])
