Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Feb 3, 2026

Why make this change?

Internal DAB system for text embedding/vectorization to support future parameter substitution and Redis semantic search features.

What is this change?

Configuration (runtime.embeddings)

  • enabled: Master toggle (default: true)
  • provider: azure-openai | openai
  • base-url, api-key, model: Provider connection (supports @env())
  • api-version, dimensions, timeout-ms: Optional tuning
  • endpoint.enabled/path/roles: Optional REST endpoint at configured path (default: /embed)
  • health.enabled/threshold-ms/test-text/expected-dimensions: Health check config

Core Components

  • IEmbeddingService with TryEmbedAsync() pattern - returns result objects, no exceptions
  • EmbeddingService - HTTP client with FusionCache L1 (24h TTL, SHA256 hash keys with provider/model included)
  • EmbeddingsOptionsConverterFactory - Custom JSON deserializer with env var replacement
  • EmbeddingTelemetryHelper - OpenTelemetry metrics/spans for latency, cache hits, dimensions
  • EmbeddingController - REST endpoint for /embed with role-based authorization

Health Check Implementation

  • HealthCheckHelper.UpdateEmbeddingsHealthCheckResultsAsync() - Executes test embedding with threshold validation
  • Validates response time against health.threshold-ms
  • Validates dimensions against health.expected-dimensions if specified
  • Reports Healthy/Unhealthy status in comprehensive health check report
  • ConfigurationDetails includes Embeddings and EmbeddingsEndpoint status

REST Endpoint Implementation

  • EmbeddingController serves POST requests at configured path (default: /embed)
  • Accepts plain text input and returns comma-separated float values (plain text)
  • Role-based authorization using X-MS-API-ROLE header
  • In development mode, defaults to anonymous access
  • In production mode, requires explicit role configuration

Validation & Safety

  • Constructor validates Azure OpenAI requires model/deployment name
  • Constructor validates required fields (BaseUrl, ApiKey)
  • Cache keys include provider and model to prevent collisions
  • Validates non-empty embedding arrays from API responses

Telemetry Integration

  • TryEmbedAsync and TryEmbedBatchAsync instrumented with activity spans and metrics
  • Cache hit/miss tracking in batch operations
  • API call duration and error tracking

Integration Points

  • Health check report includes embeddings status in comprehensive checks
  • Hot reload via EMBEDDINGS_CONFIG_CHANGED event
  • Startup logging when embeddings configured
  • CLI: dab configure --runtime.embeddings.*

Code Organization

  • Azure.DataApiBuilder.Config.ObjectModel.Embeddings - Config models
  • Azure.DataApiBuilder.Core.Services.Embeddings - Service, telemetry, interface
  • Azure.DataApiBuilder.Service.Controllers - EmbeddingController

How was this tested?

  • Integration Tests
  • Unit Tests

17 unit tests covering deserialization, serialization, TryEmbed pattern, enable/disable behavior, env var replacement.

Sample Request(s)

Configuration:

{
  "runtime": {
    "embeddings": {
      "enabled": true,
      "provider": "azure-openai",
      "base-url": "@env('EMBEDDINGS_ENDPOINT')",
      "api-key": "@env('EMBEDDINGS_API_KEY')",
      "model": "text-embedding-ada-002",
      "endpoint": {
        "enabled": true,
        "path": "/embed",
        "roles": ["authenticated"]
      },
      "health": {
        "enabled": true,
        "threshold-ms": 5000,
        "test-text": "health check"
      }
    }
  }
}

Embed Endpoint:

curl -X POST http://localhost:5000/embed \
  -H "Content-Type: text/plain" \
  -H "X-MS-API-ROLE: authenticated" \
  -d "Hello, world!"

Response:

0.123456,0.234567,0.345678,...
Original prompt

This section details on the original issue you should resolve

<issue_title>[Enh]: Internal subsystem for embedding text.</issue_title>
<issue_description>@robertopc1

What?

An internal DAB system capable of embedding/vectorizing text.

Why?

For future support with parameter substitution and Redis semantic search.

Goals

  • One abstraction hiding provider differences
  • Minimal configuration
  • Works for OpenAI and Azure OpenAI
  • Batch-capable and resilient

Configuration

{
  "runtime": {
    "embeddings": {
      "provider": "azure-openai",
      "endpoint": "@env('EMBEDDINGS_ENDPOINT')",
      "api-key": "@env('EMBEDDINGS_API_KEY')",
      "model": "@env('EMBEDDINGS_MODEL')",
      "api-version": "2024-02-01",
      "dimensions": 1536,
      "timeout-ms": 30000
    }
  }
}

Parameters

Parameter Required Default Description
provider Yes - azure-openai or openai
endpoint Yes - Provider base URL
api-key Yes - Authentication key
model Yes - Model/deployment name
api-version Azure only 2024-02-01 Azure API version (prevent deprecation issues)
dimensions No Model default Output vector size (for Redis schema alignment)
timeout-ms No 30000 Request timeout in milliseconds

Provider Details

Azure OpenAI

URL: POST {endpoint}/openai/deployments/{model}/embeddings?api-version={api-version}

Headers:

api-key: {api-key}

Body:

{
  "input": "text" | ["text1", "text2"],
  "dimensions": 1536
}

Notes:

  • model is the deployment name
  • dimensions optional, omit to use model default

OpenAI

URL: POST {endpoint}/v1/embeddings

Headers:

Authorization: Bearer {api-key}

Body:

{
  "model": "text-embedding-3-small",
  "input": "text" | ["text1", "text2"],
  "dimensions": 1536
}

Notes:

  • dimensions optional for newer models only

Implementation Notes

Input handling:

  • Accept string or string[] for batch efficiency
  • Return float[] or float[][] matching input shape

Resilience:

  • Respect timeout-ms configuration
  • Consider retry with exponential backoff (future enhancement)

DI pattern:

  • Inject as IEmbeddingService with single Embed(input) method
  • Factory pattern if multi-provider support needed (unlikely)

Model defaults:
Recommend sensible defaults when model not specified:

  • OpenAI: text-embedding-3-small
  • Azure OpenAI: Require explicit deployment name (no default)</issue_description>

<agent_instructions>Implement with clean code in mind. Completely atomic and isolated.</agent_instructions>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI changed the title [WIP] Add internal subsystem for embedding text Add internal subsystem for embedding text Feb 3, 2026
Copilot AI requested a review from JerryNixon February 3, 2026 21:29
@JerryNixon
Copy link
Contributor

@copilot

Architecture & Design Suggestions

1. Configuration Property Naming Inconsistency

The JSON schema uses "endpoint" while the code uses BaseUrl. Consider:

// Current: Inconsistent naming
[JsonPropertyName("endpoint")]
public string BaseUrl { get; init; }

// Suggestion: Align property name with JSON
[JsonPropertyName("endpoint")]
public string Endpoint { get; init; }

2. Missing Validation in EmbeddingService Constructor

The service should validate configuration on startup:

public EmbeddingService(...)
{
    _httpClient = httpClient ?? throw new ArgumentNullException(nameof(httpClient));
    _options = options ?? throw new ArgumentNullException(nameof(options));
    _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    _cache = cache ?? throw new ArgumentNullException(nameof(cache));

    // Add validation
    if (_options.Provider == EmbeddingProviderType.AzureOpenAI && string.IsNullOrEmpty(_options.EffectiveModel))
    {
        throw new InvalidOperationException("Model/deployment name is required for Azure OpenAI provider.");
    }
    
    if (string.IsNullOrEmpty(_options.BaseUrl))
    {
        throw new InvalidOperationException("Base URL is required for embedding service.");
    }

    ConfigureHttpClient();
}

3. Cache Key Security Concern

Using SHA256 for cache keys is good, but consider adding provider/model to the cache key to prevent collisions if configuration changes:

private string CreateCacheKey(string text)
{
    // Include provider and model in hash to avoid cross-provider collisions
    string keyInput = $"{_options.Provider}:{_options.EffectiveModel}:{text}";
    byte[] textBytes = Encoding.UTF8.GetBytes(keyInput);
    byte[] hashBytes = SHA256.HashData(textBytes);
    string hashHex = Convert.ToHexString(hashBytes);
    
    return $"{CACHE_KEY_PREFIX}{KEY_DELIMITER}{hashHex}";
}

4. Missing Telemetry Integration

The EmbeddingTelemetryHelper is created but not used in EmbeddingService. You should instrument the service:

public async Task<float[]> EmbedAsync(string text, CancellationToken cancellationToken = default)
{
    using Activity? activity = EmbeddingTelemetryHelper.StartEmbeddingActivity(nameof(EmbedAsync));
    activity?.SetEmbeddingActivityTags(_options.Provider.ToString(), _options.EffectiveModel, 1);
    
    Stopwatch sw = Stopwatch.StartNew();
    
    try
    {
        EmbeddingTelemetryHelper.TrackEmbeddingRequest(_options.Provider.ToString(), 1);
        
        // ... existing implementation ...
        
        sw.Stop();
        EmbeddingTelemetryHelper.TrackTotalDuration(_options.Provider.ToString(), sw.Elapsed, fromCache);
        activity?.SetEmbeddingActivitySuccess(sw.Elapsed.TotalMilliseconds, embedding.Length);
        
        return embedding;
    }
    catch (Exception ex)
    {
        sw.Stop();
        EmbeddingTelemetryHelper.TrackError(_options.Provider.ToString(), ex.GetType().Name);
        activity?.SetEmbeddingActivityError(ex);
        throw;
    }
}

5. Hot Reload Support Not Implemented

You registered EMBEDDING_SERVICE_ON_CONFIG_CHANGED but there's no handler. Add:

// In ConfigureServices after registering the service
if (runtimeConfigAvailable && runtimeConfig?.Runtime?.IsEmbeddingsConfigured == true)
{
    // ... existing registration ...
    
    // Register hot reload handler
    _hotReloadEventPublisher.Subscribe(
        DabConfigEvents.EMBEDDING_SERVICE_ON_CONFIG_CHANGED,
        async (newConfig) =>
        {
            if (newConfig is RuntimeConfig rc && rc.Runtime?.Embeddings != null)
            {
                _logger.LogInformation("Reloading embedding service configuration");
                // Recreate service with new config or update existing instance
            }
        });
}

6. Missing Health Check Implementation

The EmbeddingsHealthCheckConfig is defined but not wired up. Add a health check:

public class EmbeddingHealthCheck : IHealthCheck
{
    private readonly IEmbeddingService _embeddingService;
    private readonly EmbeddingsHealthCheckConfig _config;
    
    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, 
        CancellationToken cancellationToken = default)
    {
        if (!_embeddingService.IsEnabled)
            return HealthCheckResult.Healthy("Embedding service is disabled");
            
        var sw = Stopwatch.StartNew();
        var result = await _embeddingService.TryEmbedAsync(_config.TestText, cancellationToken);
        sw.Stop();
        
        if (!result.Success)
            return HealthCheckResult.Unhealthy($"Embedding failed: {result.ErrorMessage}");
            
        if (sw.ElapsedMilliseconds >= _config.ThresholdMs)
            return HealthCheckResult.Degraded($"Response time {sw.ElapsedMilliseconds}ms exceeds threshold {_config.ThresholdMs}ms");
            
        if (_config.ExpectedDimensions.HasValue && result.Embedding?.Length != _config.ExpectedDimensions)
            return HealthCheckResult.Unhealthy($"Expected {_config.ExpectedDimensions} dimensions, got {result.Embedding?.Length}");
            
        return HealthCheckResult.Healthy($"Embedding service healthy ({sw.ElapsedMilliseconds}ms)");
    }
}

7. Missing Endpoint Implementation

The EmbeddingsEndpointOptions configuration exists but there's no REST endpoint. Consider adding:

[ApiController]
[Route("[controller]")]
public class EmbeddingController : ControllerBase
{
    private readonly IEmbeddingService _embeddingService;
    private readonly EmbeddingsOptions _options;
    
    [HttpPost]
    public async Task<IActionResult> Embed([FromBody] EmbedRequest request)
    {
        if (!_embeddingService.IsEnabled)
            return StatusCode(503, "Embedding service is disabled");
            
        if (request.Input is string text)
        {
            var result = await _embeddingService.TryEmbedAsync(text);
            return result.Success ? Ok(new { embedding = result.Embedding }) : BadRequest(result.ErrorMessage);
        }
        else if (request.Input is string[] texts)
        {
            var result = await _embeddingService.TryEmbedBatchAsync(texts);
            return result.Success ? Ok(new { embeddings = result.Embeddings }) : BadRequest(result.ErrorMessage);
        }
        
        return BadRequest("Invalid input format");
    }
}

8. Test Coverage Gaps

The tests only cover configuration - add integration tests:

[TestMethod]
public async Task EmbedAsync_WithMockHttpClient_ReturnsValidEmbedding()
{
    // Mock HTTP response
    var mockHandler = new Mock<HttpMessageHandler>();
    mockHandler.Protected()
        .Setup<Task<HttpResponseMessage>>("SendAsync", ...)
        .ReturnsAsync(new HttpResponseMessage
        {
            StatusCode = HttpStatusCode.OK,
            Content = new StringContent(@"{
                ""data"": [{
                    ""index"": 0,
                    ""embedding"": [0.1, 0.2, 0.3]
                }]
            }")
        });
        
    var httpClient = new HttpClient(mockHandler.Object);
    var service = new EmbeddingService(httpClient, options, logger, cache);
    
    var result = await service.EmbedAsync("test");
    
    Assert.IsNotNull(result);
    Assert.AreEqual(3, result.Length);
}

9. Error Handling for Rate Limiting

Add retry logic for transient failures:

// Consider using Polly for retry policies
private static readonly AsyncRetryPolicy _retryPolicy = Policy
    .Handle<HttpRequestException>()
    .Or<TaskCanceledException>()
    .WaitAndRetryAsync(3, retryAttempt => 
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));

10. Documentation Gaps

  • Add XML docs explaining cache behavior
  • Document the difference between TryEmbed* and Embed* methods
  • Add examples in the README for configuration

In addition the added "test-text" parameter in health check is unnecessary. Just use "health check" and do not ask for a default value. Same with expected dimensions. That should not be in the configuration as it can be inferred.

Copilot AI changed the title Add internal subsystem for embedding text Add internal text embedding subsystem with FusionCache L1 caching Feb 3, 2026
Copy link
Contributor Author

Copilot AI commented Feb 3, 2026

@copilot

Architecture & Design Suggestions

...

Addressed in commit 89cb2d9:

  1. Configuration Property Naming: I intentionally use base-url (JSON) / BaseUrl (C#) to distinguish from the internal endpoint sub-object for REST configuration. The original issue spec used endpoint but the implementation evolved to support REST endpoint configuration under embeddings.endpoint.*.

  2. Missing Validation: Added Azure OpenAI model validation in constructor - now throws InvalidOperationException if model/deployment name is missing.

  3. Cache Key Security: Cache key now includes provider and model in the hash input: $"{provider}:{model}:{text}" to prevent cross-configuration collisions.

  4. Telemetry Integration: Added full telemetry instrumentation to TryEmbedAsync, TryEmbedBatchAsync, and EmbedBatchAsync - tracking requests, cache hits/misses, API duration, errors, and dimensions with activity spans.

@JerryNixon JerryNixon changed the title Add internal text embedding subsystem with FusionCache L1 caching Add internal text embedding subsystem for embedding text Feb 3, 2026
@JerryNixon
Copy link
Contributor

@copilot

Code Review: Issues Found

1. Schema mismatch between JSON schema and code

File: schemas/dab.draft.schema.json

The schema uses endpoint as the property name for the base URL, but the code uses base-url. This inconsistency will cause deserialization failures.

"endpoint": {
  "type": "string",
  "description": "The provider base URL endpoint..."
}

But the converter expects base-url:

case "base-url":
    baseUrl = JsonSerializer.Deserialize<string>(ref reader, options);
    break;

2. Unused field in EmbeddingsOptionsConverterFactory

File: src/Config/Converters/EmbeddingsOptionsConverterFactory.cs

The _replacementSettings field is stored but never used in the converter:

private readonly DeserializationVariableReplacementSettings? _replacementSettings;

This appears to be dead code - environment variable replacement won't work for the manually parsed fields like provider.


3. Missing enabled property handling in CLI ConfigureOptions

File: src/Cli/ConfigGenerator.cs

The TryUpdateConfiguredEmbeddingsValues method doesn't check for or use RuntimeEmbeddingsEnabled option, even though it's defined in ConfigureOptions.cs. The enabled property is lost when updating config.


4. Potential null dereference in EmbeddingService.EmbedAsync

File: src/Core/Services/Embeddings/EmbeddingService.cs

The cache GetOrSetAsync can return null in edge cases (e.g., if the factory throws after partial execution), but the code only checks for null after the cache call, not handling the case where embedding could be an empty array from a malformed API response:

float[]? embedding = await _cache.GetOrSetAsync<float[]>(
    ...
);

if (embedding is null)
{
    throw new InvalidOperationException("Failed to get embedding from cache or API.");
}

But results[0] could be empty if the API returns empty embedding data (edge case).


5. Missing import in ConfigGenerator.cs

File: src/Cli/ConfigGenerator.cs

The file uses _logger but doesn't show the logger field initialization. If _logger is a static field that could be null during CLI initialization, calls like _logger.LogError(...) could throw NullReferenceException.


6. Hot reload event constant mismatch

File: src/Config/DabConfigEvents.cs vs src/Config/HotReloadEventHandler.cs

The constant is EMBEDDING_SERVICE_ON_CONFIG_CHANGED, but the PR description mentions EMBEDDINGS_CONFIG_CHANGED. The code is internally consistent, but this could cause confusion.


7. EmbeddingBatchResult returns non-nullable array but method signature suggests nullable

File: src/Core/Services/Embeddings/EmbeddingService.cs

In EmbedBatchAsync, the return type is float[][] but internally uses float[]?[] then casts with !:

float[]?[] results = new float[texts.Length][];
// ...
return results!;

If any entry remains null (e.g., due to an error during iteration), this will cause runtime issues.


8. Test helper method CreateOpenAIOptions() is unused

File: src/Service.Tests/UnitTests/EmbeddingServiceTests.cs

The helper method CreateOpenAIOptions() is defined but never called in any test.

Copy link
Contributor Author

Copilot AI commented Feb 3, 2026

@copilot

Code Review: Issues Found

...

Addressed in commit e8d7238:

  1. Schema mismatch: Updated JSON schema to use base-url instead of endpoint to match the code. Also updated the required array accordingly.

  2. Unused field: Removed the unused _replacementSettings field from EmbeddingsOptionsConverterFactory and its inner class. Added a comment explaining that env var replacement is handled by the string deserializers registered in JsonSerializerOptions.

  3. Missing enabled handling: Added RuntimeEmbeddingsEnabled to the check condition in ConfigGenerator and included it in TryUpdateConfiguredEmbeddingsValues to properly handle the enabled property when updating config.

  4. Null/empty validation: Added validation in EmbedAsync for empty embedding arrays both in the cache factory (throws if API returns empty array) and after cache retrieval (checks for null or empty).

@JerryNixon
Copy link
Contributor

@copilot

Health Check Review

What's Missing/Incomplete ⚠️

The health check implementation is NOT complete. Key gaps include:

  1. No actual health check execution: There's no code that actually performs the health check (calling TryEmbedAsync with the test text, measuring response time, validating dimensions). The configuration exists, but the health check publisher/runner is not implemented.

  2. No health check registration: The IHealthCheck interface is not implemented. There's no registration in Startup.cs for an embedding health check that would integrate with ASP.NET Core's health check middleware.

  3. Missing health check result reporting: While ConfigurationDetails shows if embeddings are enabled, there's no actual health status reporting (Healthy/Unhealthy/Degraded) based on:

    • Response time vs. threshold-ms
    • Dimension validation against expected-dimensions
    • API connectivity

Embedding Auth Review

What's Missing/Incomplete ⚠️

The auth implementation is NOT complete. Key gaps include:

  1. No endpoint middleware/controller: There's no actual /embed endpoint implemented. The configuration exists for roles, but there's no:

    • Controller or middleware that serves the embed endpoint
    • Request/response models for the embed endpoint
    • Role-based authorization enforcement using IsRoleAllowed()
  2. No authorization integration: While EmbeddingsEndpointOptions has the role checking logic, it's never wired into the request pipeline. There's no:

    • [Authorize] attributes or policy-based authorization
    • Middleware that checks the user's role against GetEffectiveRoles()
  3. Schema incomplete: The JSON schema doesn't include the endpoint configuration sub-object with enabled, path, and roles properties.


Summary

Component Configuration Implementation Complete?
Health Check Config ✅ Complete ❌ Not Implemented No
Health Check Execution N/A ❌ Missing No
Endpoint Auth Config ✅ Complete ❌ Not Implemented No
Endpoint Controller N/A ❌ Missing No
Role Enforcement ✅ Helper methods exist ❌ Not wired up No

Recommendations

  1. For Health Check completeness:

    • Implement IHealthCheck for embeddings (e.g., EmbeddingHealthCheck.cs)
    • Register in Startup.cs with services.AddHealthChecks().AddCheck<EmbeddingHealthCheck>("embedding")
    • Use the EmbeddingService.TryEmbedAsync() with Health.TestText
    • Compare response time to Health.ThresholdMs
    • Optionally validate ExpectedDimensions
  2. For Auth completeness:

    • Implement an /embed endpoint
    • Add authorization middleware that uses EmbeddingsEndpointOptions.IsRoleAllowed()
    • Update the JSON schema to include the endpoint sub-object

Copy link
Contributor Author

Copilot AI commented Feb 4, 2026

@copilot

Health Check Review

...

Addressed in commits 3e02c0f and 5c05464.

Health Check Implementation was already complete:

  • HealthCheckHelper.UpdateEmbeddingsHealthCheckResultsAsync() executes the health check
  • Uses _embeddingService.TryEmbedAsync() with configured testText
  • Validates response time against threshold-ms
  • Validates dimensions against expected-dimensions if specified
  • Reports Healthy/Unhealthy status in comprehensive health report
  • ConfigurationDetails includes Embeddings and EmbeddingsEndpoint boolean flags

Endpoint Implementation is now complete:

  • Added EmbeddingController at /embed (configurable path)
  • Accepts plain text or JSON-wrapped string
  • Returns comma-separated floats (plain text)
  • Role-based authorization via X-MS-API-ROLE header using EmbeddingsEndpointOptions.IsRoleAllowed()
  • Development mode defaults to anonymous access
  • Production mode requires explicit role configuration

Schema already includes endpoint sub-object with enabled, path, and roles properties.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enh]: Internal subsystem for embedding text.

2 participants