feat: add request translator for IndexEmbed to semantic_text #588

Merged
jhamon merged 3 commits into fts from
jhamon/sdk-105-request-translator-embed-to-semantic_text
Jan 29, 2026
@jhamon jhamon commented Jan 28, 2026

Summary

Add _translate_embed_to_semantic_text() method to PineconeDBControlRequestFactory that converts create_index_for_model parameters to the new API format with deployment + schema containing a semantic_text field type.

Translation example:

# User's call
pc.create_index_for_model(
    name="my-index",
    cloud="aws",
    region="us-east-1",
    embed=IndexEmbed(
        model="multilingual-e5-large",
        metric="cosine",
        field_map={"text": "synopsis"}
    )
)

# Translated to new API format
deployment = {"deployment_type": "serverless", "cloud": "aws", "region": "us-east-1"}
schema = {
    "fields": {
        "synopsis": {
            "type": "semantic_text",
            "model": "multilingual-e5-large",
            "metric": "cosine",
            "read_parameters": {"input_type": "query"},
            "write_parameters": {"input_type": "passage"}
        }
    }
}

Key features:

  • Converts cloud/region to ServerlessDeployment
  • Extracts field names from field_map values
  • Adds default read_parameters ({"input_type": "query"}) and write_parameters ({"input_type": "passage"}) if not provided
  • Supports both IndexEmbed objects and dict-based embed configurations
  • Handles enum values for cloud, region, and metric
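A minimal sketch of the translation described above, for readers who want to see the mapping end to end. This is not the code merged in this PR: `translate_embed_to_semantic_text` and `_to_str` are hypothetical stand-ins, the sketch only accepts a dict-based embed, and the use of `ValueError` and `deepcopy` are assumptions.

```python
from copy import deepcopy
from enum import Enum
from typing import Any, Dict, Tuple


def _to_str(value: Any) -> Any:
    """Hypothetical helper: unwrap Enum members, pass other values through."""
    return value.value if isinstance(value, Enum) else value


def translate_embed_to_semantic_text(
    cloud: Any, region: Any, embed: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical stand-in for the translator (dict-based embed only)."""
    model = embed.get("model")
    field_map = embed.get("field_map")
    if not model:
        raise ValueError("embed must include a model")  # error type is assumed
    if not field_map:
        raise ValueError("embed must include a non-empty field_map")

    # Fall back to the defaults listed above when parameters are not provided.
    read_params = embed.get("read_parameters")
    if read_params is None:
        read_params = {"input_type": "query"}
    write_params = embed.get("write_parameters")
    if write_params is None:
        write_params = {"input_type": "passage"}
    metric = embed.get("metric")

    deployment = {
        "deployment_type": "serverless",
        "cloud": _to_str(cloud),
        "region": _to_str(region),
    }
    schema: Dict[str, Any] = {"fields": {}}
    # The values of field_map become the schema field names.
    for target_field in field_map.values():
        field_config = {
            "type": "semantic_text",
            "model": model,
            "read_parameters": deepcopy(read_params),
            "write_parameters": deepcopy(write_params),
        }
        if metric is not None:
            field_config["metric"] = _to_str(metric)
        schema["fields"][target_field] = field_config
    return deployment, schema


deployment, schema = translate_embed_to_semantic_text(
    "aws",
    "us-east-1",
    {
        "model": "multilingual-e5-large",
        "metric": "cosine",
        "field_map": {"text": "synopsis"},
    },
)
```

With the inputs from the translation example, `schema["fields"]` ends up with a single `synopsis` entry carrying the default read and write parameters.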

Related Issues

  • Linear: SDK-105

Test Plan

  • Unit tests for basic IndexEmbed to semantic_text translation
  • Unit tests for dict-based embed configuration
  • Unit tests for custom read/write parameters
  • Unit tests for multiple field mappings
  • Unit tests for enum values (cloud, region, metric)
  • Unit tests for error cases (missing model, missing field_map, empty field_map)
  • Syntax validation passed
  • Ruff linting passed
  • Ruff formatting applied

Note: Full unit test execution is blocked on SDK-116 (re-enable CI for fts branch) due to OpenAPI model compatibility issues.
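For illustration, the three error cases in the test plan (missing model, missing field_map, empty field_map) could be exercised roughly as follows. `validate_embed` and `rejected` are hypothetical names, and raising `ValueError` is an assumption; the exception type and messages used by the SDK are not shown in this PR description.

```python
from typing import Any, Dict


def validate_embed(embed: Dict[str, Any]) -> None:
    """Hypothetical validator covering the three error cases in the test plan."""
    if not embed.get("model"):
        raise ValueError("embed.model is required")  # error type is assumed
    if "field_map" not in embed:
        raise ValueError("embed.field_map is required")
    if not embed["field_map"]:
        raise ValueError("embed.field_map must not be empty")


def rejected(embed: Dict[str, Any]) -> bool:
    """Return True if validate_embed raises ValueError for this config."""
    try:
        validate_embed(embed)
    except ValueError:
        return True
    return False


missing_model = rejected({"field_map": {"text": "synopsis"}})
missing_field_map = rejected({"model": "multilingual-e5-large"})
empty_field_map = rejected({"model": "multilingual-e5-large", "field_map": {}})
valid = rejected({"model": "multilingual-e5-large", "field_map": {"text": "synopsis"}})
```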


Note

Low Risk
Low risk: this introduces a new translation helper and tests without changing existing request-building behavior.

Overview
Adds PineconeDBControlRequestFactory._translate_embed_to_semantic_text() to convert cloud/region + IndexEmbed (or dict) into a serverless deployment dict and a schema containing one or more semantic_text fields derived from field_map, including default read_parameters/write_parameters and enum-to-string handling.

Adds unit tests covering basic translation, custom parameters, optional metric handling, multiple mapped fields (with independent parameter dict copies), enum inputs, and validation errors for missing/empty embed fields.

Written by Cursor Bugbot for commit 266687b.

Add _translate_embed_to_semantic_text() method to PineconeDBControlRequestFactory
that converts create_index_for_model parameters (cloud, region, embed) to the
new API format using deployment and schema with semantic_text field type.

Translation logic:
- cloud/region -> ServerlessDeployment
- IndexEmbed.model -> semantic_text field model
- IndexEmbed.metric -> semantic_text field metric (if provided)
- IndexEmbed.field_map values -> schema field names
- Default read_parameters: {"input_type": "query"}
- Default write_parameters: {"input_type": "passage"}

Refs: SDK-105
@jhamon jhamon added the enhancement New feature or request label Jan 28, 2026

schema_dict["fields"][target_field] = field_config

return deployment_dict, schema_dict

New function defined but never called in production code

Low Severity

The _translate_embed_to_semantic_text method is defined and has unit tests, but is never called from any production code path. The only references outside the definition itself are in the docstring example and test files. This is scaffolding code that hasn't been integrated into the actual request flow (e.g., create_index_for_model_request doesn't call it).

When multiple fields exist in field_map and custom read/write parameters
are provided, each field now gets an independent copy of the dictionary
instead of sharing the same reference.
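A small sketch of why the shared reference matters. The field names and the `truncate` key are illustrative only, and `deepcopy` is an assumption; the commit message does not say whether the fix uses a shallow or a deep copy.

```python
from copy import deepcopy

field_names = ["title", "synopsis"]

# Shared reference: every field points at the same dict object.
custom_read = {"input_type": "query", "truncate": "END"}
shared = {name: {"read_parameters": custom_read} for name in field_names}
shared["title"]["read_parameters"]["truncate"] = "NONE"
# The mutation is visible through the other field as well.
leaked = shared["synopsis"]["read_parameters"]["truncate"]

# Independent copies: each field gets its own dict.
custom_read = {"input_type": "query", "truncate": "END"}
independent = {
    name: {"read_parameters": deepcopy(custom_read)} for name in field_names
}
independent["title"]["read_parameters"]["truncate"] = "NONE"
# The other field keeps its original value.
isolated = independent["synopsis"]["read_parameters"]["truncate"]
```

In the shared case, `leaked` ends up as `"NONE"`; with copies, `isolated` stays `"END"`.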
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


# Include metric if provided
if metric is not None:
    field_config["metric"] = convert_enum_to_string(metric)

Redundant enum conversion call on already-converted metric

Low Severity

The convert_enum_to_string(metric) call at line 581 is redundant because metric is already guaranteed to be a string (or None) at this point. For IndexEmbed objects, the metric attribute is converted to a string in IndexEmbed.__init__ (line 62 of index_embed.py). For dict-based embeds, the conversion already happens at line 559. Calling convert_enum_to_string on an already-converted string value just returns it unchanged, adding unnecessary overhead.
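To illustrate the point, here is a sketch with a stand-in helper. `enum_to_string` and `Metric` are hypothetical; the sketch assumes the SDK's `convert_enum_to_string` unwraps Enum members and passes other values through unchanged, as the review comment describes.

```python
from enum import Enum


class Metric(Enum):
    """Hypothetical enum used only for this illustration."""

    COSINE = "cosine"


def enum_to_string(value):
    """Stand-in helper: unwrap Enum members, return other values unchanged."""
    return value.value if isinstance(value, Enum) else value


once = enum_to_string(Metric.COSINE)  # Enum member converted to its string value
twice = enum_to_string(once)          # second call is a no-op on a plain string
```

Under that assumption the second call changes nothing, so it is harmless but unnecessary.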

@jhamon jhamon merged commit fdb4ece into fts Jan 29, 2026
7 checks passed
@jhamon jhamon deleted the jhamon/sdk-105-request-translator-embed-to-semantic_text branch January 29, 2026 16:03