feat(azure-search): Add Azure AI Search instrumentation #3667
maheshbabugorantla wants to merge 41 commits into traceloop:main from
Add 23 semantic convention attributes for Azure AI Search instrumentation following OpenTelemetry GenAI specification patterns:

Core Attributes:
- azure_search.index_name, azure_search.search.text
- azure_search.search.top, azure_search.search.skip
- azure_search.search.filter, azure_search.search.query_type
- azure_search.document.count, azure_search.document.key
- azure_search.suggester_name, azure_search.analyzer_name

Indexer Pipeline Attributes:
- azure_search.indexer_name, azure_search.indexer.status
- azure_search.indexer.documents_processed/failed
- azure_search.data_source_name, azure_search.data_source.type
- azure_search.skillset_name, azure_search.skillset.skill_count

Response Attributes:
- azure_search.search.results_count
- azure_search.document.succeeded_count/failed_count
- azure_search.autocomplete.results_count
- azure_search.suggest.results_count
Implement AzureSearchInstrumentor with comprehensive method coverage:

SearchClient (10 methods):
- search, get_document, get_document_count
- upload_documents, merge_documents, delete_documents
- merge_or_upload_documents, index_documents
- autocomplete, suggest

SearchIndexClient (7 methods):
- create_index, create_or_update_index, delete_index
- get_index, list_indexes, get_index_statistics, analyze_text

SearchIndexerClient (18 methods):
- Indexer management: create/update/delete/get/run/reset indexers
- Data source connections: create/update/delete/get operations
- Skillset management: create/update/delete/get operations

Features:
- Full sync and async support (67 total instrumented methods)
- Response attribute capturing for search results and batch operations
- Span attributes for vector/hybrid/semantic search parameters
- @dont_throw decorator for graceful error handling
Add 51 tests covering all instrumentation functionality:

Unit Tests (33 tests):
- SearchClient span creation and attributes
- SearchIndexClient index management operations
- SearchIndexerClient pipeline operations
- Instrumentation lifecycle (instrument/uninstrument)
- Response attribute extraction

Integration Tests (18 tests):
- VCR cassette-based API call recording/replay
- Real SDK behavior verification

Test Configuration:
- VCR config with header filtering for api-key/authorization
- allow_playback_repeats for search iterator support
- InMemorySpanExporter for span verification
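The test configuration described above might look roughly like the following. This is a minimal sketch, not the PR's actual `conftest.py`: it assumes the pytest-vcr convention of a `vcr_config` fixture, shown here as a plain function so it runs without pytest installed.

```python
def vcr_config():
    """Return VCR.py options; with pytest-vcr this would be a pytest fixture."""
    return {
        # Strip credentials so they never land in recorded cassettes.
        "filter_headers": ["api-key", "authorization"],
        # Let a recorded interaction be replayed more than once, which
        # helps when a paged search iterator repeats a request.
        "allow_playback_repeats": True,
    }
```

In a real suite the function would typically be decorated with `@pytest.fixture(scope="module")`; the scope is an assumption, not something the commit message states.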
Configure Poetry package for opentelemetry-instrumentation-azure-search:

Dependencies:
- python >=3.9,<4
- opentelemetry-api ^1.38.0
- opentelemetry-instrumentation >=0.59b0
- opentelemetry-semantic-conventions-ai ^0.49.6
- azure-search-documents >=11.0.0

Dev Dependencies:
- pytest, pytest-vcr for testing
- vcrpy >=7.0.0 (fixed urllib3 compatibility)

Configuration:
- .python-version: 3.9.5
- poetry.toml: in-project virtualenvs
- OpenTelemetry instrumentor entry point registered
Add Azure AI Search as optional dependency in traceloop-sdk:

Installation:
- pip install 'traceloop-sdk[azure-search]'
- pip install 'traceloop-sdk[all]'

Usage:
- Traceloop.init() with auto-instrumentation
- Instruments.AZURE_SEARCH for explicit selection

Fixes traceloop#2303
Align with repository-wide migration from Poetry to UV:

Package Configuration:
- Update pyproject.toml to use [project] format with hatchling
- Add [tool.uv.sources] for local semconv-ai dependency
- Update .python-version to 3.10
- Replace poetry.lock with uv.lock

Nx Project Configuration:
- Update project.json to use nx:run-commands executor
- Replace poetry commands with uv commands
- Update lint target to use ruff instead of flake8

Test Fixes:
- Fix unused variable warnings from ruff linter
- All 51 tests passing
… and async support

- Add vector search attributes: vector_queries_count, vector_fields, k_nearest_neighbors, vector_exhaustive, vector_filter_mode
- Add semantic search attributes: semantic_configuration_name, query_caption, query_answer
- Add additional search attributes: search_mode, scoring_profile, select, search_fields
- Fix async wrapping: separate _async_wrap that properly awaits coroutines
- Add error handling: set StatusCode.ERROR with description on exceptions
- Refactor _wrap into _sync_wrap/_async_wrap with shared helpers
- Add 21 new tests covering all new functionality
- Add 12 new semantic convention attributes
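The async-wrapping fix can be illustrated with a small sketch. It is not the PR's code: `FakeSpan`, `sync_wrap`, `async_wrap`, and `wrap` are hypothetical names, and `FakeSpan` stands in for a real OpenTelemetry span so the example runs on the standard library alone. The point is that a coroutine has to be awaited inside the `try` block, otherwise a failure surfaces after the wrapper has returned and the span never records it.

```python
import asyncio
import inspect


class FakeSpan:
    """Hypothetical stand-in for an OpenTelemetry span (illustration only)."""

    def __init__(self):
        self.status = "UNSET"
        self.description = None

    def set_error(self, description):
        self.status = "ERROR"
        self.description = description


def sync_wrap(span, wrapped, *args, **kwargs):
    try:
        return wrapped(*args, **kwargs)
    except Exception as exc:
        span.set_error(str(exc))  # record the failure, then re-raise
        raise


async def async_wrap(span, wrapped, *args, **kwargs):
    try:
        # Awaiting here keeps the failure inside the try block.
        return await wrapped(*args, **kwargs)
    except Exception as exc:
        span.set_error(str(exc))
        raise


def wrap(span, wrapped, *args, **kwargs):
    """Use the async path for coroutine functions, else the sync path."""
    if inspect.iscoroutinefunction(wrapped):
        return async_wrap(span, wrapped, *args, **kwargs)
    return sync_wrap(span, wrapped, *args, **kwargs)


def demo():
    """Run a failing coroutine through wrap() and return the span status."""
    async def failing_call():
        raise ValueError("service unavailable")

    span = FakeSpan()
    try:
        asyncio.run(wrap(span, failing_call))
    except ValueError:
        pass
    return span.status
```

Calling the sync path on a coroutine function would return an un-awaited coroutine object without raising, which is the failure mode the separate `_async_wrap` in the commit is described as fixing.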
Add pytest-cov>=7.0.0 to the test dependency group and configure coverage settings in pyproject.toml to enable measuring test coverage for the azure_search instrumentation package.
Add TestSyncWrapDispatch class with tests exercising the full _sync_wrap dispatch pipeline for every method type: search, document CRUD, autocomplete, suggest, index management, indexer management, data source, and skillset operations. Each test verifies the correct span is created with expected attributes and that the wrapped function's return value passes through correctly.
…tions Add TestAttributeFunctionEdgeCases class testing boundary conditions for all attribute setter functions: positional args vs kwargs fallbacks, generator inputs, enum value extraction, missing/None values, non-standard types, and direct function invocations for search, document batch, index management, analyze_text, indexer, data source, and skillset operations.
Add TestUtilsAndLifecycle class covering the dont_throw decorator behavior (exception suppression, custom exception logger callbacks), the suppression key bypass path, _wrap function's sync/async delegation, and instrumentor uninstrument/reinstrument cycle.
Add TestRemainingCoverageGaps class targeting every uncovered branch: unconvertible documents (TypeError in list()), falsy fields/actions/index/indexer/data_source/skillset args, non-string non-object index names, missing analyzer names, ImportError during _instrument, Exception during _uninstrument, missing class on module, and dont_throw with no exception_logger configured.

149 tests, 100% statement and branch coverage across all 5 source files.
…ibutes

Add missing semantic convention attributes for Azure AI Search:
- Synonym map operations: name, synonyms_count
- Service statistics: document_count, index_count
- Extended vector search: query_kind, weight, oversampling
- Search parameters: facets, order_by
…r instrumentation

Extend instrumentation to cover the full Azure AI Search SDK surface:
- SearchIndexClient: synonym map CRUD (6 methods), get_service_statistics, list_index_names, get_synonym_map_names
- SearchIndexerClient: get_indexer_names, get_data_source_connection_names, get_skillset_names
- SearchIndexingBufferedSender: upload, merge, delete, merge_or_upload, index_documents, flush
- Async variants for all new methods
- Attribute extraction for synonym map name/count, service document/index counts, and response dispatch for new operations
…ONTENT Add should_send_content() utility gated by TRACELOOP_TRACE_CONTENT env var (default: enabled). Supports per-request override via OpenTelemetry context API with override_enable_content_tracing key. Truthy values: true, 1, yes, on.
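A content gate of this kind can be sketched as follows. This is an illustration under stated assumptions, not the PR's `should_send_content()`: the `override` parameter stands in for the value the real utility reads from the OpenTelemetry context under `override_enable_content_tracing`, and both the case-insensitive matching and the rule that a truthy override can only turn capture on are assumptions.

```python
import os

# Truthy spellings listed in the commit message.
TRUTHY = {"true", "1", "yes", "on"}


def should_send_content(override=None, environ=None):
    """Decide whether request/response content may be recorded on spans.

    `override` is a hypothetical stand-in for a per-request context value.
    `environ` lets callers inject a mapping instead of os.environ.
    """
    env = os.environ if environ is None else environ
    # Default is enabled when the variable is unset.
    raw = env.get("TRACELOOP_TRACE_CONTENT", "true")
    env_enabled = raw.strip().lower() in TRUTHY
    # Assumption: a truthy override enables capture even if the env var is off.
    return env_enabled or bool(override)
```

Passing `environ` explicitly keeps the function easy to test without mutating the process environment.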
… and service stats

Add 1174 lines of new unit tests covering:
- TestShouldSendContent: env var toggle, truthy/falsy values, context override
- TestContentCapture: get_document, autocomplete, suggest, upload/merge/delete documents, vector embeddings, index_documents, content disabled/re-enabled
- TestSynonymMap: create, get, list, update, delete with list-based synonyms
- TestServiceStatistics: document_count, index_count response attributes
- TestBufferedSender: method registration and db.system attribute
- TestNameOnlyListingMethods: list_index_names, get_indexer/datasource/skillset names, dispatch handling
…lient, and content capture

Extend integration test suite with 12 new tests and VCR cassettes:
- TestSynonymMapIntegration (6 tests): create, get, list, update, delete, list names, with proper setup/teardown and list-based synonyms constructor
- TestSearchIndexerClientIntegration (3 tests): get_indexer_names, get_data_source_connection_names, get_skillset_names; each creates resources before listing to ensure non-empty responses
- TestSearchIndexClientIntegration (2 tests): get_service_statistics, list_index_names
- TestSearchClientIntegration (1 test): content_disabled_no_content_attributes
- 12 new VCR cassettes recorded against live Azure AI Search service
- Fix version.py to match pyproject.toml (0.49.6 -> 0.51.1)
- Add PyPI badge to README matching other instrumentation packages
- Remove stale traceloop-sdk[azure-search] extras reference since azure-search is a required dependency, not an optional extra
📝 Walkthrough
Adds a new OpenTelemetry Azure AI Search instrumentation package with instrumentor, wrappers, utils, semantic conventions, extensive VCR-backed integration tests and fixtures, packaging/tooling and sample-app/SDK wiring, developer docs, and runtime warning utilities; includes content-capture gating and per-request overrides.
Sequence Diagram
sequenceDiagram
participant App as Application
participant Instr as AzureSearchInstrumentor
participant Wrapper as Instrumentation Wrapper
participant Tracer as OpenTelemetry Tracer
participant Client as Azure Search Client
participant Exporter as Span Exporter
App->>Instr: instrument()
Instr->>Wrapper: install wrappers (WRAPPED_METHODS)
App->>Client: call instrumented_method(...)
Client->>Wrapper: wrapped_method_call
Wrapper->>Tracer: start_as_current_span("azure_search.*")
Wrapper->>Wrapper: set request attributes
Wrapper->>Client: invoke original method
Client-->>Wrapper: response / exception
Wrapper->>Wrapper: set response attributes / status (maybe content)
Wrapper->>Tracer: end span
Tracer->>Exporter: export span
Wrapper-->>App: return result
Estimated Code Review Effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 failed (1 warning)
No actionable comments were generated in the recent review. 🎉
…BUTES_GUIDE

- Replace poetry commands with uv (pytest, ruff check)
- Update routing example to show _set_request_attributes() dispatcher pattern
- Add notes for _set_response_attributes() and content capture dispatchers
- Update document dates
…ExtractionEdgeCases
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/traceloop-sdk/README.md (1)
42-52: ⚠️ Potential issue | 🟡 Minor
Quick Start example uses deprecated OpenAI API (`openai.ChatCompletion.create`).
The `openai.ChatCompletion.create` class-based API was removed in `openai >= 1.0`. Since the current `openai` SDK is well past 1.0, this example will fail for most users.
Suggested update
```diff
 Traceloop.init(app_name="joke_generation_service")
+from openai import OpenAI
+
+client = OpenAI()

 @workflow(name="joke_creation")
 def create_joke():
-    completion = openai.ChatCompletion.create(
-        model="gpt-3.5-turbo",
+    completion = client.chat.completions.create(
+        model="gpt-4o-mini",
         messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
     )
     return completion.choices[0].message.content
```
🤖 Fix all issues with AI agents
In `@packages/opentelemetry-instrumentation-azure-search/docs/SPAN_ATTRIBUTES_GUIDE.md`:
- Around line 623-641: The example in _set_analyze_text_attributes leaves analyzer_name uninitialized when analyze_request is falsy, causing a NameError; initialize analyzer_name to None at the start (before the if analyze_request block) or set it from kwargs first, then overwrite from analyze_request if present, ensuring subsequent checks and the enum handling use a defined variable (reference: function _set_analyze_text_attributes and symbol analyzer_name).
- Around line 509-517: Replace all occurrences of "poetry run <command>" with "uv run <command>" in the documentation snippet (specifically the pytest and flake8 examples in SPAN_ATTRIBUTES_GUIDE.md around the pytest example and the later reference at line ~658); update the two example blocks so they use "uv run pytest tests/test_azure_search_integration.py --record-mode=all" and "uv run pytest ... --record-mode=none" and change any "poetry run flake8" mention to "uv run flake8" to comply with the project's package manager convention.

In `@packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py`:
- Around line 340-353: The helper _set_document_batch_attributes currently converts non-__len__ document iterables to a list to measure length (setting SpanAttributes.AZURE_SEARCH_DOCUMENT_COUNT), which can exhaust generators before the actual call in _sync_wrap/_set_request_attributes; update _set_document_batch_attributes to only set the document count when the documents object has a __len__ attribute and otherwise skip counting (do not call list() or otherwise consume the iterable) so generators/iterators are left intact for the downstream Azure SDK call.

In `@packages/opentelemetry-instrumentation-azure-search/README.md`:
- Around line 3-5: The README's badge image tag is missing alt text causing an accessibility lint (MD045); update the <img> element for the PyPI badge (the <img> tag shown in the diff) to include a descriptive alt attribute (e.g., alt="PyPI package: opentelemetry-instrumentation-azure-search") so screen readers can convey the badge purpose.

In `@packages/opentelemetry-instrumentation-azure-search/tests/test_azure_search_integration.py`:
- Around line 801-817: The TestSearchIndexerClientIntegration class uses INTEGRATION_TEST_INDEX (e.g., in calls with target_index_name=INTEGRATION_TEST_INDEX) but lacks the setup_test_index fixture that ensures the index exists; add a fixture named setup_test_index (matching the other test classes' implementation) or an autouse/class-scoped fixture that creates or validates the index before tests run so SearchIndexerClient operations don't fail when this class runs in isolation.

In `@packages/traceloop-sdk/pyproject.toml`:
- Line 31: Add a missing `[tool.uv.sources]` entry for the package name "opentelemetry-instrumentation-azure-search": locate the `[tool.uv.sources]` section and insert an entry mapping that package name to its local/editable source directory (the same pattern used for the other instrumentation deps), place it in alphabetical order with the other entries, and ensure the key exactly matches "opentelemetry-instrumentation-azure-search" so local editable resolution works during development.

In `@packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py`:
- Around line 600-605: The VERTEXAI and VOYAGEAI branches still use the old pattern; change them to assign the init result to the same `result` variable (e.g., `result = init_vertexai_instrumentor(should_enrich_metrics, base64_image_uploader)` and `result = init_voyageai_instrumentor()`), then set `instrument_set = True` only when `result` is truthy so the existing post-loop warning logic that checks `result` (used elsewhere) will run consistently for `init_vertexai_instrumentor` and `init_voyageai_instrumentor`.

🧹 Nitpick comments (14)
packages/opentelemetry-instrumentation-azure-search/pyproject.toml (1)
30-46: Consider removing `autopep8` from dev dependencies since Ruff is already configured.
Having both `autopep8` and `ruff` for formatting/linting is redundant. Other instrumentation packages in this repo typically rely solely on Ruff. As per coding guidelines, Ruff is the designated linter.
Also, `pytest-asyncio` may need an `asyncio_mode` configuration (e.g., `auto` or `strict`) in a `[tool.pytest.ini_options]` section to avoid warnings or unexpected behavior with async tests.
Proposed changes

     [dependency-groups]
     dev = [
    -  "autopep8>=2.2.0,<3",
       "pytest-sugar==1.0.0",
       "pytest>=8.2.2,<9",
       "ruff>=0.4.0",
     ]

And add:

    [tool.pytest.ini_options]
    asyncio_mode = "auto"

packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/utils.py (1)
23-45: Missing `@functools.wraps(func)` on the inner wrapper.
Without `functools.wraps`, the decorated functions lose their `__name__`, `__doc__`, and `__module__` attributes, which degrades debuggability and introspection. The `func.__name__` reference in the logger still works (it captures from the closure), but any external code inspecting the decorated function will see `"wrapper"` instead of the original name.
Proposed fix

    +import functools
     import logging
     import os
     import traceback

     def dont_throw(func):
         """..."""
         logger = logging.getLogger(func.__module__)

    +    @functools.wraps(func)
         def wrapper(*args, **kwargs):
             try:
                 return func(*args, **kwargs)
             except Exception as e:
                 logger.debug(
                     "OpenLLMetry failed to trace in %s, error: %s",
                     func.__name__,
                     traceback.format_exc(),
                 )
                 if Config.exception_logger:
                     Config.exception_logger(e)

         return wrapper

packages/traceloop-sdk/traceloop/sdk/utils/instrumentation_warnings.py (1)
46-73: The function only acts when `target_library_installed=True`, making the parameter a bit misleading.
The function name `warn_missing_instrumentation` and its `target_library_installed` parameter might confuse callers: it only logs when the library is installed (suggesting the user should add the instrumentation extra). The `if target_library_installed:` guard on line 68 is the only code path that does anything. Consider whether the parameter should be checked by the caller instead, or rename for clarity.
This is minor: the current usage in `tracing.py` passes `target_library_installed=True` explicitly, which works correctly.
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py (2)
36-103: Dispatch functions `_set_request_attributes` and `_set_response_attributes` are not guarded with `@dont_throw`.
While the individual helper functions they call are protected, the dispatch functions themselves could theoretically throw (e.g., if `method` comparison or delegation fails unexpectedly). More importantly, an exception in these dispatchers would propagate up to `_sync_wrap`/`_async_wrap`, set the span to ERROR status, and re-raise, effectively breaking the user's call because of instrumentation. Adding `@dont_throw` would make the instrumentation fully transparent.
Proposed fix

    +@dont_throw
     def _set_request_attributes(span, method, instance, args, kwargs):
         """Set all pre-call span attributes based on the method being called."""

    +@dont_throw
     def _set_response_attributes(span, method, response, args, kwargs):
         """Set all post-call span attributes from the response."""

Also applies to: 105-131

722-753: Content attributes for documents are unbounded: large batches could create oversized spans.
Both `_set_document_batch_request_content_attributes` (line 729) and `_set_index_documents_request_content_attributes` (line 748) iterate over all documents/actions, setting a span attribute per item. Similarly, the response content functions (lines 803, 830) iterate all results. For large batches (e.g., thousands of documents), this could produce spans with thousands of attributes, causing performance issues or exceeding backend limits.
Consider adding a reasonable cap (e.g., first 100 items) consistent with how other instrumentation packages handle content capture.
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/__init__.py (2)
724-727: Use lazy `%`-style formatting in `logger.debug`.
f-strings are eagerly evaluated even when the log level is above DEBUG. Use `%`-formatting for log calls to defer string construction.
Proposed fix

    -            logger.debug(f"Could not wrap {module}.{wrap_object}.{wrap_method}")
    +            logger.debug(
    +                "Could not wrap %s.%s.%s", module, wrap_object, wrap_method
    +            )

20-693: Consider generating sync/async method lists programmatically to reduce ~600 lines of duplication.
Each async list is an exact mirror of its sync counterpart with `.aio` appended to the module path. A helper function could derive one from the other, eliminating the risk of the two drifting out of sync.
Example approach

    def _make_async_methods(sync_methods):
        return [
            {**m, "module": m["module"] + ".aio"} if ".aio" not in m["module"] else m
            for m in sync_methods
        ]

    # Replace the explicit async lists:
    ASYNC_SEARCH_CLIENT_METHODS = _make_async_methods(SEARCH_CLIENT_METHODS)
    ASYNC_SEARCH_INDEX_CLIENT_METHODS = _make_async_methods(SEARCH_INDEX_CLIENT_METHODS)
    ASYNC_SEARCH_INDEXER_CLIENT_METHODS = _make_async_methods(SEARCH_INDEXER_CLIENT_METHODS)
    ASYNC_BUFFERED_SENDER_METHODS = _make_async_methods(BUFFERED_SENDER_METHODS)

packages/opentelemetry-instrumentation-azure-search/tests/cassettes/test_azure_search_integration/TestSearchIndexClientIntegration.test_get_service_statistics.yaml (1)
1-80: Two identical interactions recorded; verify this is intentional.
The cassette contains two identical GET requests to the `servicestats` endpoint. If the test only calls `get_service_statistics` once, the second interaction is unused and could be trimmed. If the test calls it twice (e.g., to verify idempotency or span creation), this is fine.
packages/opentelemetry-instrumentation-azure-search/docs/SPAN_ATTRIBUTES_GUIDE.md (1)
21-21: Add a language specifier to the fenced code block.
Flagged by markdownlint (MD040). Since this block represents a directory tree, use a `text` or `plaintext` language identifier.

    -```
    +```text
     packages/opentelemetry-instrumentation-azure-search/

packages/opentelemetry-instrumentation-azure-search/tests/conftest.py (2)
15-16: Hardcoded Azure Search endpoint leaks infrastructure naming.
The fallback `"https://traceloop-otel-os.search.windows.net"` exposes the actual Azure resource name used for recording. Consider using a more obviously-fake placeholder (e.g., `https://placeholder.search.windows.net`) and adjusting VCR cassette host matching accordingly, or document that this must match the cassette recordings.
52-54: Redundant exporter clearing with per-class fixtures in integration tests.
This `clear_exporter` fixture is `autouse=True` at function scope, so it already runs before every test. Each test class in `test_azure_search_integration.py` also defines its own `clear_exporter_before_test` fixture that does the same thing. The duplication is harmless but unnecessary; consider removing the per-class fixtures from the integration test file, or vice versa.
packages/opentelemetry-instrumentation-azure-search/tests/test_azure_search_integration.py (3)
45-78: Duplicated `setup_test_index` fixture across test classes.
`TestSearchClientIntegration.setup_test_index` (lines 45–78) and `TestSearchIndexClientIntegration.setup_test_index` (lines 386–419) are identical. Consider extracting this into a shared fixture in `conftest.py` (scoped to `session` or `module`) to reduce duplication and ensure the index is set up once for all classes.
Also applies to: 386-419

148-149: Move `import time` to module level.
`import time` appears inside three different test methods. While this works, placing it at the module level with the other imports is cleaner.
Also applies to: 288-289, 325-326

825-833: Duplicated placeholder connection string.
The same Azure Storage placeholder connection string appears twice (lines 828–831 and 876–879). Extract it to a module-level constant for maintainability.
Proposed fix
Add near the top of the file:

    PLACEHOLDER_STORAGE_CONNECTION_STRING = (
        "DefaultEndpointsProtocol=https;AccountName=placeholder;"
        "AccountKey=placeholder;EndpointSuffix=core.windows.net"
    )

Then reference it in the fixtures:

     connection_string=os.environ.get(
         "AZURE_STORAGE_CONNECTION_STRING",
    -    "DefaultEndpointsProtocol=https;AccountName=placeholder;AccountKey=placeholder;EndpointSuffix=core.windows.net",
    +    PLACEHOLDER_STORAGE_CONNECTION_STRING,
     ),

Also applies to: 873-881
    elif instrument == Instruments.VERTEXAI:
        if init_vertexai_instrumentor(should_enrich_metrics, base64_image_uploader):
            instrument_set = True
    elif instrument == Instruments.VOYAGEAI:
        if init_voyageai_instrumentor():
            instrument_set = True
VERTEXAI and VOYAGEAI branches not converted to the new result pattern.
These two branches still use the old `if init_*(): instrument_set = True` pattern instead of `result = init_*()`, which means they bypass the post-loop warning logic on lines 622-626. If the instrumentor fails but the target library is installed, no missing-instrumentation warning will be emitted for these two.
Proposed fix
     elif instrument == Instruments.VERTEXAI:
    -    if init_vertexai_instrumentor(should_enrich_metrics, base64_image_uploader):
    -        instrument_set = True
    +    result = init_vertexai_instrumentor(should_enrich_metrics, base64_image_uploader)
     elif instrument == Instruments.VOYAGEAI:
    -    if init_voyageai_instrumentor():
    -        instrument_set = True
    +    result = init_voyageai_instrumentor()

🤖 Prompt for AI Agents
In `@packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py` around lines 600 -
605, The VERTEXAI and VOYAGEAI branches still use the old pattern; change them
to assign the init result to the same `result` variable (e.g., `result =
init_vertexai_instrumentor(should_enrich_metrics, base64_image_uploader)` and
`result = init_voyageai_instrumentor()`), then set `instrument_set = True` only
when `result` is truthy so the existing post-loop warning logic that checks
`result` (used elsewhere) will run consistently for `init_vertexai_instrumentor`
and `init_voyageai_instrumentor`.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@packages/opentelemetry-instrumentation-azure-search/docs/SPAN_ATTRIBUTES_GUIDE.md`:
- Around line 21-34: The fenced code block that lists the package tree (starting
with "packages/opentelemetry-instrumentation-azure-search/") is missing a
language identifier; update the Markdown fenced block to include a language
(e.g., add "text" after the opening ```), so the block becomes ```text ... ```
to satisfy markdownlint and improve rendering.
…atch size Previously, _set_document_batch_attributes() called list() on documents without __len__, which would exhaust generators/iterators before the actual Azure SDK call received them. Now we simply skip the count for non-sized iterables, preserving the user's data intact.
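The difference between the old and new behavior can be sketched as below. `document_count_or_none` is a hypothetical helper name, not the function in `wrapper.py`; it only illustrates checking for `__len__` instead of calling `list()` on the input.

```python
def document_count_or_none(documents):
    """Return len(documents) for sized inputs, else None without iterating."""
    if documents is not None and hasattr(documents, "__len__"):
        return len(documents)
    # Generators and other one-shot iterators are left untouched so the
    # downstream SDK call still receives every item.
    return None


def make_batch():
    """Yield two example documents lazily (hypothetical data)."""
    yield {"id": "1"}
    yield {"id": "2"}
```

With this shape, `document_count_or_none(make_batch())` returns `None` and the generator can still be consumed in full afterwards, whereas a list or tuple reports its length as before.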
…tegration tests Ensures the target index exists before indexer tests run, matching the pattern used by SearchClient and SearchIndexClient test classes.
- Add alt text to PyPI badge image (MD045)
- Add language identifier to fenced code block (MD040)
- Initialize analyzer_name before conditional in example to prevent NameError
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In
`@packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py`:
- Around line 506-510: The synonyms count logic incorrectly treats an empty
string as one item; in the block that sets synonyms_count (the branch handling
"if synonyms:" and the elif for isinstance(synonyms, str)), change the string
path to first compute s = synonyms.strip(), then if not s set synonyms_count = 0
else set synonyms_count = len(s.split("\n")). Update the code that assigns
synonyms_count for string inputs accordingly (refer to the synonyms variable and
synonyms_count assignment in wrapper.py) so empty/whitespace-only synonym
strings yield a count of 0.
- Around line 449-468: The code in _set_data_source_attributes assigns
data_source_type from getattr(data_source, "type", None) but may receive a
SearchIndexerDataSourceType enum; convert it to a primitive string before
passing to _set_span_attribute. Update _set_data_source_attributes so after
obtaining data_source_type it normalizes enums (e.g., use data_source_type.value
if present, otherwise str(data_source_type) or None) and then call
_set_span_attribute(span, SpanAttributes.AZURE_SEARCH_DATA_SOURCE_TYPE,
data_source_type) with that normalized string.
In
`@packages/opentelemetry-instrumentation-azure-search/tests/test_azure_search_integration.py`:
- Around line 351-372: The test test_content_disabled_no_content_attributes
currently calls search_client.get_document_count(), which never emits content
attributes, so replace that call with an operation that normally produces
content attributes (for example call search_client.search(...) or
search_client.get_document(...) or search_client.upload_documents(...)) while
keeping the monkeypatch of TRACELOOP_TRACE_CONTENT="false"; this ensures the
test exercises the content suppression gate and then assert on spans as before.
Use the existing symbols test_content_disabled_no_content_attributes,
TRACELOOP_TRACE_CONTENT, and one of search_client.search,
search_client.get_document, or search_client.upload_documents to implement the
change.
- Around line 858-873: The setup currently calls create_data_source_connection
and then create_indexer (creating ds_connection and indexer) but if
create_indexer fails the data source is never removed; wrap the resource setup
and test assertions in a guaranteed teardown block: after creating ds_connection
and indexer (or attempting to), ensure cleanup of the
SearchIndexerDataSourceConnection and SearchIndexer happens in a finally (or by
registering addCleanup/addfinalizer) so deletion of ds_connection and indexer
always runs regardless of failures in create_indexer or assertions.
🧹 Nitpick comments (3)
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py (2)
714-726: Unbounded per-document span attributes may hit OTel limits or degrade performance.
The content-capture loops (here and in `_set_index_documents_request_content_attributes`, `_set_document_batch_response_content_attributes`, `_set_index_documents_response_content_attributes`, `_set_autocomplete_content_attributes`, `_set_suggest_content_attributes`) iterate over the entire collection without a cap. The default OTel SDK `SpanLimits.max_number_of_attributes` is 128; attributes beyond that are silently dropped. For large batches (thousands of documents), this generates many JSON strings that are never recorded, wasting CPU and memory on the hot path.
Consider adding a configurable cap (e.g., first 50 items) and recording the total count separately so users know data was truncated.
Illustrative cap pattern

    +MAX_CONTENT_ITEMS = 50  # configurable cap for per-item span attributes
    +
     @dont_throw
     def _set_document_batch_request_content_attributes(span, args, kwargs):
         """Set indexed db.query.result.N.document attributes for each input document."""
         documents = kwargs.get("documents") or (args[0] if args else None)
         if not documents:
             return
         for i, doc in enumerate(documents):
    +        if i >= MAX_CONTENT_ITEMS:
    +            break
             _set_span_attribute(
                 span,
                 f"{EventAttributes.DB_QUERY_RESULT_DOCUMENT.value}.{i}",
                 _safe_json_dumps(doc),
             )

789-839: Duplicated response content extraction logic.
`_set_document_batch_response_content_attributes` and `_set_index_documents_response_content_attributes` share identical loop bodies (extracting key, succeeded, status_code, error_message and setting indexed attributes). The same applies to their non-content counterparts at Lines 547–590. A shared helper accepting the results list would reduce ~40 lines of duplication.
packages/opentelemetry-instrumentation-azure-search/tests/test_azure_search_integration.py (1)
37-101: Consider extracting duplicated fixtures intoconftest.pyor a base class.
index_client_setup,setup_test_index,clear_exporter_before_test, andindex_clientare duplicated nearly verbatim across all four test classes. Moving them to session- or module-scoped fixtures inconftest.pywould reduce ~120 lines of boilerplate and make it easier to keep the setup logic consistent.
…ring - Convert SearchIndexerDataSourceType enum to string before setting span attribute, matching the pattern used for query_type and vector_filter_mode enums - Fix edge case where empty synonyms string incorrectly reports count of 1 instead of 0
…and content gate - Use get_document() instead of get_document_count() in content-disabled test so it actually exercises the content suppression gate - Wrap indexer client test assertions in try/finally to guarantee resource cleanup even when assertions fail - Update VCR cassette for content-disabled test
…ead of IndexDocumentsResult The SDK's index_documents() returns a plain list of IndexingResult, not an IndexDocumentsResult object. The response handler was checking response.results which returned None, silently skipping succeeded/failed count attributes.
…ertions - Add helper functions (_get_only_span, _assert_base_span, _span_attrs) to eliminate duplication and enforce consistent assertions - Assert StatusCode.OK on every span via _assert_base_span - Verify response attribute values (succeeded_count, failed_count, results_count) against actual VCR cassette data - Parse and verify content capture attribute values via json.loads() instead of existence-only checks - Remove redundant per-class clear_exporter fixtures (conftest handles it) - Remove unnecessary time.sleep() calls (no-op during VCR playback) - Remove try/except around SDK calls (VCR playback is deterministic) - Use try/finally for guaranteed cleanup in create/delete test pairs - Add factory functions for client construction
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In
`@packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py`:
- Around line 412-428: The span attribute helpers
(_set_indexer_management_attributes, _set_skillset_attributes, and
_set_data_source_attributes) currently pass through the raw first-arg for
non-create operations which may be an object; update their non-create branch to
mirror the synonym-map/name resolution logic used elsewhere (i.e., if the
resolved value is not a string, attempt to extract a .name attribute from the
object) so that when callers pass an indexer/skillset/data_source object you set
the AZURE_SEARCH_*_NAME attribute to the object's name rather than the object
itself.
- Around line 730-735: The loop that sets one span attribute per document (see
the enumerate(documents) usage in the response helpers) must be limited to avoid
unbounded attributes; define a constant max (e.g., MAX_SPAN_ATTR_ITEMS = 100)
and only iterate up to that cap when setting attributes in functions like
_set_document_batch_request_content_attributes,
_set_index_documents_request_content_attributes,
_set_autocomplete_content_attributes, _set_suggest_content_attributes, and the
response helpers (the enumerate(documents) loop shown); if documents were
truncated, set an additional span attribute indicating truncation (e.g.,
"<EventAttributes>.<type>.truncated_count") and emit a debug/warning log when
truncation occurs so operators can see when batches exceeded the cap.
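A minimal sketch of the capped loop requested above, under stated assumptions: RecordingSpan is a stand-in for a real span, set_capped_document_attributes is a hypothetical helper name, and the `.truncated_count` suffix is only one possible spelling of the truncation attribute. It is not the code in wrapper.py.

```python
import json
import logging

logger = logging.getLogger(__name__)

MAX_SPAN_ATTR_ITEMS = 100  # cap value suggested in the comment above


class RecordingSpan:
    """Stand-in for an OTel span; stores attributes in a plain dict."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


def set_capped_document_attributes(
    span, documents, prefix="db.query.result.document", cap=MAX_SPAN_ATTR_ITEMS
):
    """Record at most `cap` documents and note how many were left out."""
    recorded = 0
    for i, doc in enumerate(documents):
        if i >= cap:
            break  # stop before serializing anything past the cap
        span.set_attribute(f"{prefix}.{i}", json.dumps(doc, default=str))
        recorded += 1

    # Only sized collections can report how many items were dropped;
    # generators are left alone so they are not consumed just to count.
    if hasattr(documents, "__len__") and len(documents) > cap:
        dropped = len(documents) - cap
        span.set_attribute(f"{prefix}.truncated_count", dropped)
        logger.debug("Truncated %d items for %s", dropped, prefix)
    return recorded
```

Breaking out of the loop before serialization is what saves the CPU cost on large batches; the truncation attribute is only set when the input exposes a length.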
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py (1)
575-599: _set_index_documents_response_attributes duplicates _set_document_batch_response_attributes.
The succeeded/failed counting logic (Lines 575-599) is nearly identical to Lines 553-572. The only difference is the initial response unwrapping (list vs object with .results). Same applies to the content variants at Lines 798-852 vs 824-852. A small shared helper would reduce this duplication.
Sketch: shared helper
def _count_indexing_results(results):
    """Count succeeded/failed from a list of IndexingResult."""
    succeeded = sum(1 for r in results if getattr(r, "succeeded", False))
    return succeeded, len(results) - succeeded


def _extract_results_list(response):
    """Normalize response to a list of IndexingResult."""
    if isinstance(response, list):
        return response
    results = getattr(response, "results", None)
    return results if isinstance(results, list) else None
…gging validation Add 14 workflow tests (7 sync + 7 async) that validate traces tell a debuggable story for 2am production troubleshooting: - search_pipeline: upload→search trace correlation with content attrs - document_lifecycle: 5-step CRUD audit trail with mutation tracking - typeahead_pipeline: autocomplete+suggest results counts and content - bulk_ingestion_partial_failure: per-document metadata with error details - index_management_pipeline: 5-span deployment pipeline correlation - content_privacy_across_pipeline: zero content attrs when disabled - error_then_retry_success: StatusCode.ERROR + OK in same trace
- Add object-to-name resolution for non-create indexer, data source, and skillset operations (mirrors synonym map pattern) - Add configurable content item cap via TRACELOOP_TRACE_CONTENT_MAX_ITEMS env var (default: 100) to avoid exceeding OTel SpanLimits on large batches - Refactor duplicated indexing result content logic into shared helper - Fix doc example in SPAN_ATTRIBUTES_GUIDE.md to not consume generators
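The two behaviours named in this commit message could look roughly like the sketch below. The env var name and the default of 100 come from the message; the function bodies, the fallback on invalid input, and the name resolve_name are assumptions for illustration, not the code in the PR.

```python
import os
from types import SimpleNamespace

DEFAULT_MAX_ITEMS = 100  # default stated in the commit message


def max_content_items(environ=os.environ):
    """Read the per-span item cap, falling back to the default on bad input."""
    raw = environ.get("TRACELOOP_TRACE_CONTENT_MAX_ITEMS")
    if raw is None:
        return DEFAULT_MAX_ITEMS
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_ITEMS
    # Assumption: zero or negative values are treated as "not configured".
    return value if value > 0 else DEFAULT_MAX_ITEMS


def resolve_name(value):
    """Return a string name whether the caller passed a name or an object."""
    if isinstance(value, str):
        return value
    name = getattr(value, "name", None)
    return name if isinstance(name, str) else None


# Usage: an SDK model object is simulated here with SimpleNamespace.
indexer = SimpleNamespace(name="hotels-indexer")
resolved = resolve_name(indexer)
```

Passing the environment mapping as a parameter keeps the reader easy to test without mutating os.environ.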
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In
`@packages/opentelemetry-instrumentation-azure-search/docs/SPAN_ATTRIBUTES_GUIDE.md`:
- Around line 127-138: The example currently consumes generators by calling
list(documents) in the else-branch (variable documents), which can exhaust
iterators; remove that else-block or replace the logic to only count when
__len__ exists. Update the snippet around documents to follow the safe pattern
used later: if documents and hasattr(documents, "__len__") then set count and
call _set_span_attribute(span, ATTR_DOCUMENT_COUNT, count); otherwise skip
counting and do not convert the iterable to a list.
- Around line 432-445: The "Good" example currently consumes iterators by
calling len(list(documents)); change it to "✅ Good - safe type handling that
doesn't consume iterators" and only use len(documents) when hasattr(documents,
'__len__') — otherwise skip counting (set count = None) to avoid exhausting
generators; also update the "❌ Bad" example to show the consuming pattern (e.g.,
converting documents to list and then len) and label it as "❌ Bad - consumes
generators/iterators" so readers don't follow the destructive pattern.
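A small sketch of the non-consuming count described in both items above. The helper name safe_document_count is hypothetical, and unlike the `if documents and hasattr(...)` form quoted above it reports 0 for an empty sized collection rather than skipping it; treat that as a deliberate assumption of this sketch.

```python
def safe_document_count(documents):
    """Return len(documents) for sized inputs, else None without iterating."""
    if documents is not None and hasattr(documents, "__len__"):
        return len(documents)
    return None  # generators and other one-shot iterables are left untouched


# Usage: the generator is still intact after the count attempt.
batch = ({"id": str(n)} for n in range(3))
count = safe_document_count(batch)  # None, nothing consumed
remaining = list(batch)             # all three documents are still available
```

The destructive alternative, len(list(documents)), would leave the SDK call that follows with an exhausted iterator.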
🧹 Nitpick comments (5)
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/utils.py (1)
42-64: dont_throw should preserve the wrapped function's metadata with functools.wraps.
Without @functools.wraps(func), all functions decorated with @dont_throw (dozens in wrapper.py) lose their __name__, __doc__, and __module__. This complicates debugging and breaks any introspection that relies on function identity.
Proposed fix
 import logging
 import os
 import traceback
+import functools

 from opentelemetry import context as context_api
 from opentelemetry.instrumentation.azure_search.config import Config
@@ -49,6 +50,7 @@
     # Obtain a logger specific to the function's module
     logger = logging.getLogger(func.__module__)

+    @functools.wraps(func)
     def wrapper(*args, **kwargs):
         try:
             return func(*args, **kwargs)
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py (4)
17-21: _set_span_attribute guard is fine but the trailing return is unnecessary.
Nit: the bare return on line 21 has no effect since the function would return None anyway. Not blocking.
36-103: Dispatcher functions lack @dont_throw protection.
_set_request_attributes and _set_response_attributes are the only attribute-setting functions not wrapped with @dont_throw. While the individual helpers they call are protected, an unexpected exception in the dispatcher itself (e.g., if method were ever None and a future refactor introduced .startswith()) would propagate uncaught into _sync_wrap/_async_wrap and crash the instrumented call.
The risk is low today since the bodies are just string equality checks, but adding @dont_throw here would be consistent with the rest of the module and future-proof.
Proposed fix
+@dont_throw
 def _set_request_attributes(span, method, instance, args, kwargs):
     """Set all pre-call span attributes based on the method being called."""

+@dont_throw
 def _set_response_attributes(span, method, response, args, kwargs):
     """Set all post-call span attributes from the response."""
Also applies to: 105-131
210-264: top is fetched from kwargs twice.
Line 217 sets AZURE_SEARCH_SEARCH_TOP from kwargs.get("top"), then lines 228-230 fetch kwargs.get("top") again to set VECTOR_DB_QUERY_TOP_K. Consider reusing the local variable.
Proposed fix
-    _set_span_attribute(span, SpanAttributes.AZURE_SEARCH_SEARCH_TOP, kwargs.get("top"))
+    top = kwargs.get("top")
+    _set_span_attribute(span, SpanAttributes.AZURE_SEARCH_SEARCH_TOP, top)
     _set_span_attribute(span, SpanAttributes.AZURE_SEARCH_SEARCH_SKIP, kwargs.get("skip"))
     _set_span_attribute(span, SpanAttributes.AZURE_SEARCH_SEARCH_FILTER, kwargs.get("filter"))
@@ -225,9 +226,7 @@
     _set_span_attribute(span, SpanAttributes.AZURE_SEARCH_SEARCH_QUERY_TYPE, qt_str)

     # Set top_k for vector DB convention
-    top = kwargs.get("top")
     if top:
         _set_span_attribute(span, SpanAttributes.VECTOR_DB_QUERY_TOP_K, top)
713-767: max_content_items() is called on every loop iteration, re-reading the env var each time.
Each of these content-capture loops (lines 721, 740, 761, 787, 806, 818) calls max_content_items() per iteration, which reads os.getenv each time. Consider hoisting the call before the loop.
Example for one function (apply similarly to others)
 @dont_throw
 def _set_search_vector_embeddings_attributes(span, kwargs):
     """Set indexed db.search.embeddings.N.vector attributes for vector queries."""
     vector_queries = kwargs.get("vector_queries")
     if not vector_queries:
         return
+    cap = max_content_items()
     for i, vq in enumerate(vector_queries):
-        if i >= max_content_items():
+        if i >= cap:
             break
AsyncSearchItemPaged.get_count() is a coroutine function, not a sync method. Calling it synchronously returned a coroutine object instead of the count, producing "Invalid type coroutine for attribute" warnings. Split into sync _set_search_response_attributes (skips coroutine functions) and async _set_search_response_attributes_async (awaits get_count). The async variant is called from _async_wrap for search.
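The split described in this commit message can be illustrated with a sketch like the following. SyncPaged and AsyncPaged are stand-ins for the SDK's paged result types, and read_count_sync/read_count_async are hypothetical names, not the functions in wrapper.py.

```python
import asyncio
import inspect


class SyncPaged:
    """Stand-in for a sync paged result with a plain get_count()."""

    def get_count(self):
        return 3


class AsyncPaged:
    """Stand-in for an async paged result whose get_count() is a coroutine."""

    async def get_count(self):
        return 5


def read_count_sync(response):
    """Sync path: skip get_count when calling it would return a coroutine."""
    get_count = getattr(response, "get_count", None)
    if get_count is None or inspect.iscoroutinefunction(get_count):
        return None
    return get_count()


async def read_count_async(response):
    """Async path: await the result when it is awaitable."""
    get_count = getattr(response, "get_count", None)
    if get_count is None:
        return None
    result = get_count()
    if inspect.isawaitable(result):
        result = await result
    return result
```

The sync guard is what avoids handing a coroutine object to the span, which is the source of the "Invalid type coroutine for attribute" warning mentioned above.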
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In
`@packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py`:
- Line 242: The span attribute for search mode is being set with the enum object
directly; update the call that uses _set_span_attribute with
SpanAttributes.AZURE_SEARCH_SEARCH_MODE to pass the enum's underlying string
(e.g., use kwargs.get("search_mode").value when present, or None if missing)
similar to how query_type, vector_filter_mode, query_caption, and query_answer
are handled so the recorded span contains a primitive string instead of an enum
instance.
- Around line 560-578: The async helper _set_search_response_attributes_async
must keep its manual try/except because the current dont_throw decorator in
utils.py is not async-aware; update dont_throw to detect
asyncio.iscoroutinefunction and return an async wrapper that awaits the inner
call and catches exceptions (mirroring the anthropic implementation) so async
functions like _set_search_response_attributes_async can safely use `@dont_throw`
without losing await/exception handling; ensure the updated dont_throw preserves
the existing synchronous behavior, retains logging of exceptions, and reference
the decorator name dont_throw and the function
_set_search_response_attributes_async when making the change.
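One way an async-aware dont_throw could be written is sketched below. It is an assumption-laden illustration, not the decorator in utils.py or the Anthropic one: it uses inspect.iscoroutinefunction rather than the asyncio variant named above, logs at debug level, and the two decorated functions exist only to exercise it.

```python
import asyncio
import functools
import inspect
import logging


def dont_throw(func):
    """Swallow and log exceptions from sync or async functions."""
    logger = logging.getLogger(func.__module__)

    if inspect.iscoroutinefunction(func):

        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except Exception as exc:
                logger.debug("%s failed: %s", func.__name__, exc)

        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            logger.debug("%s failed: %s", func.__name__, exc)

    return sync_wrapper


@dont_throw
async def failing_async():
    raise RuntimeError("boom")


@dont_throw
def failing_sync():
    """Docstring survives wrapping."""
    raise RuntimeError("boom")
```

Without the async branch, the sync wrapper would return the coroutine unawaited and the try/except would never see the exception raised inside it.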
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-azure-search/opentelemetry/instrumentation/azure_search/wrapper.py (2)
36-38: Consider adding @dont_throw to the dispatch functions for defensive consistency.
_set_request_attributes and _set_response_attributes are the only undecorated functions in the call chain between the user's wrapped call and the span attribute helpers. While their bodies are safe if/elif chains over string comparisons (calling @dont_throw-decorated helpers), decorating them would guard against any future refactoring that introduces riskier logic. This is purely defensive.
Also applies to: 105-108
751-753: max_content_items() is re-evaluated on every loop iteration.
Each call reads and parses an environment variable. Consider caching the result once before each loop:
Example fix (apply to all content loops)
 def _set_search_vector_embeddings_attributes(span, kwargs):
     vector_queries = kwargs.get("vector_queries")
     if not vector_queries:
         return
+    limit = max_content_items()
     for i, vq in enumerate(vector_queries):
-        if i >= max_content_items():
+        if i >= limit:
             break
Also applies to: 770-772, 791-793, 817-819, 836-838, 848-850
…ES_GUIDE Replace all doc examples that consume generators with list() to use the safe hasattr(__len__) pattern instead, matching the actual production code behavior.
… enum - Update dont_throw decorator to detect async functions and return an async wrapper that awaits the inner call (mirrors Anthropic pattern) - Apply @dont_throw to _set_search_response_attributes_async, removing the manual try/except - Convert search_mode enum to string before setting span attribute, consistent with query_type, vector_filter_mode, etc.
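The enum-to-string conversion mentioned here can be sketched generically. SearchMode below is a local stand-in enum, not the SDK class, and attribute_value is a hypothetical helper name.

```python
from enum import Enum


class SearchMode(str, Enum):
    """Local stand-in for an SDK enum; values are illustrative."""

    ANY = "any"
    ALL = "all"


def attribute_value(value):
    """Return a primitive suitable for a span attribute."""
    if isinstance(value, Enum):
        return value.value  # record the underlying string, not the enum object
    return value


# Usage: kwargs may carry an enum member, a plain string, or nothing.
kwargs = {"search_mode": SearchMode.ALL}
mode = attribute_value(kwargs.get("search_mode"))
```

Handling the None and plain-string cases in the same helper keeps the call site identical for every enum-typed parameter.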
Avoid calling max_content_items() on every loop iteration by hoisting the result into a local variable before the loop.
…e consistency Decorate _set_request_attributes and _set_response_attributes with @dont_throw so the entire attribute-setting call chain is guarded against unexpected exceptions.
- Replace method in [...] list checks with module-level frozensets for O(1) dispatch - Hoist should_send_content() into _sync_wrap/_async_wrap (1 call per span, not 2) - Cache max_content_items() and max_content_length() once per span, pass through chain - Add TRACELOOP_TRACE_CONTENT_MAX_LENGTH env var (default 16KB) to cap serialized content - Merge batch response counting + content iteration into single-pass loop
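Two of the optimizations listed in this commit message can be sketched as follows. The method names are taken from the PR description; the set name BATCH_METHODS, the helper names, and the choice to measure the 16KB cap in characters rather than bytes are assumptions of this sketch.

```python
import json

# Module-level frozenset: membership tests are O(1) and the set is built once.
BATCH_METHODS = frozenset(
    {
        "upload_documents",
        "merge_documents",
        "delete_documents",
        "merge_or_upload_documents",
    }
)

DEFAULT_MAX_LENGTH = 16 * 1024  # 16KB default stated in the commit message


def is_batch_method(method):
    """Dispatch check that replaces a per-call list scan."""
    return method in BATCH_METHODS


def serialize_capped(value, max_length=DEFAULT_MAX_LENGTH):
    """Serialize to JSON and cut the string at max_length characters."""
    text = json.dumps(value, default=str)
    return text if len(text) <= max_length else text[:max_length]
```

In the flow the commit describes, the length limit would be read once per span and passed down the call chain, so serialize_capped takes it as a parameter instead of reading the environment itself.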
Details
Add OpenTelemetry instrumentation for Azure AI Search (azure-search-documents SDK). This instrumentation provides comprehensive observability for Azure AI Search operations including document search, indexing, and management operations.
Features
SearchClient instrumentation (10 methods): search, get_document, get_document_count, upload_documents, merge_documents, delete_documents, merge_or_upload_documents, index_documents, autocomplete, suggest
SearchIndexClient instrumentation (15 methods): Index CRUD, listing, statistics, text analysis, synonym map CRUD, and name-only listing
SearchIndexerClient instrumentation (21 methods): Indexer, data source connection, and skillset management including name-only listing methods
SearchIndexingBufferedSender instrumentation (6 methods): Buffered document operations and flush
Async support: All 52 sync methods are also instrumented in their async variants (104 total instrumented methods)
Vector/Hybrid/Semantic search support: Captures query_type, top_k, filters, vector query kind, vector weight, and oversampling parameters
Response attribute capturing: Search results count, document batch success/failure counts, autocomplete/suggest results count, service statistics, scoring profiles
Content capture with toggle: Request/response content captured as indexed span attributes (documents, suggestions, vector embeddings), gated by the TRACELOOP_TRACE_CONTENT env var (default: enabled). Uses indexed attributes (not span.add_event()) so content is visible in APM backends like Elastic APM.
OpenTelemetry semantic conventions compliance: 44 custom span attributes following GenAI specification patterns
Error handling: All attribute extraction wrapped with the @dont_throw decorator for safety
Semantic Conventions Added (44 attributes)
Testing
258 tests total (228 unit tests + 30 integration tests)
30 VCR cassettes for API call recording/replay
Unit tests cover all attribute extraction, dispatch pipelines, content capture toggle, error handling, and instrumentor lifecycle
14 multi-step workflow tests (7 sync + 7 async) validate traces tell a debuggable story for production troubleshooting: search pipeline, document lifecycle CRUD, typeahead pipeline, bulk ingestion partial failure, index management pipeline, content privacy toggle, and error-then-retry
Integration tests cover SearchClient, SearchIndexClient, SearchIndexerClient, and SynonymMap operations with real Azure API responses
Screenshots (Elastic APM / Kibana)
Tested with the sample app (azure_search_app.py) against a live Azure AI Search instance.
Trace Waterfall — Index, Upload, and Merge Operations
azure_hotel_search_demo.workflow (8.6s) showing create_index, get_service_statistics, upload_documents, merge_documents, and batch_index_documents spans with color-coded types (service, internal, Azure AI Search, HTTP):
Upload Documents — Content Capture as Indexed Span Attributes
azure_search.upload_documents span detail showing input documents captured as db.query.result.document.{i} and indexing results as db.query.result.metadata.{i}, plus document_count, succeeded_count, and failed_count:
Trace Waterfall — Search, Vector, Hybrid, and Autocomplete Operations
Continuation showing text_search, vector_search, hybrid_search, search_with_scoring_profile, and autocomplete spans with nested azure_search.search and HTTP POST calls:
Autocomplete — Span Attributes and Content Capture
azure_search.autocomplete span with index_name, search_text=lux, suggester_name=sg, content captured as db.search.result.entity.0={"text": "luxury", "query_plus_text": "luxury"}, and autocomplete_results_count=1:
Fixes #2303
feat(instrumentation): ... or fix(instrumentation): ....
Summary by CodeRabbit