DRAFT: Align telemetry middleware with MCP OTEL semantic conventions#3683
Closed
ChrisJBurns wants to merge 5 commits into main from
Add MetaCarrier (TextMapCarrier for MCP _meta fields) and InjectMetaTraceContext for injecting traceparent/tracestate into outgoing MCP requests, per the MCP OTEL specification. This enables distributed tracing across vMCP → backend boundaries using the standard W3C Trace Context format propagated through MCP params._meta.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
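To illustrate the carrier idea in this commit, here is a minimal sketch using only the standard library. The local `textMapCarrier` interface mirrors the method set of OpenTelemetry's `propagation.TextMapCarrier`; the `metaCarrier` type and `injectTraceContext` helper are hypothetical stand-ins, not the PR's actual `MetaCarrier` or `InjectMetaTraceContext` code.

```go
package main

import "sort"

// textMapCarrier mirrors the method set of OpenTelemetry's
// propagation.TextMapCarrier so the sketch compiles without the SDK.
type textMapCarrier interface {
	Get(key string) string
	Set(key, value string)
	Keys() []string
}

// metaCarrier (hypothetical) adapts a decoded MCP params._meta object.
type metaCarrier map[string]any

// Get returns the string stored under key, or "" if absent or not a string.
func (c metaCarrier) Get(key string) string {
	if s, ok := c[key].(string); ok {
		return s
	}
	return ""
}

// Set stores value under key, overwriting any previous entry.
func (c metaCarrier) Set(key, value string) { c[key] = value }

// Keys lists all keys in sorted order for deterministic output.
func (c metaCarrier) Keys() []string {
	keys := make([]string, 0, len(c))
	for k := range c {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// Compile-time check that metaCarrier satisfies the carrier interface.
var _ textMapCarrier = metaCarrier{}

// injectTraceContext (hypothetical) writes W3C Trace Context fields into
// params["_meta"], creating the _meta object when it is missing.
func injectTraceContext(params map[string]any, traceparent, tracestate string) {
	meta, ok := params["_meta"].(map[string]any)
	if !ok {
		meta = map[string]any{}
		params["_meta"] = meta
	}
	carrier := metaCarrier(meta)
	carrier.Set("traceparent", traceparent)
	if tracestate != "" {
		carrier.Set("tracestate", tracestate)
	}
}
```

In a real integration the carrier would presumably be handed to a W3C `TraceContext` propagator rather than populated by hand; the manual `Set` calls here only show where the fields end up.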
Rename attributes to match the MCP OpenTelemetry specification:
- mcp.method → mcp.method.name
- mcp.request.id → jsonrpc.request.id
- mcp.tool.name → gen_ai.tool.name
- mcp.tool.arguments → gen_ai.tool.call.arguments
- mcp.prompt.name → gen_ai.prompt.name
- mcp.transport → network.transport (with standard value mapping)
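For dashboards or alert rules that still use the old keys, the renames listed above can be captured in a lookup table. This is a minimal sketch; `renamedAttributes` and `migrateKey` are hypothetical names for illustration and are not part of the PR.

```go
package main

// renamedAttributes maps each old attribute key to its replacement,
// exactly as listed in the commit message.
var renamedAttributes = map[string]string{
	"mcp.method":         "mcp.method.name",
	"mcp.request.id":     "jsonrpc.request.id",
	"mcp.tool.name":      "gen_ai.tool.name",
	"mcp.tool.arguments": "gen_ai.tool.call.arguments",
	"mcp.prompt.name":    "gen_ai.prompt.name",
	"mcp.transport":      "network.transport",
}

// migrateKey returns the new key for a renamed attribute and leaves
// any other key unchanged.
func migrateKey(key string) string {
	if renamed, ok := renamedAttributes[key]; ok {
		return renamed
	}
	return key
}
```

Note that `mcp.transport` also changes its values, not just its key, so a key-only rewrite is not sufficient for that attribute.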
Add standard metrics: mcp.server.operation.duration and
mcp.server.session.duration histograms with spec-defined bucket
boundaries. Add session tracking with TTL-based cleanup.
Update span naming to "{method} {target}" format, add error.type
attribute, client.address/port, and gen_ai.operation.name.
Addresses #3399.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
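The "{method} {target}" span naming described in this commit can be sketched as follows. The `spanName` helper is hypothetical, and falling back to the bare method when there is no target is an assumption rather than confirmed PR behavior.

```go
package main

// spanName (hypothetical) builds a span name in "{method} {target}" form.
// Assumption: methods without a target (no tool or prompt name) use the
// method name alone.
func spanName(method, target string) string {
	if target == "" {
		return method
	}
	return method + " " + target
}
```

For example, `spanName("tools/call", "get_weather")` yields `tools/call get_weather`.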
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3683 +/- ##
==========================================
+ Coverage 66.26% 66.30% +0.03%
==========================================
Files 427 427
Lines 41765 41923 +158
==========================================
+ Hits 27676 27795 +119
- Misses 11977 12010 +33
- Partials 2112 2118 +6
Signed-off-by: Chris Burns <29541485+ChrisJBurns@users.noreply.github.com>
- Source session ID from Mcp-Session-Id HTTP header instead of _meta
- Derive protocol version from actual HTTP request, not transport type
- Update HTTP attributes to stable OTEL semantic conventions
- Emit mcp.server.session.duration on HTTP DELETE session termination
- Unexport MCPOperationDurationBuckets (no external consumers)
- Propagate metric creation errors with no-op fallback
- Initialize tool call counter once at startup instead of per-request
- Add telemetry migration guide for renamed/new/removed attributes
- Remove old-to-new attribute name comments from code
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
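Two of the changes above, reading the session ID from the header and deriving the protocol version from the request itself, can be sketched with `net/http` alone. The function bodies are assumptions; only the name `httpProtocolVersion` and the `Mcp-Session-Id` header come from the PR, and `protocolVersion` and `sessionID` are hypothetical helpers.

```go
package main

import (
	"net/http"
	"strconv"
)

// protocolVersion (hypothetical) formats a major/minor pair as "1.0",
// "1.1", "2", or "3", the values commonly used for network.protocol.version.
func protocolVersion(major, minor int) string {
	if major >= 2 && minor == 0 {
		return strconv.Itoa(major)
	}
	return strconv.Itoa(major) + "." + strconv.Itoa(minor)
}

// httpProtocolVersion reads the version from the request itself instead
// of inferring it from the configured transport type.
func httpProtocolVersion(r *http.Request) string {
	return protocolVersion(r.ProtoMajor, r.ProtoMinor)
}

// sessionID (hypothetical) returns the Mcp-Session-Id header value,
// or "" when the header is absent.
func sessionID(r *http.Request) string {
	return r.Header.Get("Mcp-Session-Id")
}
```

Reading both values from the live request keeps the span attributes accurate when, for example, a proxy negotiates HTTP/2 in front of a transport that would otherwise be assumed to speak HTTP/1.1.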
Per OTEL HTTP semantic conventions for server spans, 4xx client errors should leave span status unset rather than setting it to Error. Only 5xx server errors should set codes.Error and the error.type attribute.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
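The rule in this commit reduces to a single threshold check. The sketch below uses a local `spanStatus` type as a stand-in for the OTel `codes` package so it compiles on its own; `serverSpanStatus` is a hypothetical helper, and using the status code string as the `error.type` value is an assumption based on the OTel HTTP conventions.

```go
package main

import "strconv"

// spanStatus is a local stand-in for the OTel span status codes.
type spanStatus int

const (
	statusUnset spanStatus = iota // corresponds to codes.Unset
	statusError                   // corresponds to codes.Error
)

// serverSpanStatus (hypothetical) returns the span status for a server
// span and the error.type value, which is "" when none should be set.
func serverSpanStatus(httpStatus int) (spanStatus, string) {
	if httpStatus >= 500 {
		// Server errors mark the span as failed and record error.type.
		return statusError, strconv.Itoa(httpStatus)
	}
	// 1xx to 4xx responses, including client errors, leave status unset.
	return statusUnset, ""
}
```

With this rule a 404 from a misbehaving client no longer inflates the server's error rate, while a 503 still does.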
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Summary
- Rename attributes to match the MCP OpenTelemetry specification (mcp.method.name, jsonrpc.request.id, gen_ai.tool.name, gen_ai.tool.call.arguments, gen_ai.prompt.name, network.transport)
- Update HTTP attributes to stable OTEL semantic conventions (http.request.method, url.full, http.response.status_code, etc.)
- Add mcp.server.operation.duration and mcp.server.session.duration histograms with spec-defined bucket boundaries
- Emit mcp.server.session.duration on HTTP DELETE session termination (per MCP streamable-http spec)
- Source session ID from Mcp-Session-Id HTTP header instead of JSON-RPC _meta
- Derive network.protocol.version from actual HTTP request instead of hardcoding per transport type
- Update span naming to {method} {target} format (e.g., tools/call get_weather)
- Add error.type, client.address/client.port, gen_ai.operation.name attributes
- network.transport/network.protocol.name
- Unexport mcpOperationDurationBuckets (no external consumers)
- Remove http.duration_ms attribute (span timestamps capture duration)
- Add migration guide (docs/telemetry-migration.md) with attribute rename tables, PromQL examples, and migration checklist
- mcp.resource.uri attribute for resources/read
- Replace contains() with slices.Contains() in tests
- Remove t.Parallel() on test that mutates env vars (//nolint:paralleltest,tparallel)

Addresses #3399.
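The client.address/client.port attributes in the summary are typically derived from the request's remote address. A minimal sketch follows; the name `parseRemoteAddr` appears in the test plan, but the signature and fallback behavior here are assumptions.

```go
package main

import (
	"net"
	"strconv"
)

// parseRemoteAddr splits a "host:port" remote address into the values
// for client.address and client.port. Assumption: when the port is
// missing or not numeric, the port is reported as 0.
func parseRemoteAddr(remoteAddr string) (string, int) {
	host, portStr, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		// No port present: treat the whole input as the address.
		return remoteAddr, 0
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return host, 0
	}
	return host, port
}
```

`net.SplitHostPort` also handles bracketed IPv6 literals, so `[2001:db8::1]:8080` yields the address `2001:db8::1` and port `8080`.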
PR Stack (2/3): #3682 (propagation) → This PR → #3684 (client spans)
Test plan
- mcp.server.operation.duration metric with correct attributes
- RecordSessionEnd
- mapTransport, parseRemoteAddr, httpProtocolVersion
- go build ./... passes
- go test ./pkg/telemetry/... passes
- task lint-fix passes with 0 issues
Screenshots
Traces in Tempo/Grafana
Metrics in Prometheus/Grafana
🤖 Generated with Claude Code