
fix(langchain): use message.text as content#3702

Open
ianchi wants to merge 1 commit into traceloop:main from ianchi:langchain

Conversation

@ianchi
Contributor

@ianchi ianchi commented Feb 21, 2026

Messages have a text property that handles complex cases of text data when the content block is an array.
Use this property when available to correctly handle those cases; it is especially needed when the array contains mixed types.

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Important

Use message.text as content in set_chat_response in span_utils.py to handle complex text data cases.

  • Behavior:
    • In set_chat_response in span_utils.py, use message.text as content when available, handling complex text data cases.
    • Fallback to message.content if message.text is unavailable, ensuring proper handling of mixed type arrays.
  • Misc:
    • No new tests or documentation updates included in this PR.

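The extraction order described above (message.text first, then message.content, then generation.text) can be sketched in plain Python. The classes below are hypothetical stand-ins, not LangChain's real ChatGeneration/AIMessage objects, and plain json.dumps stands in for the instrumentation's CallbackFilteredJSONEncoder:

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeMessage:
    content: Any       # str or a list of content blocks
    text: str = ""     # flattened text, as AIMessage.text provides


@dataclass
class FakeGeneration:
    text: str = ""
    message: Optional[FakeMessage] = None


def extract_content(generation) -> Optional[str]:
    if getattr(generation, "message", None):
        msg = generation.message
        if getattr(msg, "text", None):
            return msg.text  # preferred: handles mixed-type content arrays
        if msg.content:
            if isinstance(msg.content, str):
                return msg.content
            return json.dumps(msg.content)  # serialize list/dict content
    if getattr(generation, "text", None):
        return generation.text  # fallback for plain (non-chat) generations
    return None


# Mixed-type content array: generation.text may be empty here, but the
# message-level text still yields the concatenated text blocks.
gen = FakeGeneration(
    text="",
    message=FakeMessage(
        content=[{"type": "text", "text": "hello"}, {"type": "tool_use", "id": "t1"}],
        text="hello",
    ),
)
print(extract_content(gen))  # → hello
```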
This description was created by Ellipsis for 1e581ca.

Summary by CodeRabbit

  • Bug Fixes
    • Improved chat response extraction to prefer explicit message text when available, with reliable fallbacks for other message content formats (including string or JSON-encoded content). This yields more consistent and accurate handling of chat responses across varied sources and message shapes.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to 1e581ca in 7 seconds.
  • Reviewed 24 lines of code in 1 file
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments.

Workflow ID: wflow_ubkzvS8Uypp94adQ


@coderabbitai

coderabbitai bot commented Feb 21, 2026

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough

Modified content extraction in set_chat_response to prefer generation.message.text when present, otherwise fall back to generation.message.content (handling string and JSON-encoded variants). If generation.message is absent, generation.text remains as the fallback. No signature changes.

Changes

Cohort: Chat Response Content Extraction
File(s): packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
Summary: Changed extraction order: check generation.message.text first, then generation.message.content with type checking/JSON serialization, then generation.text as fallback. No API/signature changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through messages, soft and light,
I peek at text to get it right,
If text is missing, content I read,
Maybe parse JSON—just what you need,
Spanning traces with bunny delight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title 'fix(langchain): use message.text as content' clearly summarizes the main change, prioritizing message.text for content extraction in the langchain instrumentation.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py (1)

229-238: generation.text priority may shadow message.text for some providers.

The fix is correct for the common case: when message.content is a complex array, ChatGeneration.text is set to "" by LangChain's validator, so line 229 is falsy and execution reaches the message.text branch. However, some providers may explicitly populate generation.text with a partial/truncated string representation even when message.content is a list, which would silently bypass the new message.text path entirely.

Consider inverting the priority for chat-type generations so message.text is preferred when available:

♻️ Proposed refinement
-            if hasattr(generation, "text") and generation.text:
-                content = generation.text
-            elif hasattr(generation, "message") and generation.message:
-                if hasattr(generation.message, "text") and generation.message.text:
-                    content = generation.message.text
-                elif generation.message.content:
-                    if isinstance(generation.message.content, str):
-                        content = generation.message.content
-                    else:
-                        content = json.dumps(generation.message.content, cls=CallbackFilteredJSONEncoder)
+            if hasattr(generation, "message") and generation.message:
+                if hasattr(generation.message, "text") and generation.message.text:
+                    content = generation.message.text
+                elif generation.message.content:
+                    if isinstance(generation.message.content, str):
+                        content = generation.message.content
+                    else:
+                        content = json.dumps(generation.message.content, cls=CallbackFilteredJSONEncoder)
+            elif hasattr(generation, "text") and generation.text:
+                # Non-chat completions (plain Generation objects)
+                content = generation.text
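A quick self-contained check of the inverted priority. The classes below are stand-ins for illustration, not the instrumentation's real objects: even when a provider populates generation.text with a flattened or truncated string, the message-level text now wins for chat-type generations.

```python
# Hypothetical stand-ins for LangChain's ChatGeneration/AIMessage.
class Msg:
    def __init__(self, text="", content=None):
        self.text = text
        self.content = content


class Gen:
    def __init__(self, text="", message=None):
        self.text = text
        self.message = message


def extract(g):
    # Message-first order: .message is checked before the flattened .text.
    if getattr(g, "message", None) and getattr(g.message, "text", None):
        return g.message.text
    if getattr(g, "text", None):
        return g.text  # plain Generation objects keep their fallback
    return None


# Provider fills generation.text with a partial flattening, but the
# message carries the full text.
g = Gen(text="partial...", message=Msg(text="full answer"))
print(extract(g))  # → full answer
```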
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 229 - 238, Change the extraction order so chat-style generations
prefer message content over the flattened text: when a generation has a .message
and that message has usable text/content, use generation.message.text or
generation.message.content (serializing non-string content with
CallbackFilteredJSONEncoder) before falling back to generation.text; update the
logic around the generation variable in span_utils.py (the branches referencing
generation.text, generation.message, generation.message.text, and
generation.message.content) so .message is checked first for chat-type
generations.

@ianchi
Contributor Author

ianchi commented Feb 21, 2026

Inverted priority of generation.text/message.text as suggested.
