fix(tool): stop leaking exception details in chat_completion error response #1876

ColinM-sys wants to merge 6 commits into NVIDIA:develop from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: The handler error path now logs full exceptions server-side and returns a sanitized user-facing message without exception details; extraction failures for the fallback message are also logged. New tests validate the sanitized responses.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/nvidia_nat_core/src/nat/tool/chat_completion.py (1)
120-139: Add a regression test for the sanitized error path. This security fix is easy to regress, and there is no coverage asserting that exception text stays out of the caller-visible response. Please add a test that forces the error path and verifies the returned message omits internal exception details.
As per coding guidelines, "Maintain >= 80% test coverage; add or update tests when introducing changes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py` around lines 120 - 139, Add a regression unit test that exercises the exception branch in the chat completion handler to ensure internal exception details are never returned; specifically, simulate/force the code path that triggers the except Exception as e block (for example by mocking the component that performs the chat completion to raise), call the public function that wraps this logic (the chat completion entrypoint in chat_completion.py that logs via logger.exception and uses GlobalTypeConverter.get().convert to extract last message), and assert the returned string contains the sanitized last user message (if any) but does NOT contain the exception message, exception type names, traceback fragments, file paths, or any other internal details; name the test clearly (e.g., test_chat_completion_sanitizes_error_response) and use mocks to control GlobalTypeConverter.get().convert behavior to exercise both with and without an available last message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/nvidia_nat_core/src/nat/tool/chat_completion.py`:
- Around line 134-135: The except block that currently calls
logger.debug("failed to extract last user message for error response",
exc_info=True) should be changed to use logger.exception(...) so the full stack
trace is logged at error level; locate the except in chat_completion.py that
guards the "extract last user message for error response" logic (the try/except
around extracting the last user message) and replace the logger.debug call with
logger.exception("failed to extract last user message for error response") to
follow the repo's exception-logging guideline.
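The difference the comment is asking for can be sketched as follows. This is an illustrative stand-in, not the actual NAT code: `logger.debug(..., exc_info=True)` records the failure only when DEBUG logging is enabled, while `logger.exception(...)` records it at ERROR level with the traceback attached.

```python
import logging

logger = logging.getLogger("chat_completion_demo")
logger.setLevel(logging.DEBUG)

# Capture emitted records in memory so we can inspect the levels.
records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(record)

logger.addHandler(ListHandler())

def extract_before(messages):
    try:
        return messages[-1]["content"]
    except Exception:
        # Before: visible only when DEBUG logging is enabled
        logger.debug("failed to extract last user message for error response", exc_info=True)

def extract_after(messages):
    try:
        return messages[-1]["content"]
    except Exception:
        # After: logged at ERROR with the traceback attached
        logger.exception("failed to extract last user message for error response")

extract_before([])  # IndexError swallowed, logged at DEBUG
extract_after([])   # IndexError swallowed, logged at ERROR
print([r.levelname for r in records])  # ['DEBUG', 'ERROR']
```

Both calls attach the traceback; the practical difference is the level at which the record is emitted, which is what determines whether operators see it under a typical INFO/WARNING configuration.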
Per CodeRabbit feedback on NVIDIA#1876:

1. Promote the inner except-block logger call from `logger.debug(..., exc_info=True)` to `logger.exception(...)` for the extract-last-message failure path. Matches the repo's exception-logging convention and ensures the stack trace is recorded at ERROR level alongside the primary "chat completion failed" log.

2. Add regression tests that force the exception branch with a mocked failing LLM and assert that the caller-visible response NEVER contains:
   - the exception message (matched via a unique sentinel string)
   - the exception class name (RuntimeError / ValueError)

   while still:
   - echoing the user's last message in the apology
   - triggering `logger.exception("chat completion failed")` so operators keep full traceback visibility server-side

The tests drive `register_chat_completion` as an async generator, pull the registered callable out of the yielded FunctionInfo, and invoke it directly — no live LLM required. Three parametrized cases cover the happy path, the user-query-echo path, and the server-side logger contract.

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
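The test pattern described in the commit message (drive the registration as an async generator, pull the callable out of the yielded object, invoke it, and close the generator) can be sketched like this. `FunctionInfo` and `register_chat_completion` here are simplified stand-ins, not the real NAT API:

```python
import asyncio
from unittest.mock import AsyncMock

class FunctionInfo:
    """Minimal stand-in for the object yielded by the real registration."""
    def __init__(self, single_fn):
        self.single_fn = single_fn

async def register_chat_completion(llm):
    async def _respond(query: str) -> str:
        try:
            return await llm.ainvoke(query)
        except Exception:
            # Sanitized: echo the query, never the exception text
            return f"Sorry, I could not answer: {query}"
    yield FunctionInfo(_respond)

async def run_error_path():
    failing_llm = AsyncMock()
    failing_llm.ainvoke.side_effect = RuntimeError("SENTINEL-xyz")
    gen = register_chat_completion(failing_llm)  # async generator, not awaited
    fn_info = await gen.__anext__()              # pull the yielded FunctionInfo
    try:
        return await fn_info.single_fn("what is NAT?")
    finally:
        await gen.aclose()                       # always close the generator

result = asyncio.run(run_error_path())
assert "SENTINEL-xyz" not in result   # exception message never leaks
assert "what is NAT?" in result       # user query still echoed
```

The sentinel string makes the leak check unambiguous: if any part of the exception text reached the response, the assertion would catch it regardless of how the apology is worded.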
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/nvidia_nat_core/tests/nat/tools/test_chat_completion_error_response.py (1)
63-116: Consider extracting repeated setup/teardown into a named pytest fixture. The `failing_llm` + `_get_registered_callable(...)` + `await gen.aclose()` pattern is repeated across all three tests; a fixture would reduce duplication and drift risk. As per coding guidelines: "Any frequently repeated code should be extracted into pytest fixtures. Pytest fixtures should define the name argument when applying the pytest.fixture decorator."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_core/tests/nat/tools/test_chat_completion_error_response.py` around lines 63 - 116, Extract the repeated setup/teardown into a pytest fixture that yields (failing_llm, fn, gen): move the failing_llm = AsyncMock(), failing_llm.ainvoke.side_effect assignment, and the await _get_registered_callable(...) call into a fixture decorated with `@pytest.fixture`(name="registered_failing_llm") (use the name= argument per guidelines), yield the tuple, and ensure teardown calls await gen.aclose() after yield; then update the three tests (test_error_response_drops_exception_message, test_error_response_echoes_user_query_but_not_exception, test_server_side_logger_still_captures_full_exception) to accept the fixture and remove their duplicated setup/teardown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_core/tests/nat/tools/test_chat_completion_error_response.py`:
- Around line 49-57: The helper _get_registered_callable (which calls
register_chat_completion to get gen and fn_info) currently raises RuntimeError
on failure without closing the async generator gen; update the function so that
if no callable is found on fn_info (loop over "single_fn","fn","func","_fn"),
you first close the generator by awaiting gen.aclose() (or call aclose() in a
finally/except path) and then raise the RuntimeError; ensure this uses the
existing local names gen and fn_info and does not swallow the original error
flow.
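The cleanup the inline comment asks for can be sketched as below. The attribute names mirror the comment; everything else is a simplified stand-in for the NAT test helper. The key point is that `gen.aclose()` runs before the `RuntimeError` is raised, so no open async generator leaks into teardown:

```python
import asyncio

async def _get_registered_callable(register, llm):
    gen = register(llm)
    fn_info = await gen.__anext__()
    for attr in ("single_fn", "fn", "func", "_fn"):
        fn = getattr(fn_info, attr, None)
        if fn is not None:
            return fn, gen
    await gen.aclose()  # close first so no open async generator leaks
    raise RuntimeError("no callable found on yielded FunctionInfo")

# Demo: a registration yielding an object with none of the expected attrs.
closed = {"flag": False}

async def register_empty(_llm):
    try:
        yield object()
    finally:
        closed["flag"] = True  # runs when the generator is closed

async def demo():
    try:
        await _get_registered_callable(register_empty, None)
    except RuntimeError:
        pass
    return closed["flag"]

print(asyncio.run(demo()))  # True: the generator was closed before the raise
```

Closing an async generator runs its `finally` blocks (via a `GeneratorExit` thrown at the suspended `yield`), which is exactly the cleanup that would otherwise be deferred to the event loop's shutdown warnings.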
…ilure

Per CodeRabbit on NVIDIA#1876:

1. The repeated `AsyncMock` setup + `_get_registered_callable(...)` + `await gen.aclose()` pattern across all three tests is now a pytest fixture pair:
   - `failing_llm_runtime_error` — wires a RuntimeError(_SENTINEL)
   - `failing_llm_value_error` — wires a ValueError("boom {_SENTINEL}")

   Each fixture yields `(fn, llm)` and handles `gen.aclose()` on teardown. Fixtures use the `name=` argument per the repo's coding guideline.

2. `_get_registered_callable` now closes the async generator before raising RuntimeError when no callable is found on the yielded FunctionInfo. Previously the helper raised without closing, leaking an open async generator on teardown.

Also dropped the unused `LLMFrameworkEnum` import that snuck in with the earlier commit.

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
/ok to test 75fd9a0
/ok to test cb4b3cb
force-pushed 406351a to 50cbab7
force-pushed 50cbab7 to a133e0b
/ok to test 03a7212
@dagardner-nv, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/
/ok to test a133e0b
force-pushed a133e0b to 274d880
…sponse
The `_chat_completion` handler in `chat_completion.py` previously
concatenated `str(e)` from the caught exception into the user-facing
response string ("...Error: {str(e)}"). Since the response is returned
over the network to API callers, that leaked internal details such as:
- Stack frame class names and messages
- Database driver errors revealing table/column names
- HTTP client errors revealing endpoint URLs / API key prefixes
- File paths from disk-backed resources
Move the exception detail to server-side logging
(`logger.exception("chat completion failed")`) and return only the
user-safe apology + query echo. Operators can still triage via logs.
Also replaced the bare `except Exception: pass` on the last-message
extraction with `logger.debug(..., exc_info=True)` so silent failures
there no longer vanish from observability.
CWE-209 (Information Exposure Through an Error Message).
Note: this module has no existing unit test scaffold, and the
error-path requires a mocked-failing LLM in the NAT builder chain
to exercise end-to-end. The change is a defensive string / logging
rewrite with no behavior on the happy path, so adding a harness just
for this PR would significantly expand scope. Happy to follow up with
a ToolTestRunner-based error-path test in a separate PR if reviewers
prefer.
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
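The sanitized error path the commit message describes can be sketched as follows. This is a simplified stand-in with an illustrative signature, not the actual handler in `nat/tool/chat_completion.py`:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def chat_completion(llm, messages) -> str:
    try:
        return await llm.ainvoke(messages)
    except Exception:
        # Full traceback stays server-side only; nothing from the exception
        # object reaches the caller-visible string below.
        logger.exception("chat completion failed")
        last_message = ""
        try:
            last_message = messages[-1]["content"]
        except Exception:
            logger.exception("failed to extract last user message for error response")
        # No str(e): callers never see exception text, class names, or paths.
        return f"I'm sorry, I couldn't process the request: {last_message}"

class FailingLLM:
    async def ainvoke(self, messages):
        # Simulates the kind of internal detail that used to leak
        raise RuntimeError("db error: table users_pii at /srv/app/models.py")

reply = asyncio.run(chat_completion(FailingLLM(), [{"role": "user", "content": "hi"}]))
assert "RuntimeError" not in reply and "users_pii" not in reply
assert "hi" in reply
```

The inner try/except mirrors the commit: even if extracting the last user message fails, the caller still gets a generic apology while the extraction failure itself is logged with a traceback.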
- Remove unneeded explicit `del e`; let the variable fall out of scope naturally at the end of the except block (per dagardner-nv review).
- Update test file copyright from 2025-2026 to 2026.

Signed-off-by: Colin McDonough <cmcdonough@50words.com>
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
- `except Exception as e:` → `except Exception:` (unused binding)
- Remove extra blank line before _SENTINEL in test file

Signed-off-by: Colin McDonough <cmcdonough@50words.com>
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
force-pushed 274d880 to 3bb7627
/ok to test be5aaf4
@ColinM-sys tests are failing, please run tests locally, refer to
Summary

The `_chat_completion` error handler in `packages/nvidia_nat_core/src/nat/tool/chat_completion.py` concatenated `str(e)` from the caught exception into the user-facing response string (`"...Error: {str(e)}"`). Since the response is returned over the network to API callers, that leaked internal details — stack frame class names, DB driver messages revealing schema, HTTP client errors revealing endpoint URLs / key prefixes, and file paths.

Move the exception detail to server-side logging (`logger.exception`) and return only the user-safe apology + query echo. Operators triage via logs; callers no longer see internal state.

What changed

In `packages/nvidia_nat_core/src/nat/tool/chat_completion.py`:
- Added `logger = logging.getLogger(__name__)`
- Replaced the `Error: {str(e)}` suffix with server-side `logger.exception("chat completion failed")`
- Replaced the bare `except Exception: pass` on the last-message extraction with `logger.exception(...)` for observability

CWE

CWE-209 — Information Exposure Through an Error Message

Why this matters

Caller-visible error responses in an LLM / agent pipeline often get echoed into chat UIs, logged downstream, or stored in conversation history. Leaking database schemas, endpoint URLs, or file paths in those surfaces is a small but real hardening gap. Happy-path output is unchanged.

Test plan

`logger = logging.getLogger(__name__)`

Signed-off-by: Colin McDonough cmcdonough@50words.com