fix(observability): demote responses usage-limit caps #4239

Open
YOMXXX wants to merge 4 commits into tinyhumansai:main from YOMXXX:fix/GH-4180-responses-usage-limit

Conversation

@YOMXXX YOMXXX commented Jun 28, 2026

Contributor

Summary

  • Extends the shared provider quota matcher to cover OpenAI/ChatGPT OAuth Responses usage_limit_reached bodies.
  • Adds a responses_api emit-site demotion so 400 usage-cap responses still return errors without reporting to Sentry.
  • Adds regression coverage for matcher, before_send filter, and the actual Responses API Sentry leak path.
  • Carries the current docs link-check fixes from #4237 (fix(observability): demote MCP auth-required 401) so this PR can pass CI while that PR is still pending.

Problem

  • The existing quota-exhaustion classifier covers C9A-style monthly quota wording but misses OpenAI Responses bodies with usage_limit_reached and “The usage limit has been reached”.
  • The Responses API path therefore falls through to should_report_provider_http_failure(400) and reports every background retry to Sentry.
  • This creates the active Sentry flood tracked in #4180 (Quota demote coverage gap: usage_limit_reached on OpenAI Responses (TAURI-RUST-AFE)), even though the underlying condition is user/account plan state, not an actionable OpenHuman error.
  • The repo link-check gate currently also flags stale localized README links on base; those fixes are included here as CI hygiene until #4237 (fix(observability): demote MCP auth-required 401) merges.

Solution

  • Adds usage_limit_reached and usage limit has been reached to body_indicates_quota_exhausted, preserving the existing shared single source of truth.
  • Adds a chat_via_responses quota arm before the report fallthrough, reusing log_provider_quota_exhausted with operation=responses_api.
  • Keeps the caller-facing error unchanged so the agent/runtime can still surface the cap state.
  • Updates stale localized README links and the Cursor cloud-agent docs link to their current targets.
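
A minimal sketch of the matcher extension described above. This is not the code from the PR: the signature, the case-insensitive comparison, and the omission of the older quota phrases are all assumptions made for illustration.

```rust
/// Illustrative stand-in for the shared matcher described above.
/// Assumption: the real `body_indicates_quota_exhausted` may take different
/// arguments and also checks older quota phrases that are not shown here.
fn body_indicates_quota_exhausted(body: &str) -> bool {
    // The two phrases this PR adds to the matcher.
    const NEW_PHRASES: [&str; 2] = ["usage_limit_reached", "usage limit has been reached"];

    // Assumption: matching is case-insensitive, so "The usage limit has been
    // reached" and lowercase variants are both caught.
    let lowered = body.to_lowercase();
    NEW_PHRASES.iter().any(|phrase| lowered.contains(*phrase))
}
```

Keeping both phrases in one function is what lets every caller pick up the new coverage without duplicating string checks.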

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — focused Rust tests cover the changed branches; CI remains the source of truth for the enforced diff-cover gate.
  • Coverage matrix updated — N/A: behavior-only observability/classification change, no user-facing feature row added/removed/renamed.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no coverage-matrix feature ID applies.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: observability filtering only.
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime/platform: Rust core compatible-provider error classification only.
  • User-visible effect: OpenAI Responses usage-cap failures still propagate as errors; only Sentry reporting changes.
  • Observability effect: usage_limit_reached plan-cap retries are demoted from Sentry errors; generic 400/500/provider failures stay reportable.
  • Docs: localized README links are updated to the same current targets used by #4237 (fix(observability): demote MCP auth-required 401).
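
The split between the user-visible and observability effects can be sketched roughly as follows. `SentryAction`, `handle_failure`, and `should_report_status` are hypothetical names, and the `status >= 400` policy is only a placeholder for the real `should_report_provider_http_failure`, whose rules are not shown in this PR description.

```rust
/// What happens on the observability side for one failed response.
#[derive(Debug, PartialEq)]
enum SentryAction {
    Demoted,  // logged as quota/plan state, not sent to Sentry
    Reported, // sent to Sentry as an error event
    Skipped,  // status is not considered reportable
}

/// Simplified quota check using the two phrases named in this PR.
fn is_quota_exhausted(body: &str) -> bool {
    let lowered = body.to_lowercase();
    lowered.contains("usage_limit_reached") || lowered.contains("usage limit has been reached")
}

/// Placeholder policy (assumption): every 4xx/5xx status is reportable.
fn should_report_status(status: u16) -> bool {
    status >= 400
}

/// The quota arm is checked before the report fallthrough, and the
/// caller-facing error is built the same way on every branch.
fn handle_failure(status: u16, body: &str) -> (Result<(), String>, SentryAction) {
    let action = if is_quota_exhausted(body) {
        SentryAction::Demoted
    } else if should_report_status(status) {
        SentryAction::Reported
    } else {
        SentryAction::Skipped
    };
    (Err(format!("provider returned {}: {}", status, body)), action)
}
```

In this sketch a 400 with a usage-cap body and a 400 with any other body both return `Err`; only the `SentryAction` differs.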

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/GH-4180-responses-usage-limit
  • Commit SHA: 9c3d7931840eb9a55e692fd4e527b56b3a221960

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: no app TS/CSS changes.
  • pnpm typecheck — N/A: no frontend TypeScript changes.
  • Focused tests:
    • RED: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml --lib quota_exhausted_matches_openai_responses_usage_limit_reached failed before implementation.
    • RED: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml --lib responses_api_usage_limit_reached_400_not_reported_to_sentry failed before implementation.
    • GREEN: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml --lib usage_limit
    • GREEN: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml --lib quota_exhausted
  • Rust fmt/check (if changed):
    • cargo fmt --manifest-path Cargo.toml --check
    • git diff --check
    • GGML_NATIVE=OFF cargo check --manifest-path Cargo.toml
  • Tauri fmt/check (if changed): N/A: no Tauri shell changes.

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: OpenAI Responses usage-cap 400s are treated as quota/user plan state for Sentry reporting.
  • User-visible effect: request still returns the provider error; repeated Sentry events are suppressed.

Parity Contract

  • Legacy behavior preserved: existing C9A quota, insufficient-credit, transient, and unrelated error handling remain unchanged.
  • Guard/fallback/dispatch parity checks: shared matcher drives both emit-site classification and before_send filtering.
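
One way to picture the parity check above is a single predicate consulted from both places. The `Event` struct and both wrapper functions below are hypothetical; the project's real event type and hook signatures are not shown in this PR description.

```rust
/// Hypothetical, minimal event type used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    message: String,
}

/// Single source of truth (simplified to the two phrases from this PR).
fn body_indicates_quota_exhausted(text: &str) -> bool {
    let lowered = text.to_lowercase();
    lowered.contains("usage_limit_reached") || lowered.contains("usage limit has been reached")
}

/// Emit-site check: decide before an event is ever created.
fn emit_site_should_report(body: &str) -> bool {
    !body_indicates_quota_exhausted(body)
}

/// before_send-style filter: returning `None` drops the event.
fn before_send(event: Event) -> Option<Event> {
    if body_indicates_quota_exhausted(&event.message) {
        None
    } else {
        Some(event)
    }
}
```

Because both functions delegate to the same predicate, a phrase added to the matcher is covered at the emit site and in the filter at once.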

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: This PR
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • Documentation

    • Updated several translated README pages with refreshed links and embeds.
    • Improved the Star History display link and corrected a few feature reference URLs.
  • Bug Fixes

    • Expanded quota-exhaustion detection for OpenAI Responses-style errors.
    • Reduced false error reporting for “usage limit reached” responses, helping avoid unnecessary alerts while preserving the original error message.

@YOMXXX YOMXXX requested a review from a team June 28, 2026 16:31
coderabbitai Bot commented Jun 28, 2026

Contributor

Walkthrough

Extends the existing quota-exhaustion classifier to match OpenAI Responses API usage_limit_reached errors, wires the new branch into chat_via_responses, and adds unit and integration tests. Separately updates localized README files to remove Reddit hyperlinks, fix native-voice doc URLs, and switch Star History embeds to the SVG endpoint.

Changes

Quota exhaustion: usage_limit_reached coverage

| Layer / File(s) | Summary |
| --- | --- |
| Extend body matcher (src/openhuman/inference/provider/ops/http_error.rs) | body_indicates_quota_exhausted extended to match usage_limit_reached and "usage limit has been reached"; new AFE_BODY fixture and unit test added. |
| Wire quota branch in chat_via_responses (src/openhuman/inference/provider/compatible_helpers.rs) | New else if branch calls is_provider_quota_exhausted and routes matching errors through log_provider_quota_exhausted before the generic Sentry reporter. |
| Integration and observability tests (src/openhuman/inference/provider/compatible_tests.rs, src/core/observability.rs) | OPENAI_AFE_USAGE_LIMIT_BODY fixture added; integration test asserts zero Sentry events for mocked 400 usage_limit_reached; observability test asserts correct classification by is_quota_exhausted_event and is_quota_exhausted_message. |

Doc link updates

| Layer / File(s) | Summary |
| --- | --- |
| Localized README and agent-workflow link fixes (docs/README.de.md, docs/README.ja-JP.md, docs/README.ko.md, docs/README.ur-pk.md, docs/README.zh-CN.md, docs/agent-workflows/cursor-cloud-agents.md) | Reddit anchors replaced with plain text, native-voice URL updated to features/native-tools/voice, Star History embeds switched to api.star-history.com SVG endpoint, and Cursor Cloud Agents URL corrected. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • tinyhumansai/openhuman#4076: Introduced the quota-exhaustion classifier (body_indicates_quota_exhausted, is_quota_exhausted_event) that this PR extends with the usage_limit_reached phrase.
  • tinyhumansai/openhuman#3976: Added an earlier else if demotion branch in chat_via_responses for 402 insufficient credits, the same pattern this PR replicates for quota exhaustion.
  • tinyhumansai/openhuman#3617: Prior work on classifying insufficient-credits and suppressing Sentry events in the same observability + HTTP error matcher path.

Suggested labels

rust-core, sentry-traced-bug, bug

Suggested reviewers

  • graycyrus
  • oxoxDev

🐇 A rabbit once heard a loud Sentry alarm,
"usage_limit_reached!" — causing quite a farm!
So we patched the matcher, added a test or two,
Now quota exhaustion is silenced on cue.
The READMEs got tidied, links pointing true~ 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ⚠️ Warning | The localized README and Cursor docs link updates are unrelated to #4180 and appear to be bundled ancillary changes. | Split the docs/link-only edits into a separate PR unless they are required for this fix. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title concisely matches the main observability fix for Responses usage-limit caps. |
| Linked Issues check | ✅ Passed | The PR matches #4180 by classifying usage_limit_reached as quota exhaustion and suppressing Sentry while preserving the error path. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |

@coderabbitai coderabbitai Bot added bug rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage labels Jun 28, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/openhuman/inference/provider/compatible_helpers.rs (1)

225-231: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a debug/trace breadcrumb on the new quota-exhausted branch.

This new error branch only emits the info! demotion inside log_provider_quota_exhausted, so the branch itself still lacks the debug/trace-level instrumentation required for new Rust control-flow paths here.

Suggested change
         } else if super::super::is_provider_quota_exhausted(&error) {
+            tracing::debug!(
+                operation = "responses_api",
+                provider = self.name.as_str(),
+                model,
+                status = status.as_u16(),
+                "responses_api matched provider quota exhaustion"
+            );
             super::super::log_provider_quota_exhausted(
                 "responses_api",
                 self.name.as_str(),
                 Some(model),
                 status,

As per coding guidelines, "Add debug logging to entry/exit, branches, external calls, retries/timeouts, state transitions, and errors using log/tracing at debug/trace level in Rust".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/provider/compatible_helpers.rs` around lines 225 -
231, The new quota-exhausted branch in
compatible_helpers::ResponsesApiProvider::... should add a debug/trace
breadcrumb before or alongside the call to log_provider_quota_exhausted so the
control-flow path is visible at debug/trace level. Update the quota-exhausted
branch in this function to emit a trace/debug log with the provider name, model,
and status before delegating, while keeping the existing info-level demotion in
log_provider_quota_exhausted.

Source: Coding guidelines

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 325489cd-ab56-4f6d-8f50-737f1ba1b444

📥 Commits

Reviewing files that changed from the base of the PR and between 5a41a4f and cfd32c9.

📒 Files selected for processing (10)
  • docs/README.de.md
  • docs/README.ja-JP.md
  • docs/README.ko.md
  • docs/README.ur-pk.md
  • docs/README.zh-CN.md
  • docs/agent-workflows/cursor-cloud-agents.md
  • src/core/observability.rs
  • src/openhuman/inference/provider/compatible_helpers.rs
  • src/openhuman/inference/provider/compatible_tests.rs
  • src/openhuman/inference/provider/ops/http_error.rs

Labels

bug rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage

Development

Successfully merging this pull request may close these issues.

Quota demote coverage gap: usage_limit_reached on OpenAI Responses (TAURI-RUST-AFE)