test: add infra-independent unit tests for core agent protocols#151
test: add infra-independent unit tests for core agent protocols#151STiFLeR7 wants to merge 2 commits into
Conversation
|
This PR focuses on protocol and infrastructure correctness (A2A, scheduling, metrics). |
Paraschamoli
left a comment
There was a problem hiding this comment.
This pr is great but it has merge conflicts because this PR has been open since mid-February, it is currently experiencing heavy merge conflicts, specifically in test_redis_scheduler.py. Could you please fetch the latest main and rebase?
|
Sure will work on it and update it according to the new push history. |
db4d3b9 to
83ddae2
Compare
|
@Paraschamoli I have resolved the merge conflicts, updated all tests to match the new directory layout, improved Windows compatibility for the test suite, and rebased onto the latest upstream main. All 1114+ tests now pass cleanly. |
📝 WalkthroughWalkthroughThis pull request adds and updates unit tests across four test modules to improve platform compatibility and expand test coverage. Changes include platform-aware permission checks for Windows, a new comprehensive test suite for A2A protocol endpoints, middleware error resilience testing, a full test module for Redis scheduler operations, and explicit UTF-8 encoding for README file assertions. ChangesTest Coverage Expansion
🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
tests/unit/auth/test_hydra_registration.py (1)
79-82: 💤 Low valueConsider moving the
sysimport to module level.The platform check is correct and necessary for Windows compatibility. However, importing
syslocally within the test is unconventional. For consistency with the rest of the file and Python conventions, consider moving the import to the top of the file alongside the other imports (after line 7).♻️ Proposed refactor
At module level (after line 7):
from unittest.mock import AsyncMock, MagicMock, patch +import sys import pytestIn the test (lines 79-82):
- # Check permissions (owner read/write only) on non-Windows platforms - import sys + # Check permissions (owner read/write only) on non-Windows platforms if sys.platform != "win32": assert oct(creds_file.stat().st_mode)[-3:] == "600"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/auth/test_hydra_registration.py` around lines 79 - 82, Move the local import of sys to the module level: add "import sys" with the other imports near the top of tests/unit/auth/test_hydra_registration.py (after the existing imports around line 7), then remove the inline "import sys" inside the test so the platform check uses the top-level sys reference (the assertion using creds_file.stat().st_mode and sys.platform should remain unchanged).tests/unit/server/scheduler/test_redis_scheduler.py (1)
120-132: ⚡ Quick winAvoid
RuntimeError("StopLoop")as the generator stop signal.Using an unrelated exception to escape
receive_task_operations()makes these tests depend on the current exception-handling policy inside the loop. If that code is later hardened to catch generic exceptions, these tests will hang instead of failing cleanly. Prefer consuming a single item withanext(...)and then closing the async generator explicitly.♻️ Safer pattern
+import asyncio import json import uuid from unittest.mock import AsyncMock, MagicMock, patch @@ - mock_redis_client.blpop.side_effect = [ - ("bindu:tasks", json.dumps(task_data)), - RuntimeError("StopLoop") # Break the infinite loop for testing - ] - - # Consume generator - try: - async for operation in scheduler.receive_task_operations(): - assert operation["operation"] == "run" - assert operation["params"]["to"] == "agent-id" - break # Only verify first item - except RuntimeError: - pass # Expected from side_effect + mock_redis_client.blpop.side_effect = [ + ("bindu:tasks", json.dumps(task_data)), + ] + + operations = scheduler.receive_task_operations() + operation = await asyncio.wait_for(anext(operations), timeout=1) + assert operation["operation"] == "run" + assert operation["params"]["to"] == "agent-id" + await operations.aclose()Also applies to: 149-162, 183-197
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/server/scheduler/test_redis_scheduler.py` around lines 120 - 132, Replace the RuntimeError-based stop in the test that sets mock_redis_client.blpop.side_effect and instead consume a single item from the async generator returned by scheduler.receive_task_operations() using anext(...) (await anext(generator)) to get and assert the first operation, then explicitly close the generator with await generator.aclose(); remove the RuntimeError side_effect and the try/except that swallows it, and update the other similar test blocks (the ones using mock_redis_client.blpop.side_effect with RuntimeError) to follow the same anext + aclose pattern so the tests don't rely on throwing unrelated exceptions to stop the generator.tests/unit/server/middleware/test_metrics.py (1)
100-128: ⚡ Quick winStrengthen test assertions for completeness.
The test correctly verifies error resilience, but could be more complete by asserting:
increment_requests_in_flightwas called (matching the pattern intest_dispatch_decrements_on_errorat lines 77-98)record_http_requestwas called (proving the error path was actually exercised, not accidentally skipped)🧪 Suggested additions
# Should not raise exception response = await middleware.dispatch(mock_request, mock_call_next) assert response == mock_response + mock_metrics.increment_requests_in_flight.assert_called_once() + mock_metrics.record_http_request.assert_called_once() mock_metrics.decrement_requests_in_flight.assert_called_once() # cleanup always runs🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/server/middleware/test_metrics.py` around lines 100 - 128, Add assertions to test_dispatch_ignores_recording_error to ensure the error path was exercised: after awaiting middleware.dispatch, assert mock_metrics.increment_requests_in_flight.assert_called_once() and mock_metrics.record_http_request.assert_called_once() (keeping the existing mock_metrics.decrement_requests_in_flight.assert_called_once()). This updates the test for MetricsMiddleware to verify both the increment and the attempted record call in addition to the cleanup decrement.tests/unit/server/endpoints/test_a2a_protocol.py (2)
54-62: ⚡ Quick winConsider adding explicit mock for
get_client_ip.The endpoint calls
get_client_ip(request)at its entry point. WhileMagicMockreturns a mock by default, explicitly mocking this dependency makes the test more readable and resilient to changes.🔧 Suggested enhancement
+ with patch("bindu.server.endpoints.a2a_protocol.get_client_ip", return_value="127.0.0.1"): mock_request = MagicMock(spec=Request) mock_request.body = AsyncMock(return_value=json.dumps(body).encode()) class MockState: pass mock_request.state = MockState() # Call endpoint response = await agent_run_endpoint(mock_app, mock_request)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/server/endpoints/test_a2a_protocol.py` around lines 54 - 62, The test should explicitly patch get_client_ip used by agent_run_endpoint to make the dependency clear and stable; update the test to mock get_client_ip (e.g., via monkeypatch or unittest.mock.patch) so it returns a deterministic IP and attach that to the request context before calling agent_run_endpoint, referencing the existing mock_request and the agent_run_endpoint function to locate where to inject the mock.
74-82: ⚡ Quick winEnhance test to validate JSON-RPC error structure.
The test currently only checks the HTTP status code. According to the PR objectives to verify JSON-RPC contract correctness, the test should also validate the error response structure includes the expected
codeandmessagefields fromJSONParseError.📋 Suggested enhancement
response = await agent_run_endpoint(mock_app, mock_request) assert response.status_code == 400 + content = json.loads(response.body) + assert "error" in content + assert content["error"]["code"] == -32700 # JSON Parse Error + assert "message" in content["error"]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/server/endpoints/test_a2a_protocol.py` around lines 74 - 82, Update the test_invalid_json test to assert the JSON-RPC error payload as well as the HTTP status: after calling agent_run_endpoint(mock_app, mock_request) parse the response JSON and assert it contains an "error" object with "code" and "message" fields that match the JSONParseError contract (use the JSONParseError.code and JSONParseError.message symbols or their constants from the JSON-RPC errors module), and keep the existing assert that response.status_code == 400; locate this logic in the test_invalid_json function and add the JSON body assertions right after the agent_run_endpoint call.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/server/endpoints/test_a2a_protocol.py`:
- Line 26: The mock handler return value is not a complete JSON-RPC 2.0
response—update the AsyncMock assigned to app.task_manager.mock_handler so its
return_value includes the full JSON-RPC structure (include "jsonrpc": "2.0",
"result": "success", and an "id" field) and then adjust the test assertions that
inspect the response (the assertions referencing content/result) to verify
content["jsonrpc"] == "2.0", content["result"] == "success", and that an "id"
key is present; use the existing AsyncMock on app.task_manager.mock_handler and
the existing response assertions to locate where to change the mock and add the
new checks.
- Around line 85-101: The test test_unsupported_method is asserting a 400
because agent_run_endpoint first calls a2a_request_ta.validate_json (A2ARequest
is a discriminated union on "method") so an unknown "method" literal triggers
JSONParseError → 400; update the test name/comment to reflect it validates
schema/400 rather than handler-not-found/404, and add a new test that sends a
valid method literal (one of A2ARequest's method tags) but ensures that method
is not present in app_settings.agent.method_handlers to exercise the
MethodNotFoundError branch and assert a 404 from agent_run_endpoint.
In `@tests/unit/server/scheduler/test_redis_scheduler.py`:
- Around line 67-75: The test only asserts "span_id" exists in the serialized
payload but misses "trace_id", so update the assertions in the push-path test
(the block that loads payload = json.loads(args[1])) to also assert that
"trace_id" is present (e.g., assert "trace_id" in payload) and do the same
addition for the second push-block around lines 87-102; keep checks for
payload["operation"], payload["params"], and span_id as-is and simply add the
trace_id presence assertion to lock down the serialized tracing contract.
---
Nitpick comments:
In `@tests/unit/auth/test_hydra_registration.py`:
- Around line 79-82: Move the local import of sys to the module level: add
"import sys" with the other imports near the top of
tests/unit/auth/test_hydra_registration.py (after the existing imports around
line 7), then remove the inline "import sys" inside the test so the platform
check uses the top-level sys reference (the assertion using
creds_file.stat().st_mode and sys.platform should remain unchanged).
In `@tests/unit/server/endpoints/test_a2a_protocol.py`:
- Around line 54-62: The test should explicitly patch get_client_ip used by
agent_run_endpoint to make the dependency clear and stable; update the test to
mock get_client_ip (e.g., via monkeypatch or unittest.mock.patch) so it returns
a deterministic IP and attach that to the request context before calling
agent_run_endpoint, referencing the existing mock_request and the
agent_run_endpoint function to locate where to inject the mock.
- Around line 74-82: Update the test_invalid_json test to assert the JSON-RPC
error payload as well as the HTTP status: after calling
agent_run_endpoint(mock_app, mock_request) parse the response JSON and assert it
contains an "error" object with "code" and "message" fields that match the
JSONParseError contract (use the JSONParseError.code and JSONParseError.message
symbols or their constants from the JSON-RPC errors module), and keep the
existing assert that response.status_code == 400; locate this logic in the
test_invalid_json function and add the JSON body assertions right after the
agent_run_endpoint call.
In `@tests/unit/server/middleware/test_metrics.py`:
- Around line 100-128: Add assertions to test_dispatch_ignores_recording_error
to ensure the error path was exercised: after awaiting middleware.dispatch,
assert mock_metrics.increment_requests_in_flight.assert_called_once() and
mock_metrics.record_http_request.assert_called_once() (keeping the existing
mock_metrics.decrement_requests_in_flight.assert_called_once()). This updates
the test for MetricsMiddleware to verify both the increment and the attempted
record call in addition to the cleanup decrement.
In `@tests/unit/server/scheduler/test_redis_scheduler.py`:
- Around line 120-132: Replace the RuntimeError-based stop in the test that sets
mock_redis_client.blpop.side_effect and instead consume a single item from the
async generator returned by scheduler.receive_task_operations() using anext(...)
(await anext(generator)) to get and assert the first operation, then explicitly
close the generator with await generator.aclose(); remove the RuntimeError
side_effect and the try/except that swallows it, and update the other similar
test blocks (the ones using mock_redis_client.blpop.side_effect with
RuntimeError) to follow the same anext + aclose pattern so the tests don't rely
on throwing unrelated exceptions to stop the generator.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6edc6d70-70c3-4654-81a8-0e925cc072ec
📒 Files selected for processing (5)
tests/unit/auth/test_hydra_registration.pytests/unit/server/endpoints/test_a2a_protocol.pytests/unit/server/middleware/test_metrics.pytests/unit/server/scheduler/test_redis_scheduler.pytests/unit/test_minimax_example.py
| app = MagicMock(spec=BinduApplication) | ||
| app.task_manager = MagicMock() | ||
| # Mock the handler method on task_manager | ||
| app.task_manager.mock_handler = AsyncMock(return_value={"result": "success"}) |
There was a problem hiding this comment.
Mock handler should return complete JSON-RPC 2.0 response structure.
The handler currently returns {"result": "success"}, but a valid JSON-RPC 2.0 response must include "jsonrpc": "2.0" and "id" fields. Real handlers return complete responses, and this test should validate the full protocol contract.
📋 Proposed fix to return complete JSON-RPC response
- app.task_manager.mock_handler = AsyncMock(return_value={"result": "success"})
+ app.task_manager.mock_handler = AsyncMock(
+ return_value={"jsonrpc": "2.0", "result": "success", "id": None}
+ )Then update the test assertion at line 71 to validate all required fields:
assert content["jsonrpc"] == "2.0"
assert content["result"] == "success"
assert "id" in content📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| app.task_manager.mock_handler = AsyncMock(return_value={"result": "success"}) | |
| app.task_manager.mock_handler = AsyncMock( | |
| return_value={"jsonrpc": "2.0", "result": "success", "id": None} | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/server/endpoints/test_a2a_protocol.py` at line 26, The mock
handler return value is not a complete JSON-RPC 2.0 response—update the
AsyncMock assigned to app.task_manager.mock_handler so its return_value includes
the full JSON-RPC structure (include "jsonrpc": "2.0", "result": "success", and
an "id" field) and then adjust the test assertions that inspect the response
(the assertions referencing content/result) to verify content["jsonrpc"] ==
"2.0", content["result"] == "success", and that an "id" key is present; use the
existing AsyncMock on app.task_manager.mock_handler and the existing response
assertions to locate where to change the mock and add the new checks.
| @pytest.mark.asyncio | ||
| async def test_unsupported_method(mock_app, mock_settings): | ||
| """Test handling of unsupported method.""" | ||
| body = { | ||
| "jsonrpc": "2.0", | ||
| "method": "unknown/method", | ||
| "id": str(uuid.uuid4()), | ||
| } | ||
|
|
||
| mock_request = MagicMock(spec=Request) | ||
| mock_request.body = AsyncMock(return_value=json.dumps(body).encode()) | ||
|
|
||
| response = await agent_run_endpoint(mock_app, mock_request) | ||
|
|
||
| assert response.status_code == 400 | ||
| # Validation fails for unknown method tag in discriminated union | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the Pydantic schema definition for A2A requests to verify valid method tags
rg -nP -A5 'class.*A2ARequest|a2a_request_ta.*=' --type=pyRepository: GetBindu/Bindu
Length of output: 580
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Find where A2ARequest is defined and how the discriminator works
rg -n "class A2ARequest|A2ARequest\s*=" bindu/common/protocol/types.py
rg -n "discriminator|Field\\(.*discriminator|tag\\b|method\\b.*discriminator" bindu/common/protocol/types.py
# 2) Show the relevant region around A2ARequest
python3 - <<'PY'
import re, pathlib
p = pathlib.Path("bindu/common/protocol/types.py")
txt = p.read_text(encoding="utf-8")
m = re.search(r"class\s+A2ARequest\b", txt)
if not m:
# fallback: look for type alias
m = re.search(r"A2ARequest\s*=", txt)
if not m:
print("Could not find A2ARequest definition marker")
raise SystemExit(1)
start = max(0, txt.count("\n", 0, m.start()) - 20)
# compute line number for m.start()
line = txt[:m.start()].count("\n") + 1
print(f"A2ARequest definition around line ~{line}")
PY
# Print a bounded chunk around the definition (first hit by line number)
LINE=$(python3 - <<'PY'
import re, pathlib
p = pathlib.Path("bindu/common/protocol/types.py")
txt = p.read_text(encoding="utf-8")
for pat in [r"class\s+A2ARequest\b", r"A2ARequest\s*="]:
m = re.search(pat, txt)
if m:
line = txt[:m.start()].count("\n")+1
print(line)
break
else:
raise SystemExit(1)
PY
)
sed -n "$((LINE-40)),$((LINE+120))p" bindu/common/protocol/types.py
# 3) Inspect endpoint error paths in a2a_protocol.py
rg -n "agent_run_endpoint|status_code\s*==|return .*400|return .*404|MethodNotFoundError|JSONParseError|validation" tests -S
rg -n "agent_run_endpoint" -S
fd -e py a2a_protocol.py bindu tests || true
# Use fd to locate if different path exists
fd -e py "a2a_protocol.py" . || true
# Then display surrounding code where 400/404 are set
FOUND=$(fd -e py "a2a_protocol.py" . | head -n 1)
if [ -n "$FOUND" ]; then
rg -n "MethodNotFoundError|JSONParseError|status_code|return" "$FOUND"
# Show the chunk around where the handler-not-found logic likely is
rg -n "MethodNotFoundError" "$FOUND" | head -n 1 | awk -F: '{print $1}' | while read -r L; do
sed -n "$((L-60)),$((L+80))p" "$FOUND"
done
fiRepository: GetBindu/Bindu
Length of output: 52212
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show the test code around the target lines
sed -n '70,130p' tests/unit/server/endpoints/test_a2a_protocol.py
# Locate and show endpoint error handling blocks (avoid negative ranges)
rg -n "MethodNotFoundError|JSONParseError|status_code\s*=\s*400|status_code\s*=\s*404" bindu/server/endpoints/a2a_protocol.py
rg -n "agent_run_endpoint" bindu/server/endpoints/a2a_protocol.py
# Show a chunk around the endpoint's main logic
# (use line numbers from rg to pick a non-negative window)
ENDPOINT_LINE=$(rg -n "async def agent_run_endpoint" bindu/server/endpoints/a2a_protocol.py | head -n 1 | cut -d: -f1)
START=$((ENDPOINT_LINE-80)); if [ "$START" -lt 1 ]; then START=1; fi
sed -n "${START},$((ENDPOINT_LINE+220))p" bindu/server/endpoints/a2a_protocol.pyRepository: GetBindu/Bindu
Length of output: 9984
Clarify error path: test_unsupported_method is validating the request schema (HTTP 400), not testing handler-not-found (HTTP 404).
agent_run_endpoint first parses the body with a2a_request_ta.validate_json(data); A2ARequest is a discriminated union on "method" with fixed Literal[...] method tags, so "unknown/method" fails schema validation and returns the JSONParseError → 400. The later MethodNotFoundError → 404 branch only applies when "method" is valid but not present in app_settings.agent.method_handlers.
- Rename the test (and/or comment) to reflect the 400/validation case
- Add a separate test for the 404 handler-not-found path using a valid method tag that isn’t mapped
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/server/endpoints/test_a2a_protocol.py` around lines 85 - 101, The
test test_unsupported_method is asserting a 400 because agent_run_endpoint first
calls a2a_request_ta.validate_json (A2ARequest is a discriminated union on
"method") so an unknown "method" literal triggers JSONParseError → 400; update
the test name/comment to reflect it validates schema/400 rather than
handler-not-found/404, and add a new test that sends a valid method literal (one
of A2ARequest's method tags) but ensures that method is not present in
app_settings.agent.method_handlers to exercise the MethodNotFoundError branch
and assert a 404 from agent_run_endpoint.
| # Verify serialization | ||
| payload = json.loads(args[1]) | ||
| assert payload["operation"] == "run" | ||
| assert payload["params"]["to"] == "agent-id" | ||
| assert payload["params"]["input"] == {"text": "hello"} | ||
| # Span ID might be null or present depending on OpenTelemetry mock state, | ||
| # checking structure is enough. | ||
| assert "span_id" in payload | ||
|
|
There was a problem hiding this comment.
Assert trace_id on the pushed payloads too.
These push-path tests only lock down span_id. The receive tests build operation envelopes with both tracing fields, so a regression that drops trace_id would still pass here even though it changes the serialized contract.
Also applies to: 87-102
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/server/scheduler/test_redis_scheduler.py` around lines 67 - 75,
The test only asserts "span_id" exists in the serialized payload but misses
"trace_id", so update the assertions in the push-path test (the block that loads
payload = json.loads(args[1])) to also assert that "trace_id" is present (e.g.,
assert "trace_id" in payload) and do the same addition for the second push-block
around lines 87-102; keep checks for payload["operation"], payload["params"],
and span_id as-is and simply add the trace_id presence assertion to lock down
the serialized tracing contract.
Summary
This PR adds fast, infrastructure-independent unit tests covering Bindu’s core agent communication and scheduling primitives. The goal is to make protocol-level correctness easy to validate locally and in CI, without requiring live Redis, Web3, or native crypto builds.
What’s Covered
A2A Protocol
Redis Scheduler
Metrics Middleware
Test Infrastructure Improvements
Why This Matters
This PR intentionally focuses on infrastructure correctness rather than agent behavior, laying groundwork for higher-level agent workflows.
Summary by CodeRabbit