feat: Add new spec for A2A protocol conformance tests by darrelmiller · Pull Request #1882 · a2aproject/A2A

darrelmiller · 2026-05-26T16:03:44Z

Summary

This PR introduces ACTS (A2A Conformance Test Specification), a unified YAML format for A2A protocol conformance tests, plus 111 tests covering all testable MUST/SHOULD requirements from the A2A v1.0 spec.

Problem

The A2A ecosystem has 4 fragmented conformance testing efforts (a2a-tck, a2a-itk, agntcy/csit, agentbin), each with different test definitions. This makes it impossible to verify that SDKs passing "conformance tests" can actually interoperate.

Solution

One canonical test format that all SDKs test against. Language-agnostic YAML declarations + freedom to implement runners in any language = guaranteed interoperability.

What's Included

📋 Specification (docs/acts-specification.md)

- ~2000 lines, 17 sections, inline CDDL grammar (RFC 8610 conformant)
- Abstract operations (transport-agnostic: JSON-RPC, gRPC, REST)
- Rich assertion DSL, streaming support, state machine validation
- Standard JSON report format for dashboards

✅ Test Suite (111 tests across 14 files)

- 72 MUST, 35 SHOULD, 4 MAY tests
- Core ops, discovery, streaming, errors, multi-turn, auth, push notifications, history, polling, wire format, data types, version negotiation, transport bindings, client parsing
- Full coverage: All 76 testable spec requirements mapped

🎨 HTML Viewer (tests/acts/test-viewer.html)

- Interactive browser with search, filter, syntax highlighting
- 290 KB standalone file, zero dependencies

Coverage Validation

✅ Spec audit: All 76 testable MUST/SHOULD requirements have tests
✅ Gap analysis: All scenarios from 4 existing repos covered
✅ CDDL validation: All files validate against RFC 8610 grammar

Introduces a language-neutral YAML format for declaring conformance tests that A2A SDKs must pass.

Key design decisions:
- Abstract operations instead of wire methods for transport independence
- Three conformance levels (must/should/may) per RFC 2119
- Client golden-response tests for interop bug coverage
- SUT behavior contract via message-prefix convention
- Inline CDDL type definitions throughout the spec

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Spec additions:
- §13 Report Format: standardized JSON output schema with CDDL for dashboards to consume results from any runner implementation
- Appendix A: consolidated report CDDL added
- Sections 14-17 renumbered accordingly

Test suites (tests/acts/):
- suite.acts.yaml: top-level manifest including all suites
- discovery.acts.yaml: 6 tests (CARD-DISC-*)
- core-operations.acts.yaml: 7 tests (CORE-SEND/GET/CANCEL-*)
- multi-turn.acts.yaml: 3 tests (CORE-MULTI-*)
- streaming.acts.yaml: 6 tests (STREAM-SSE/SUB-*)
- polling.acts.yaml: 2 tests (CORE-EXEC-*)
- error-handling.acts.yaml: 7 tests (CORE-ERR-*, JSONRPC-ERR-*)
- wire-format.acts.yaml: 3 tests (DM-FMT-*)
- data-types.acts.yaml: 4 tests (DM-ART-*)
- push-notifications.acts.yaml: 4 tests (PUSH-CFG-*)
- client-parsing.acts.yaml: 6 tests (CLIENT-PARSE-*)

Tests synthesized from a2a-tck, a2a-itk, agntcy/csit, and agentbin to validate that the ACTS format covers real-world scenarios.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

New test files:
- history.acts.yaml (6 tests: history length, ordering, content)
- version-negotiation.acts.yaml (2 tests: version errors, defaults)
- transport-bindings.acts.yaml (8 tests: JSON-RPC, REST, gRPC)

Additions to existing files:
- core-operations: +3 (failure, content-type error, list tasks)
- streaming: +5 (first event, message-only, concurrent, resubscribe)
- error-handling: +4 (malformed request, capability errors, error data)
- discovery: +3 (caching, schema validation, extended card)
- data-types: +3 (timestamps, schema validation, tolerance)
- push-notifications: +6 (list, errors, idempotent delete, delivery)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Cross-referenced every MUST and SHOULD in the A2A specification against existing ACTS tests. Found 23 uncovered testable requirements and created tests for all of them.

New test files:
- auth-security.acts.yaml (12 tests: auth rejection, extended card access, in-task AUTH_REQUIRED state, push webhook auth)

Additions to existing files:
- core-operations: +2 (ListTasks includeArtifacts, nextPageToken)
- multi-turn: +2 (mismatched contextId/taskId, rejected client contextId)
- error-handling: +3 (error @type field, ErrorInfo, missing-vs-unauthorized)
- discovery: +1 (protocol declaration in Agent Card)
- transport-bindings: +1 (REST application/a2a+json content type)
- client-parsing: +2 (extended card caching, capability checking)

Coverage: 76 testable spec requirements now have matching tests. 99 additional requirements are process/documentation/deployment concerns not testable via conformance tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
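The per-file counts quoted in the three test-adding commit messages can be cross-checked against the headline figures (111 tests, 14 files). The sketch below only re-adds the numbers stated in those commit messages; the dictionary and function names are illustrative and do not come from the PR.

```python
# Per-file test counts copied from the three commit messages in this PR.
# Variable and function names are hypothetical, chosen for this sketch only.
initial_suites = {
    "discovery": 6, "core-operations": 7, "multi-turn": 3, "streaming": 6,
    "polling": 2, "error-handling": 7, "wire-format": 3, "data-types": 4,
    "push-notifications": 4, "client-parsing": 6,
}
second_commit = {
    "history": 6, "version-negotiation": 2, "transport-bindings": 8,
    "core-operations": 3, "streaming": 5, "error-handling": 4,
    "discovery": 3, "data-types": 3, "push-notifications": 6,
}
coverage_commit = {
    "auth-security": 12, "core-operations": 2, "multi-turn": 2,
    "error-handling": 3, "discovery": 1, "transport-bindings": 1,
    "client-parsing": 2,
}


def tally(*commits):
    """Sum the tests added to each suite file across all commits."""
    totals = {}
    for commit in commits:
        for suite, added in commit.items():
            totals[suite] = totals.get(suite, 0) + added
    return totals


totals = tally(initial_suites, second_commit, coverage_commit)
grand_total = sum(totals.values())
print(f"{len(totals)} files, {grand_total} tests")
```

Under these figures the tally agrees with the summary: 14 test files (not counting the suite.acts.yaml manifest) and 111 tests, with the coverage commit contributing 23.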

Reconcile 4 critical naming mismatches between spec and tests:
- action/request -> operation/params (JSON-RPC alignment)
- raw_request/raw_expect -> raw/expect (flat block, reuse expect)
- golden_response -> client_response (clearer naming)
- rawBody -> body_raw (snake_case consistency)

Add 3 structural improvements identified during format review:
- status field in expect-block (tests already use it)
- runner_requirements enum for runner-special tests
- named-assertion / assertions block (used by 9 tests)

Updated both inline CDDL and Appendix A consolidated grammar.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

§12.6 stated the report format was 'not prescribed' (MAY), but §13 mandates a JSON report format (MUST). Updated §12.6 to reference §13 and align the normative language.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Interactive viewer with search, filter by level/tag
- Syntax-highlighted YAML rendering
- Expand/collapse all tests
- Stats dashboard showing 111 tests (MUST/SHOULD/MAY breakdown)
- Standalone HTML file, zero dependencies

gemini-code-assist

Code Review

This pull request introduces the A2A Conformance Test Specification (ACTS) along with a comprehensive suite of YAML-based conformance tests covering core operations, streaming, discovery, error handling, and security. The review feedback identifies several schema and specification violations across the test files, including incorrect abstract operation names, invalid keys in error expectations, mismatched field names for file content, and incorrect JSON-RPC error codes. Additionally, the gRPC-to-HTTP error mapping table in the specification document should be updated to prioritize canonical transcoding mappings.

gemini-code-assist · 2026-05-26T16:05:30Z

+          - id: set-config
+            operation: set_push_notification_config


The abstract operation set_push_notification_config is used here, but it is not defined in the ACTS specification. According to the specification, the correct abstract operation name is create_push_config.

- id: set-config
  operation: create_push_config

gemini-code-assist · 2026-05-26T16:05:30Z

+            expect_error:
+              code: UnsupportedOperationError


The expect_error block uses the key code to specify the expected error type. However, the ACTS specification defines this field as error_type. Using code will cause schema validation failures in compliant test runners.

expect_error:
  error_type: UnsupportedOperationError
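A runner or pre-commit check could catch this class of mistake before the tests ever run. The following is a minimal sketch, assuming a test step has already been parsed from YAML into a plain dict and that `error_type` is the only key this check knows about; the function name and the set of known keys are hypothetical, and the real ACTS grammar may permit additional keys.

```python
# Assumption: per the review comment, `error_type` is the key ACTS defines
# for expect_error. Other keys the spec may allow are unknown to this sketch.
KNOWN_EXPECT_ERROR_KEYS = {"error_type"}


def lint_expect_error(step: dict) -> list[str]:
    """Return human-readable problems found in a step's expect_error block."""
    block = step.get("expect_error")
    if block is None:
        return []  # Step does not assert an error; nothing to check.
    if not isinstance(block, dict):
        return ["expect_error must be a mapping"]
    problems = []
    for key in block:
        if key not in KNOWN_EXPECT_ERROR_KEYS:
            problems.append(f"unknown key in expect_error: {key!r}")
    if "error_type" not in block:
        problems.append("expect_error is missing error_type")
    return problems


# The step as written in the PR, and the step as suggested by the review.
bad = {"id": "set-config", "expect_error": {"code": "UnsupportedOperationError"}}
good = {"id": "set-config", "expect_error": {"error_type": "UnsupportedOperationError"}}

print(lint_expect_error(bad))
print(lint_expect_error(good))
```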

gemini-code-assist · 2026-05-26T16:05:30Z

+                        - file:
+                            name: "report.pdf"
+                            mediaType: "application/pdf"
+                            bytes: "JVBERi0xLjQ="


The field name bytes is used under the file object to represent the base64-encoded file content. However, both the ACTS specification and the underlying Part protobuf definition use the field name raw for this purpose.

- file:
    name: "report.pdf"
    mediaType: "application/pdf"
    raw: "JVBERi0xLjQ="
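For reference, the value in this fixture is ordinary base64 and decodes to a PDF header. A short sketch of reading it under the `raw` field name follows; the `file_part` dict and the `decode_file_content` helper are illustrative only and are not part of any A2A SDK.

```python
import base64

# The file part from the suggestion above, as a parsed mapping.
file_part = {
    "name": "report.pdf",
    "mediaType": "application/pdf",
    "raw": "JVBERi0xLjQ=",
}


def decode_file_content(part: dict) -> bytes:
    """Decode the base64 content stored under the `raw` key (hypothetical helper)."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(part["raw"], validate=True)


content = decode_file_content(file_part)
print(content)  # b'%PDF-1.4'
```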

gemini-code-assist · 2026-05-26T16:05:30Z

+            expect:
+              status: error


The expect block specifies status: error. According to the ACTS specification, status must be a valid HTTP status code (integer) or a numeric assertion. If the request is expected to fail, you should use the expect_error block instead of expect.

expect_error:
  error_type: InvalidParamsError

gemini-code-assist · 2026-05-26T16:05:30Z

+                error:
+                  code: -32009


The expected JSON-RPC error code is set to -32009. However, the ACTS specification defines the JSON-RPC code for VersionNotSupportedError as -32006. This mismatch will cause conformant implementations to fail this test.

error:
  code: -32006

gemini-code-assist · 2026-05-26T16:05:30Z

+| `TaskNotCancelableError` | -32002 | `FAILED_PRECONDITION` | 409 |
+| `UnsupportedOperationError` | -32004 | `UNIMPLEMENTED` | 405 |
+| `ContentTypeNotSupportedError` | -32005 | `INVALID_ARGUMENT` | 415 |
+| `InvalidParamsError` | -32602 | `INVALID_ARGUMENT` | 400 |
+| `VersionNotSupportedError` | -32006 | `UNIMPLEMENTED` | 406 |


The mapping of gRPC statuses to HTTP statuses in this table does not align with the general rule to prioritize compatibility with gRPC transcoding systems by using the canonical mapping. Specifically:

- TaskNotCancelableError (gRPC FAILED_PRECONDITION) should map to HTTP 400 (canonical mapping for FAILED_PRECONDITION), not 409.
- UnsupportedOperationError (gRPC UNIMPLEMENTED) should map to HTTP 501 (canonical mapping for UNIMPLEMENTED), not 405.
- VersionNotSupportedError (gRPC UNIMPLEMENTED) should map to HTTP 501, not 406.
- ContentTypeNotSupportedError (gRPC INVALID_ARGUMENT) should map to HTTP 400, not 415.

References

When mapping gRPC statuses to HTTP statuses, prioritize compatibility with gRPC transcoding systems by using the canonical mapping, even if a more specific HTTP status code is available.
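The comparison in this comment can be reproduced with a small lookup. The sketch below lists the canonical gRPC-to-HTTP mapping for the three status codes involved (as documented for `google.rpc.Code`) and checks the five table rows from the diff against it; the variable and function names are illustrative.

```python
# Canonical HTTP mapping for the gRPC status codes used in the table,
# following the google.rpc.Code documentation.
CANONICAL_HTTP = {
    "INVALID_ARGUMENT": 400,
    "FAILED_PRECONDITION": 400,
    "UNIMPLEMENTED": 501,
}

# (error name, gRPC status, HTTP status) rows as they appear in the diff.
TABLE_ROWS = [
    ("TaskNotCancelableError", "FAILED_PRECONDITION", 409),
    ("UnsupportedOperationError", "UNIMPLEMENTED", 405),
    ("ContentTypeNotSupportedError", "INVALID_ARGUMENT", 415),
    ("InvalidParamsError", "INVALID_ARGUMENT", 400),
    ("VersionNotSupportedError", "UNIMPLEMENTED", 406),
]


def divergences(rows):
    """Return (name, table_http, canonical_http) for rows that differ."""
    return [
        (name, http, CANONICAL_HTTP[grpc])
        for name, grpc, http in rows
        if CANONICAL_HTTP[grpc] != http
    ]


for name, table_http, canonical in divergences(TABLE_ROWS):
    print(f"{name}: table says {table_http}, canonical mapping is {canonical}")
```

Four of the five rows diverge; only InvalidParamsError already matches the canonical mapping. Whether the specification should prefer the canonical code or the more specific HTTP status is the design question the comment raises, and this check does not settle it.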

gemini-code-assist · 2026-05-26T16:05:30Z

+            expect:
+              error:
+                exists: true


For asserting that an operation fails, the expect_error block should be used instead of expect with an error body check. This ensures consistency across the test suite and aligns with the schema defined in the specification.

expect_error:
  error_type: UnsupportedOperationError

andysalvo · 2026-05-26T20:37:27Z

ACTS scopes protocol conformance cleanly. One gap worth flagging: two implementations can pass all 111 tests and still produce different cryptographic outputs from the same input — JCS canonicalization, hash derivation, and signature binding sit below the transport layer ACTS validates. We hit this recently when our own JCS serializer diverged on Unicode handling that six other implementations got right; the shared test vectors caught it, not protocol-level tests. A derivation conformance layer would compose well as a separate test class or companion suite.

MoltyCel · 2026-05-27T05:45:22Z

test (edit later)

chopmob-cloud · 2026-05-30T18:29:21Z

We have an adversarial bench live -- 138 profiles across 30 categories, tested against our own A2A agent at api.algovoi.co.uk. It sits above what ACTS validates: ACTS checks protocol compliance against MUST/SHOULD requirements, the bench tests agent behaviour under adversarial conditions (prompt injection, coercive payment patterns, wallet draining attempts, malicious tool call sequences). The two layers complement without overlapping.

On @andysalvo's point about JCS and cryptographic outputs: that layer is below protocol-observable behaviour and cannot be reached by protocol assertions -- it needs shared test vectors that implementations run locally and compare byte-for-byte. A derivation conformance suite would compose alongside ACTS as a separate test class rather than an extension of it.

Profile schema is public if it is useful for expressing an adversarial class in the ACTS format.

AlgoVoi (chopmob-cloud) -- Acquisition enquiries: https://docs.algovoi.co.uk/acquisition

chopmob-cloud · 2026-05-31T17:15:25Z

The gap is real and sits below what ACTS validates. Two implementations can pass all 111 ACTS tests and still diverge on JCS canonical output for the same input — particularly on unicode normalisation, number representation, and key ordering edge cases. ACTS validates the transport and protocol layer; it does not validate the cryptographic substrate underneath.

The 8-implementation cross-validation corpus at `chopmob-cloud/algovoi-jcs-conformance-vectors` covers exactly this layer: Python (rfc8785), TypeScript (canonicalize), Go (gowebpki/jcs), Rust (serde_json), Java (cyberphone/json-canonicalization), PHP, .NET, Ruby, all producing byte-identical output across the same vector set, including the non-ASCII UTF-8 cases that diverge most frequently in practice.

A complete A2A conformance spec needs both layers: ACTS for protocol, a JCS byte-match suite for the cryptographic substrate. The two are complementary rather than overlapping.
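To illustrate why protocol-level assertions cannot see this kind of divergence, the sketch below serializes one document two ways using only Python's standard `json` module. Neither function is a JCS (RFC 8785) implementation; they stand in for two hypothetical serializers that agree on key order and whitespace but differ on whether non-ASCII characters are escaped. RFC 8785 emits such characters as literal UTF-8, so the second variant is the closer one for this particular input.

```python
import hashlib
import json

# Same logical document for both serializers; "caf\u00e9" is "café".
doc = {"name": "caf\u00e9", "id": 1}


def serialize_escaped(value) -> bytes:
    """Sorted keys, compact separators, non-ASCII written as \\uXXXX escapes."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def serialize_utf8(value) -> bytes:
    """Sorted keys, compact separators, non-ASCII written as literal UTF-8."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


a = serialize_escaped(doc)
b = serialize_utf8(doc)

# Both byte strings parse back to the same JSON value...
print(json.loads(a) == json.loads(b))
# ...but the bytes, and therefore any hash or signature over them, differ.
print(a == b)
print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())
```

A receiver that parses either output sees identical JSON, which is all a protocol assertion can observe. Only a byte-for-byte comparison against shared vectors exposes the difference.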

AlgoVoi (chopmob-cloud) -- Acquisition enquiries: https://docs.algovoi.co.uk/acquisition

darrelmiller and others added 7 commits May 25, 2026 17:58

Add HTML test viewer for ACTS conformance suite

3b816b5

darrelmiller requested a review from a team as a code owner May 26, 2026 16:03

gemini-code-assist Bot reviewed May 26, 2026

herczyn mentioned this pull request May 28, 2026

Extract test scenarios into scenarios.yaml a2aproject/a2a-itk#3

Closed

Merge branch 'main' into conformance-spec

dbcabfb

muscariello requested review from a team as code owners May 29, 2026 07:34

msampathkumar changed the title ~~Conformance spec~~ feat: Add new spec for A2A protocol conformance tests Jun 4, 2026

darrelmiller wants to merge 8 commits into main from conformance-spec

5 participants
