-
Notifications
You must be signed in to change notification settings - Fork 95
fix: Normalize JSON schemas for OpenAI structured output compatibility #1915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Azure OpenAI requires ALL properties in the required array for structured output. Added SchemaProcessor.makeAllPropertiesRequired() that recursively ensures all properties are marked as required. Applied to data component and artifact schemas. Co-authored-by: Cursor <[email protected]>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
🦋 Changeset detectedLatest commit: cbfd98e The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Review Summary
3 Key Findings | Risk: Low
🟠⚠️ Major (1) 🟠⚠️
🟠 1) SchemaProcessor.ts:231-233 Return type loses precision, forcing type assertions at call sites
Issue: The function signature makeAllPropertiesRequired(schema: JSONSchema.BaseSchema | Record<string, unknown> | null | undefined) returns the same union type regardless of input. When a non-null JSONSchema.BaseSchema is passed, the return type still includes null | undefined | Record<string, unknown>, forcing every call site to use type assertions.
Why: This creates a pattern where type safety violations are scattered across the codebase (as JSONSchema.BaseSchema at line 109, as any at line 166) rather than contained in one place. If z.fromJSONSchema() type expectations change, the assertions mask the incompatibility.
Fix: Consider using TypeScript overloads or generics to preserve input type precision:
// Option 1: Overloads
static makeAllPropertiesRequired(schema: null | undefined): null | undefined;
static makeAllPropertiesRequired(schema: JSONSchema.BaseSchema): JSONSchema.BaseSchema;
static makeAllPropertiesRequired(
schema: JSONSchema.BaseSchema | Record<string, unknown> | null | undefined
): JSONSchema.BaseSchema | Record<string, unknown> | null | undefined {
// implementation unchanged
}
// Option 2: Generic
static makeAllPropertiesRequired<T extends JSONSchema.BaseSchema | null | undefined>(
schema: T
): T {
// implementation unchanged
}Refs:
-
🟡 Minor:
artifact-component-schema.ts:166Prefer narrower type assertion (see inline comment)
🟡 Minor (1) 🟡
🟡 1) artifact-component-schema.ts:34-39 ArtifactReferenceSchema not normalized for consistency
Issue: ArtifactReferenceSchema.getSchema() uses z.fromJSONSchema() on ARTIFACT_PROPS_SCHEMA without applying makeAllPropertiesRequired(), while ArtifactCreateSchema methods (lines 104, 158) do apply normalization.
Why: While ARTIFACT_PROPS_SCHEMA already has all properties in the required array (line 28), the inconsistent pattern could cause confusion or bugs if the schema is modified in the future.
Fix: Either apply normalization for consistency, or add a comment explaining why it's unnecessary:
// Option A: Apply normalization for consistency
const normalizedSchema = SchemaProcessor.makeAllPropertiesRequired(ARTIFACT_PROPS_SCHEMA);
return z.fromJSONSchema(normalizedSchema as JSONSchema.BaseSchema);
// Option B: Document why normalization is skipped
// Note: normalization skipped - all properties already in required array
return z.fromJSONSchema(ARTIFACT_PROPS_SCHEMA);Refs:
💭 Consider (3) 💭
💭 1) SchemaProcessor.test.ts Missing test for $ref schema references
Issue: The makeAllPropertiesRequired method does not handle JSON Schema $ref properties, and there are no tests documenting this behavior.
Why: If a data component schema uses $ref for reusable definitions (common in OpenAPI-derived schemas), referenced object properties would not be normalized, causing the same Azure error this PR fixes.
Fix: Add a test documenting the known limitation, or extend the implementation to resolve $ref references.
💭 2) Agent.ts:3497-3499 Document selective normalization rationale
Issue: The PR normalizes data component and artifact schemas but not function tool input schemas (line 1462) or tool override schemas (line 2177).
Why: This may be intentional (different schema sources/requirements), but a code comment would help future maintainers understand why normalization is applied selectively.
Fix: Add a brief comment explaining that normalization is specifically for structured output compatibility with OpenAI/Azure, not for tool input validation.
💭 3) SchemaProcessor.test.ts Consider testing additionalProperties preservation
Issue: OpenAI structured outputs require additionalProperties: false. The implementation should preserve this, but there's no explicit test.
Why: A future refactor could accidentally remove additionalProperties from normalized schemas.
Fix: Add: it('should preserve additionalProperties when present', () => { ... })
💡 APPROVE WITH SUGGESTIONS
Summary: This is a well-implemented fix for a real cross-provider compatibility issue with comprehensive test coverage. The core makeAllPropertiesRequired() logic is correct and thoroughly tested. The suggestions above are refinements for type safety and consistency — none are blockers. The approach of normalizing at usage time (rather than storage time) is the right design choice. Nice work splitting this out from #1836 for a focused, reviewable PR! 🎉
Discarded (6)
| Location | Issue | Reason Discarded |
|---|---|---|
SchemaProcessor.ts:238-266 |
Internal any typing loses type safety |
Acceptable trade-off for recursive JSON schema transformation; function is well-tested |
SchemaProcessor.test.ts |
Missing tests for tuple-style array items | Edge case unlikely in LLM structured outputs; low risk |
Agent.ts:3498 |
No integration test for data component normalization | Unit tests provide good coverage; integration testing structured output with real providers is complex |
model-context-utils.test.ts |
Unrelated mock added | Out of scope — appears to be a legitimate test isolation fix |
SchemaProcessor.test.ts |
Only covers new method, not existing methods | Acceptable for focused PR; existing methods have different scope |
| Evaluation/render routes | z.fromJSONSchema() not normalized |
Out of scope — these are separate call sites not touched by this PR |
Reviewers (5)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-types |
4 | 1 | 0 | 0 | 1 | 0 | 2 |
pr-review-tests |
4 | 0 | 2 | 0 | 0 | 0 | 2 |
pr-review-consistency |
4 | 1 | 1 | 0 | 0 | 0 | 2 |
pr-review-llm |
1 | 0 | 0 | 0 | 0 | 0 | 1 |
pr-review-standards |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 13 | 2 | 3 | 0 | 1 | 0 | 7 |
| name: `ArtifactCreate_${component.name}`, | ||
| description: `Create ${component.name} artifacts from tool results by extracting structured data using selectors.`, | ||
| props: propsSchema, | ||
| props: normalizedPropsSchema as any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 MINOR: Prefer narrower type assertion over as any
Issue: Using as any completely disables type checking for this assignment.
Why: If the expected type of props in DataComponentInsert changes, TypeScript won't catch the mismatch. A more precise assertion would maintain some type safety.
Fix:
| props: normalizedPropsSchema as any, | |
| props: normalizedPropsSchema as Record<string, unknown>, |
Refs:
Summary
SchemaProcessor.makeAllPropertiesRequired()that recursively ensures all properties in JSON schemas are marked as required before conversion to Zod schemasAgent.tsand artifact component schemas inartifact-component-schema.tsSplit from #1836 — this PR contains only the JSON schema normalization work.
Context: Azure OpenAI's structured output API requires ALL properties to be in the
requiredarray, even nullable/optional ones. Data components with optional properties (likefact: z.string().nullable()) failed with:Test plan
SchemaProcessor.makeAllPropertiesRequired()covering flat schemas, nested objects, arrays, union types, edge cases, and real-world schemasMade with Cursor