Description
We’ve encountered a reproducible issue when using the Java Vertex AI client (com.google.cloud:google-cloud-vertexai:1.18.0) with Gemini Flash models for structured text generation.
When a responseSchema is attached to the GenerationConfig, the model intermittently produces malformed or repetitive JSON outputs, often looping text fragments or inserting stray newline escape sequences until the max output token limit is reached.
Removing the schema entirely eliminates the issue, and the same prompt setup works correctly in the Python Vertex AI SDK, suggesting this may be SDK-specific or related to how the Java client serializes the schema.
Environment details
| Key | Value |
|---|---|
| API | Vertex AI Generative AI (Java) |
| Library | com.google.cloud:google-cloud-vertexai:1.18.0 |
| Java version | 21 |
| OS | Windows 11 |
| Models tested | Gemini 2.0 Flash, Gemini 2.5 Flash Lite, Gemini 2.5 Flash |
| Behavior | Issue occurs with 2.0 Flash and 2.5 Flash Lite; 2.5 Flash mitigates it partially |
Steps to reproduce
- Configure a GenerativeModel with deterministic decoding:
  - temperature = 0.0f, topP = 0.0f, topK = 1, candidateCount = 1, seed = 42
  - responseMimeType = "application/json"
- Attach a complex responseSchema describing nested arrays and objects (see the example below).
- Send a document-extraction prompt requesting structured JSON per the schema.
- Observe that:
  - The model often ignores the schema’s structure.
  - Output becomes recursive or repetitive ("Company Company Company...").
  - Output terminates abruptly at the token limit with unclosed quotes or brackets.
- Remove the schema (keep all other settings identical).
- Observe that the output is now clean and well-formed JSON.
Code snippet (simplified)
GenerationConfig cfg = GenerationConfig.newBuilder()
    .setTemperature(0.0f)
    .setTopP(0.0f)
    .setTopK(1)
    .setCandidateCount(1)
    .setSeed(42)
    .setResponseMimeType("application/json")
    .setResponseSchema(ResponseSchemaFactory.getExtractionSchema()) // When set, issue occurs
    .build();

GenerativeModel model = baseModel
    .withSystemInstruction(ContentMaker.fromString(systemPrompt))
    .withGenerationConfig(cfg);

GenerateContentResponse response = model.generateContent(promptText);
String jsonOutput = ResponseHandler.getText(response); // often malformed
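For the control run in the last two reproduction steps, the only change is dropping the schema from the same builder. A minimal sketch (cfgNoSchema and controlResponse are illustrative names; everything else matches the setup above):

// Identical decoding settings, but no responseSchema attached (control run).
GenerationConfig cfgNoSchema = GenerationConfig.newBuilder()
    .setTemperature(0.0f)
    .setTopP(0.0f)
    .setTopK(1)
    .setCandidateCount(1)
    .setSeed(42)
    .setResponseMimeType("application/json")
    .build();

// Same system instruction and prompt; with this config the output parses cleanly.
GenerateContentResponse controlResponse = baseModel
    .withSystemInstruction(ContentMaker.fromString(systemPrompt))
    .withGenerationConfig(cfgNoSchema)
    .generateContent(promptText);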
Example schema shape:
Schema workExperience = Schema.newBuilder()
.setType(Type.OBJECT)
.putProperties("company", Schema.newBuilder().setType(Type.STRING).build())
.putProperties("tenure", Schema.newBuilder().setType(Type.STRING).build())
.putProperties("skills", Schema.newBuilder()
.setType(Type.ARRAY)
.setItems(Schema.newBuilder().setType(Type.STRING).build())
.build())
.build();
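For completeness, the root schema returned by ResponseSchemaFactory.getExtractionSchema() wraps the object above into nested structures. The exact factory code is omitted here; the following is an illustrative sketch whose field names mirror the observed output below, not the verbatim implementation:

// Illustrative root schema shape (not the verbatim factory code):
// an "experience" array of the workExperience objects above plus a "summary" string.
Schema extractionSchema = Schema.newBuilder()
    .setType(Type.OBJECT)
    .putProperties("experience", Schema.newBuilder()
        .setType(Type.ARRAY)
        .setItems(workExperience)
        .build())
    .putProperties("summary", Schema.newBuilder().setType(Type.STRING).build())
    .build();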
Observed output (excerpt, simulated)
{
"experience": [
{
"company": "TechCorp TechCorp TechCorp TechCorp TechCorp ...",
"tenure": "2 yrs",
"skills": ["Java", "Spring Boot"]
}
],
"summary": "\n\n {\n.\n.\n\\n\\n\\n\\n\\n\\n\\n\\n\n"
}
Occasionally, the output fails JSON parsing due to missing closing quotes or brackets:
com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input:
was expecting closing quote for a string value
at [Source: (String)"{ "experience": [ { "company": "ABC
"tenure": "3 yrs"...]; line: 1, column: 4211]
Expected behavior
When responseSchema is provided, the model should consistently honor the schema and produce syntactically valid JSON following the defined structure.
Additional context
- Removing the schema entirely fixes the problem.
- Using identical prompts and schema definitions in the Python Vertex AI SDK does not reproduce the issue.
- Switching to Gemini 2.5 Flash improves output stability, possibly due to an increased reasoning or token budget.
- This suggests the issue may lie in schema serialization or in how the Java SDK encodes the request payload (see the serialization dump sketch below).
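To help compare payloads, one diagnostic we can run on the Java side is dumping the request protos to JSON with protobuf's JsonFormat (assuming protobuf-java-util is on the classpath; cfgJson is an illustrative name). A minimal sketch:

import com.google.protobuf.util.JsonFormat;

// Dump the GenerationConfig (including the attached Schema) as proto JSON to see
// exactly which schema fields the Java client sets, for comparison with the Python SDK.
String cfgJson = JsonFormat.printer().print(cfg); // throws InvalidProtocolBufferException
System.out.println(cfgJson);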
Would appreciate guidance on whether this is:
- A known limitation or bug in the Java Vertex AI client,
- A misalignment between the Java SDK’s schema format and backend expectations,
- Or a potential model-side behavior that needs handling guidance.