Upgrade llama.cpp from b9739 to b9789 and refresh Windows patch by bernardladenthin · Pull Request #271 · bernardladenthin/java-llama.cpp

bernardladenthin · 2026-06-25T11:37:45Z

Summary

Upgrade llama.cpp pinned version from b9739 to b9789 across CMake, documentation, and CI configuration
Refresh patches/0001-win32-arg-parse-embed-guard.patch to apply cleanly against b9789, where common_params_parse now uses the count-guard form, the exact variant that breaks this project's Windows server-integration tests
Update patch documentation in CLAUDE.md and docs/history/llama-cpp-breaking-changes.md to record the rationale and the verification that no project source changes are required for the version bump

Details

llama.cpp b9739 → b9789 changes absorbed:

All breaking changes in this range are consumed inside upstream-compiled translation units (chat.cpp, server-*.cpp, common/arg.cpp, etc.). Verified via grep that the project does not directly reference any of the changed symbols:

Partial-JSON parser deletion and PEG parser refactor (json-partial.h removed)
Message-span type restructuring in common/chat.h (common_chat_msg_span, common_chat_msg_delimiter, common_chat_split_by_role() removed)
Context-checkpointing refactor (task_params::n_before_user removed, replaced by message_spans)
New llama_model_n_layer_nextn() API (not called by project)
common_params_handle_models() signature change (not called directly by project)
Multi-model router refactor in server-models.cpp (project links but does not drive)
Backend-internal work (Hexagon, Vulkan, SYCL, OpenCL, WebGPU shaders)

Windows patch refresh:

Upstream's common_params_parse argv handling evolved from the original unconditional override (llama.cpp #24779 regression in b9739) to a count-guard form in b9789:

if (static_cast<int>(utf8.buf.size()) == argc) {
    argv = utf8.ptrs.data();
}

This count-guard is exactly the variant this project identified as breaking its Windows server-integration tests: when the embedded caller's argc happens to equal the argument count of java.exe's own command line, the guard passes and the caller's argv is replaced. The patch was refreshed to drop the new form and keep (void) utf8;, so the caller's already-UTF-8 argv is always used. The patch applies cleanly and reverse-applies cleanly against b9789 (idempotency verified).
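The failure mode described above can be sketched in isolation. This is a minimal model, not the upstream code: ArgList, select_with_count_guard, select_caller_only, and the sample command lines are all hypothetical names chosen for illustration, and process_argv stands in for whatever the process-command-line recovery returns on Windows.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an argument vector.
using ArgList = std::vector<std::string>;

// Count-guard form: adopt the arguments recovered from the process command
// line whenever their count matches the caller's argument count.
ArgList select_with_count_guard(const ArgList & caller_argv, const ArgList & process_argv) {
    if (process_argv.size() == caller_argv.size()) {
        return process_argv;  // override: assumes both describe the same command line
    }
    return caller_argv;
}

// Shape of the refreshed patch as described: ignore the recovered arguments
// and always use what the caller passed.
ArgList select_caller_only(const ArgList & caller_argv, const ArgList & process_argv) {
    (void) process_argv;
    return caller_argv;
}

// Usage sketch: an embedded caller whose argument count coincides with the
// host process's command line (sample values are assumptions).
void demo_count_guard() {
    const ArgList jvm_cmdline = {"java.exe", "-jar", "app.jar"};
    const ArgList embedded    = {"llama-server", "--model", "model.gguf"};

    // Equal counts: the guard passes and the embedded arguments are lost.
    assert(select_with_count_guard(embedded, jvm_cmdline) == jvm_cmdline);
    // Caller-only selection keeps the embedded arguments.
    assert(select_caller_only(embedded, jvm_cmdline) == embedded);
}
```

In this model the guard only protects callers whose argument count differs from the host process's, which is why a count comparison is not a reliable test for "argv came from main()".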

Documentation updates:

CLAUDE.md: Updated pinned version reference and expanded patch rationale
docs/history/llama-cpp-breaking-changes.md: Added comprehensive b9739–b9789 changelog with verification notes
README.md, TODO.md, .github/workflows/publish.yml: Updated version badge and references

Test plan

Patch applies cleanly to b9789 common/arg.cpp and reverse-applies cleanly (idempotency verified)
Local build with GIT_TAG b9789 verified clean on Linux x86_64 (GCC 13.3): cmake -B build -DBUILD_TESTING=ON && cmake --build build --config Release -j$(nproc) links libjllama.so + jllama_test with zero warnings
ctest --test-dir build --output-on-failure reports 454/454 tests passing
OuteTTS build-time extraction and Windows patch both pass their fail-loud anchor checks against b9789
CI is green on this branch

Related issues / PRs

Refs upstream llama.cpp #24779 (Windows argv regression), llama.cpp #24780 (count-guard variant)

Checklist

I have read CONTRIBUTING.md and CODE_OF_CONDUCT.md
My commits follow Conventional Commits
No security-sensitive changes

https://claude.ai/code/session_01SLQk4Fk7vk7R4f2za1KxYg

Bump the pinned llama.cpp tag and refresh the Windows argv patch for the upgraded source. Every upstream breaking change in this range is absorbed inside upstream-compiled translation units; no project C++ source edits were required.

- CMakeLists.txt: GIT_TAG + LLAMA_TAG b9739 -> b9789.
- README.md / CLAUDE.md / publish.yml / TODO.md: version badge, pinned-version notes, WebUI clone example, aarch64 GCC rationale.
- patches/0001-win32-arg-parse-embed-guard.patch: refreshed for b9789. Upstream replaced the original #24779 argv override with the count-guard form (if utf8.buf.size() == argc), which is exactly the variant that breaks the Windows server-integration tests, so the patch still drops it entirely and keeps "(void) utf8;". Re-verified to apply and reverse-apply cleanly (idempotent) against b9789 common/arg.cpp.
- docs/history/llama-cpp-breaking-changes.md: new b9739-b9789 rows (json-partial.{h,cpp} removed -> peg-parser; chat.h message-span restructure; server-task n_before_user -> message_spans; new llama_model_n_layer_nextn; mtmd/clip progress_callback; server-models child-process download refactor).

Verified locally on Linux x86_64 (GCC 13.3): cmake configure passes the fail-loud OuteTTS extraction and refreshed-patch anchor checks against b9789, the full Release build links libjllama.so + jllama_test with zero warnings on any project translation unit, and ctest reports 454/454 passing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SLQk4Fk7vk7R4f2za1KxYg

…ssion

PR #271 CI surfaced two model-backed Java-test failures from the b9739 -> b9789 bump (every "Java Tests" job failed; all build/C++ jobs were already green):

1. LlamaModelTest.testJsonSchemaToGrammar -- upstream json-schema-to-grammar changed where it emits the `space` rule: a closing object is now "... )? space }" (was "... )? } space") and a root-level string rule no longer appends a trailing space. Functionally equivalent, byte-different; updated the pinned expectation to the b9789 output. Verified locally against the built b9789 libjllama (jsonSchemaToGrammar is a pure JNI call, no model needed).

2. LoadProgressCallbackTest -- server_context::load_model now unconditionally installs the server's own load-progress reporter on params_base.load_progress_callback right before common_init_from_params, clobbering libjllama's LoadProgressCallback JNI trampoline (set on common_params.load_progress_callback before load_model). The callback stopped firing (zero updates) and returning false no longer aborted the load. New patches/0002-server-preserve-caller-load-progress-callback.patch guards the install behind `if (params_base.load_progress_callback == nullptr)`, so a caller-supplied callback survives; standalone llama-server (null field) is unaffected. Same JNI-vs-standalone class as patch 0001.

Patch 0002 applies + reverse-applies cleanly against b9789 and compiles clean (ctest 454/454). The model-backed LoadProgressCallbackTest cannot run in the restricted sandbox (no HuggingFace model); CI will confirm.

Docs: CLAUDE.md patches table + docs/history breaking-changes rows updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SLQk4Fk7vk7R4f2za1KxYg
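The nullptr guard from patch 0002 can be illustrated with a small self-contained model. Everything here is a simplified stand-in rather than the llama.cpp source: the params struct, server_reporter, the two install functions, and the toy load_model loop are assumptions made for the sketch. Only the field name load_progress_callback and the guard condition come from the commit message above.

```cpp
#include <cassert>

// Simplified stand-in for a load-progress callback: return false to abort.
using progress_callback = bool (*)(float progress, void * user_data);

// Hypothetical, reduced parameter struct.
struct params {
    progress_callback load_progress_callback           = nullptr;
    void *            load_progress_callback_user_data = nullptr;
};

// Stand-in for the server's own reporter; never aborts.
bool server_reporter(float, void *) { return true; }

// Behavior described for b9789: the server's reporter always wins.
void install_unconditional(params & p) {
    p.load_progress_callback = server_reporter;
}

// Shape of the guard: keep a caller-supplied callback if one is set.
void install_if_unset(params & p) {
    if (p.load_progress_callback == nullptr) {
        p.load_progress_callback = server_reporter;
    }
}

// Toy loader: reports progress at 0, 0.25, ..., 1 and aborts on false.
bool load_model(const params & p) {
    for (int step = 0; step <= 4; ++step) {
        if (p.load_progress_callback &&
            !p.load_progress_callback(step / 4.0f, p.load_progress_callback_user_data)) {
            return false;
        }
    }
    return true;
}

// Caller-side callback: counts updates and aborts once progress reaches 0.5.
struct caller_state { int updates = 0; };
bool caller_callback(float progress, void * user_data) {
    ++static_cast<caller_state *>(user_data)->updates;
    return progress < 0.5f;
}

// Usage sketch: with the guard, the caller's callback fires and can abort.
void demo_guarded_install() {
    caller_state state;
    params p;
    p.load_progress_callback           = caller_callback;
    p.load_progress_callback_user_data = &state;
    install_if_unset(p);
    assert(!load_model(p));
    assert(state.updates == 3);
}
```

With install_unconditional in place of install_if_unset, the same setup reproduces both symptoms reported for LoadProgressCallbackTest: zero updates reach the caller, and the load no longer aborts.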

Rework patches/0001-win32-arg-parse-embed-guard.patch from "drop the Windows argv override entirely" to the opt-in design intended for upstreaming:

- common/arg.cpp: common_params_parse() now parses exactly the argv it is given (no GetCommandLineW override). A new common_params_parse_main() wrapper carries the process-command-line UTF-8 recovery (llama.cpp #24779) for the standalone tools' main().
- common/arg.h: declare common_params_parse_main().

The embedded JNI caller (jllama.cpp) already calls common_params_parse() directly, so it is respected by default and never overridden -- behaviorally identical to the previous deterministic patch for our build. The ~20 standalone main() call-site flips (common_params_parse -> _main) are left to the upstream PR, not this local patch: we don't ship those tools and a 20-file patch would be fragile across llama.cpp bumps.

Verified: applies forward + reverse (idempotent) against b9789, compiles clean (no warnings on arg.cpp/arg.h), ctest 454/454. The Windows-specific behavior validates on Windows CI as before.

Docs updated to the new patch shape: CLAUDE.md patches table, docs/history/llama-cpp-breaking-changes.md, TODO.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SLQk4Fk7vk7R4f2za1KxYg
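The opt-in split described in this commit can be sketched as a core parser plus a wrapper. This is an illustration under stated assumptions, not the patched source: parse and parse_main are hypothetical analogues of common_params_parse() and common_params_parse_main(), the recover parameter is an injected stand-in for the process-command-line recovery, and retaining a count check inside the wrapper is a guess about what the recovery does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using ArgList = std::vector<std::string>;

struct parsed { std::string model; };

// Core parser (analogue of common_params_parse): parses exactly the
// arguments it is given and never consults the process command line.
parsed parse(const ArgList & argv) {
    parsed out;
    for (std::size_t i = 1; i + 1 < argv.size(); ++i) {
        if (argv[i] == "--model") {
            out.model = argv[i + 1];
        }
    }
    return out;
}

// Opt-in wrapper (analogue of common_params_parse_main) for a standalone
// main(): recover the arguments from the process, then delegate. The count
// check is an assumption made for this sketch.
parsed parse_main(const ArgList & argv, ArgList (*recover)()) {
    const ArgList recovered = recover();
    return parse(recovered.size() == argv.size() ? recovered : argv);
}

// Hypothetical recovery result for a host JVM process.
ArgList fake_jvm_recovery() { return {"java.exe", "--model", "wrong.gguf"}; }

// Usage sketch: the embedded contract is that parse() honors its argv.
void demo_opt_in() {
    const ArgList embedded = {"jllama", "--model", "a.gguf"};
    assert(parse(embedded).model == "a.gguf");
    // Only callers that opt in are exposed to the recovery path.
    assert(parse_main(embedded, fake_jvm_recovery).model == "wrong.gguf");
}
```

The design choice is that the risky behavior moves behind a function only standalone tools call, so an embedded caller is safe by default rather than by patch.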

Expand patches/0001-win32-arg-parse-embed-guard.patch from the arg.cpp/arg.h core into the full, submittable upstream change so it can be sent to llama.cpp verbatim and then dropped here:

- common/arg.cpp + common/arg.h: common_params_parse() parses exactly the argv it is given; new common_params_parse_main() wrapper carries the Windows GetCommandLineW UTF-8 recovery (#24779) for the standalone tools.
- ~34 standalone main() call sites across tools/*, examples/* and the tests/* programs flip common_params_parse(argc, argv, ...) -> common_params_parse_main(argc, argv, ...).
- tests/test-arg-parser.cpp: regression case asserting common_params_parse honors a caller-supplied argv (the embedded/JNI contract).

Our build compiles llama.cpp as a subproject (LLAMA_BUILD_TOOLS/TESTS OFF), so only the arg.{cpp,h} core is compiled here -- the flips + test are applied but not built in normal CI, and our embedded path (jllama.cpp -> common_params_parse) is behaviorally identical to before.

Validated the flips + test via a one-off -DLLAMA_BUILD_TOOLS=ON -DLLAMA_BUILD_TESTS=ON build: the new test compiles and its asserts pass, and a flipped program (test-thread-safety) builds; test-arg-parser's only failure is its live ggml.ai download assertion (sandbox network, not the patch). Full patch applies + reverse-applies cleanly against b9789 (37 files).

Docs updated (CLAUDE.md patches table, breaking-changes row, TODO.md).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SLQk4Fk7vk7R4f2za1KxYg

sonarqubecloud · 2026-06-25T16:47:14Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

bernardladenthin merged commit 212634e into main Jun 25, 2026
35 of 37 checks passed

bernardladenthin deleted the claude/keen-babbage-yjmgwh branch June 25, 2026 17:29

bernardladenthin merged 4 commits into main from claude/keen-babbage-yjmgwh
