server: expose only the loaded model in /v1/models #419

Open
aledesogusbusiness-hue wants to merge 1 commit into antirez:main from aledesogusbusiness-hue:fix/v1-models-only-loaded

Conversation

@aledesogusbusiness-hue

Problem

GET /v1/models always returns both deepseek-v4-flash and deepseek-v4-pro regardless of which GGUF is actually loaded at startup. GET /v1/models/<id> likewise accepts either alias.

This misleads OpenAI-compatible clients that inspect /v1/models to decide which models are available — they see two choices but the server always runs the single GGUF it was started with.

Fix

  • send_models(): replace the hardcoded two-entry list with a single call to server_model_id_from_engine(s->engine).
  • /v1/models/<id> handler: compare the requested ID against server_model_id_from_engine(s->engine) directly; return 404 for the non-loaded model.
  • Remove server_model_alias_known() which is now unused.
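
The shape of the change can be sketched as below. This is a minimal illustration, not the actual diff: `struct engine`, `struct server`, the `is_pro` field, the body of `server_model_id_from_engine()`, the buffer-based signature of `send_models()`, and the helper `model_lookup_status()` are all hypothetical stand-ins, since the real definitions are not shown in this PR. Only the function names `send_models` and `server_model_id_from_engine`, the `s->engine` access, and the two model IDs come from the description above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the real engine and server types. */
struct engine { int is_pro; };
struct server { struct engine *engine; };

/* Assumed behaviour: map the loaded GGUF to its public model ID. */
const char *server_model_id_from_engine(const struct engine *e) {
    return e->is_pro ? "deepseek-v4-pro" : "deepseek-v4-flash";
}

/* List endpoint: format a one-entry OpenAI-style model list into buf.
 * Returns whatever snprintf returns (the length it wanted to write). */
int send_models(const struct server *s, char *buf, size_t cap) {
    return snprintf(buf, cap,
        "{\"object\":\"list\",\"data\":[{\"id\":\"%s\",\"object\":\"model\"}]}",
        server_model_id_from_engine(s->engine));
}

/* Per-model endpoint: 200 only when the requested ID is the loaded one,
 * 404 for anything else (hypothetical helper, for illustration only). */
int model_lookup_status(const struct server *s, const char *requested_id) {
    const char *loaded = server_model_id_from_engine(s->engine);
    return strcmp(requested_id, loaded) == 0 ? 200 : 404;
}
```

Both endpoints derive their answer from the same call, so the list and the per-model lookup cannot disagree about which model is loaded.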

Test plan

  • Start server with a Flash GGUF → GET /v1/models returns only deepseek-v4-flash; GET /v1/models/deepseek-v4-pro returns 404.
  • Start server with a Pro GGUF → opposite behaviour.

Fixes #414

🤖 Generated with Claude Code

GET /v1/models was hardcoding both deepseek-v4-flash and deepseek-v4-pro
regardless of which GGUF is actually loaded.  GET /v1/models/<id> accepted
either alias for the same reason.

Use server_model_id_from_engine() at both sites so the list and the per-model
lookup reflect the single model loaded at startup.  Remove the now-unused
server_model_alias_known() helper.

Fixes: antirez#414

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fry69

fry69 commented Jun 16, 2026

Contributor

This is the same as #287?

@derogab

derogab commented Jun 16, 2026

> This is the same as #287?

Yes, I think it is. #287 was the one I proposed when I opened the related issue #404.

#419 is a smaller fix and looks OK for the endpoint behavior. #287 also includes the README update, a server unit test for Flash/Pro metadata, and an explicit id/name mapping so that deepseek-v4-pro reports DeepSeek V4 Pro instead of the incorrect name (DeepSeek V4 Flash).

Development

Successfully merging this pull request may close these issues.

/v1/models exposes models that are not actually loaded

3 participants