Misc. bug: Tools don't work in WebUI when running in router mode #24992

Description

@SlavikCA

Name and Version

ghcr.io/ggml-org/llama.cpp:server-cuda12-b9776

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

services:
  llama-router:
    image: ghcr.io/ggml-org/llama.cpp:server-cuda12-b9776
    container_name: router
    devices:
      - "nvidia.com/gpu=all"
    ports:
      - "8080:8080"
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    volumes:
      - /home/slavik/.cache:/root/.cache
      - ./models.ini:/app/models.ini:ro
    entrypoint: ["./llama-server"]
    command: >
      --models-max 1
      --models-preset ./models.ini
      --host 0.0.0.0 --port 8080

; models.ini
version = 1

[local-vl-qwen27B]
ctx-size=262144
temp=0.6
top-p=0.95
top-k=20
min-p=0.00
mmproj=/root/.cache/llama.cpp/05353347512982ee62317b9d8c89372bc815f4b4043580e7ef3ad411ec1a1cd3
model=/root/.cache/llama.cpp/f3b4a622e06e8ade06ec5c0eb9b40ed7c9bd707b5fada46c0215f4ab4a6bc32b 
fit=off
gpu-layers=all
gpu-layers-draft=all
kv-unified=1
tools=all
spec-type=draft-mtp

Problem description & steps to reproduce

When llama-server is started in router mode (see the Docker Compose example above), the WebUI shows no built-in tools, and even if I enable run_javascript manually, it doesn't work:

Image

But if I start it NOT in router mode, with the same options passed on the command line, I see the built-in tools, including run_javascript, and they work as expected.

sudo docker run -d --rm --name qwen27B \
  --device=nvidia.com/gpu=all \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  -v /home/slavik/.cache:/root/.cache \
  -p 8080:8080 \
  ghcr.io/ggml-org/llama.cpp:server-cuda12-b9776 \
  --host 0.0.0.0 --port 8080 \
  --mmproj /root/.cache/llama.cpp/05353347512982ee62317b9d8c89372bc815f4b4043580e7ef3ad411ec1a1cd3 \
  --model /root/.cache/llama.cpp/f3b4a622e06e8ade06ec5c0eb9b40ed7c9bd707b5fada46c0215f4ab4a6bc32b \
  --alias local-vl-qwen27B \
  --fit off --gpu-layers all --gpu-layers-draft all \
  --ctx-size 262144 --kv-unified \
  --top-p 0.95 --top-k 20 --temp 0.6 --min-p 0.00 \
  --tools all --spec-type draft-mtp
Image
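
To make it easier to check that the two setups really pass the same options, here is a rough Python sketch that turns a preset section into the matching command-line flags. It is only an illustration, not how llama-server parses presets: the helper name `preset_to_flags` is hypothetical, the preset text is shortened, and treating a value of `1` as a bare switch is an assumption based on `kv-unified=1` in models.ini corresponding to `--kv-unified` in the docker run command.

```python
import configparser

# Shortened copy of the preset from this issue (hashes omitted for brevity).
PRESET = """\
; models.ini
version = 1

[local-vl-qwen27B]
ctx-size=262144
temp=0.6
kv-unified=1
tools=all
spec-type=draft-mtp
"""


def preset_to_flags(text, section):
    """Translate one preset section into a list of CLI-style arguments."""
    parser = configparser.ConfigParser()
    # "version = 1" sits above any section header, which configparser rejects,
    # so wrap the top-level keys in a dummy section first.
    parser.read_string("[__top__]\n" + text)
    flags = []
    for key, value in parser[section].items():
        flags.append(f"--{key}")
        # Assumption: a value of "1" marks a bare switch (kv-unified=1 -> --kv-unified).
        if value != "1":
            flags.append(value)
    return flags


if __name__ == "__main__":
    print(" ".join(preset_to_flags(PRESET, "local-vl-qwen27B")))
```

Comparing this output with the docker run command above shows `--tools all` present in both cases, so the difference in WebUI behavior does not seem to come from a missing option in the preset.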
