Description
NemoClaw onboarding fails when "Local Ollama" is selected.
[Environment]
Device: Brev via command line
Node.js: v22.22.2
npm: 10.9.7
Docker: 29.1.3
NemoClaw: v0.0.24
[Steps to Reproduce]
- Install NemoClaw: curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
- Select "7) Local Ollama (localhost:11434) — running (suggested)" -> "2) nemotron-3-nano:30b"
[Expected Result]
NemoClaw should onboard the local Ollama correctly.
[Actual Result]
Ollama starter models:
1) qwen2.5:7b
2) nemotron-3-nano:30b
3) Other...
No local Ollama models are installed yet. Choose one to pull and load now.
Choose model [1]: 2
Pulling Ollama model: nemotron-3-nano:30b
[GIN] 2026/04/24 - 06:12:03 | 200 | 44.604µs | 127.0.0.1 | HEAD "/"
pulling manifest
time=2026-04-24T06:12:04.520Z level=INFO source=download.go:179 msg="downloading a70437c41b3b in 25 1 GB part(s)"
time=2026-04-24T06:12:45.721Z level=INFO source=download.go:179 msg="downloading bca58c750377 in
pulling a70437c41b3b: 100% ▕█████████████████████████████████▏ 24 GB
pulling bca58c750377: 100% ▕█████████████████████████████████▏ 10 KB
pulling 12e88b2a8727: 100% ▕█████████████████████████████████▏ 28 B
pulling 12bee8c08a36: 100% ▕█████████████████████████████████▏ 488 B
verifying sha256 digest
writing manifest
success
Loading Ollama model: nemotron-3-nano:30b
time=2026-04-24T06:13:06.904Z level=INFO source=server.go:444 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 34269"
time=2026-04-24T06:13:07.118Z level=INFO source=server.go:259 msg="enabling flash attention"
time=2026-04-24T06:13:07.119Z level=INFO source=server.go:444 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --model /home/shadeform/.ollama/models/blobs/sha256-a70437c41b3b0b768c48737e15f8160c90f13dc963f5226aabb3a160f708d1ce --port 37637"
time=2026-04-24T06:13:07.119Z level=INFO source=sched.go:484 msg="system memory" total="98.3 GiB" free="94.9 GiB" free_swap="4.0 GiB"
time=2026-04-24T06:13:07.119Z level=INFO source=sched.go:491 msg="gpu memory" id=GPU-c6ecd61c-e96f-8366-5a00-ba7ad810947d library=CUDA available="78.8 GiB" free="79.3 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-04-24T06:13:07.119Z level=INFO source=server.go:771 msg="loading model" "model layers"=53 requested=-1
time=2026-04-24T06:13:07.130Z level=INFO source=runner.go:1417 msg="starting ollama engine"
time=2026-04-24T06:13:07.130Z level=INFO source=runner.go:1452 msg="Server listening on 127.0.0.1:37637"
time=2026-04-24T06:13:07.141Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:262144 KvCacheType: NumThreads:14 GPULayers:53[ID:GPU-c6ecd61c-e96f-8366-5a00-ba7ad810947d Layers:53(0..52)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-24T06:13:07.166Z level=INFO source=ggml.go:136 msg="" architecture=nemotron_h_moe file_type=Q4_K_M name="NVIDIA Nemotron 3 Nano 30B A3B BF16" description="" num_tensors=401 num_key_values=118
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-icelake.so
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA A100-SXM4-80GB, compute capability 8.0, VMM: yes, ID: GPU-c6ecd61c-e96f-8366-5a00-ba7ad810947d
load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v12/libggml-cuda.so
time=2026-04-24T06:13:07.277Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,520,600,610,700,750,800,860,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
time=2026-04-24T06:13:07.976Z level=INFO source=runner.go:1290 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:262144 KvCacheType: NumThreads:14 GPULayers:53[ID:GPU-c6ecd61c-e96f-8366-5a00-ba7ad810947d Layers:53(0..52)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-24T06:13:08.795Z level=INFO source=runner.go:1290 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:262144 KvCacheType: NumThreads:14 GPULayers:53[ID:GPU-c6ecd61c-e96f-8366-5a00-ba7ad810947d Layers:53(0..52)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-24T06:13:08.795Z level=INFO source=ggml.go:482 msg="offloading 52 repeating layers to GPU"
time=2026-04-24T06:13:08.795Z level=INFO source=ggml.go:489 msg="offloading output layer to GPU"
time=2026-04-24T06:13:08.795Z level=INFO source=ggml.go:494 msg="offloaded 53/53 layers to GPU"
time=2026-04-24T06:13:08.795Z level=INFO source=device.go:240 msg="model weights" device=CUDA0 size="22.4 GiB"
time=2026-04-24T06:13:08.795Z level=INFO source=device.go:245 msg="model weights" device=CPU size="231.0 MiB"
time=2026-04-24T06:13:08.795Z level=INFO source=device.go:251 msg="kv cache" device=CUDA0 size="2.7 GiB"
time=2026-04-24T06:13:08.795Z level=INFO source=device.go:262 msg="compute graph" device=CUDA0 size="842.5 MiB"
time=2026-04-24T06:13:08.795Z level=INFO source=device.go:267 msg="compute graph" device=CPU size="5.2 MiB"
time=2026-04-24T06:13:08.795Z level=INFO source=device.go:272 msg="total memory" size="26.1 GiB"
time=2026-04-24T06:13:08.795Z level=INFO source=sched.go:561 msg="loaded runners" count=1
time=2026-04-24T06:13:08.795Z level=INFO source=server.go:1364 msg="waiting for llama runner to start responding"
time=2026-04-24T06:13:08.795Z level=INFO source=server.go:1398 msg="waiting for server to become available" status="llm server loading model"
time=2026-04-24T06:13:12.558Z level=INFO source=server.go:1402 msg="llama runner started in 5.44 seconds"
[GIN] 2026/04/24 - 06:13:12 | 200 | 6.149578314s | 127.0.0.1 | POST "/api/generate"
[GIN] 2026/04/24 - 06:13:13 | 200 | 6.475856842s | 127.0.0.1 | POST "/api/generate"
[GIN] 2026/04/24 - 06:13:13 | 200 | 559.963362ms | 127.0.0.1 | POST "/v1/responses"
Responses API available — OpenClaw will use openai-responses.
ℹ Using chat completions API (Ollama tool calls require /v1/chat/completions)
[4/8] Setting up inference provider
──────────────────────────────────────────────────
✓ Active gateway set to 'nemoclaw'
[GIN] 2026/04/24 - 06:13:13 | 200 | 227.473µs | 127.0.0.1 | GET "/api/tags"
Local Ollama is responding on 127.0.0.1, but containers cannot reach the auth proxy at http://host.openshell.internal:11435. Ensure the Ollama auth proxy is running.
shadeform@brev-rsebf7yzk:~$ ps -elf | grep ollama
0 S shadefo+ 3938848 3851849 15 80 0 - 815477 futex_ 06:06 pts/1 00:01:33 ollama serve
0 S shadefo+ 3955115 1 0 80 0 - 253150 ep_pol 06:11 ? 00:00:00 /home/shadeform/.nvm/versions/node/v22.22.2/bin/node /home/shadeform/.nemoclaw/source/scripts/ollama-auth-proxy.js
0 S shadefo+ 3957711 3938848 7 80 0 - 59883714 futex_ 06:13 pts/1 00:00:12 /usr/local/bin/ollama runner --ollama-engine --model /home/shadeform/.ollama/models/blobs/sha256-a70437c41b3b0b768c48737e15f8160c90f13dc963f5226aabb3a160f708d1ce --port 37637
0 S shadefo+ 3964232 3851849 0 80 0 - 1653 pipe_r 06:16 pts/1 00:00:00 grep --color=auto ollama
shadeform@brev-rsebf7yzk:~$ ss -tlnp | grep 11435
LISTEN 0 511 0.0.0.0:11435 0.0.0.0:* users:(("node",pid=3955115,fd=21)
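The `ss` output above shows the auth proxy is listening on 0.0.0.0:11435 on the host, so the error message points at container-to-host connectivity: containers apparently cannot resolve or route to `host.openshell.internal`. A minimal sketch of the kind of TCP reachability probe involved (the helper name and the final example call are illustrative, not NemoClaw's actual check):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False

# On the host this succeeds (proxy bound to 0.0.0.0:11435); the onboarding
# failure suggests the equivalent probe from inside a container does not:
#   can_reach("host.openshell.internal", 11435)
```

Running such a probe both on the host (against 127.0.0.1:11435) and inside a container (against host.openshell.internal:11435) would isolate whether the problem is the proxy itself or the container's name resolution for that host alias.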
Bug Details
| Field       | Value |
| Priority    | Unprioritized |
| Action      | Dev - Open - To fix |
| Disposition | Open issue |
| Module      | Machine Learning - NemoClaw |
| Keyword     | NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Onboard, NemoClaw-SWQA-RelBlckr-Recommended |
[NVB#6110214]