
Linux: gate cblas dgemv on USE_OPENBLAS, autodetect via pkg-config#4

Open
oaustegard wants to merge 1 commit into Percepta-Core:main from oaustegard:add-linux-openblas

Conversation

@oaustegard

Summary

The matvec dispatch in transformer.cpp guards cblas_dgemv on __APPLE__, so on Linux the dense projection path falls through to a hand-rolled scalar nested loop even when libopenblas-dev is installed. runner.py's Linux compile invocation also doesn't link any BLAS, so there's no way to opt in short of editing both files.

This PR adds an explicit USE_OPENBLAS macro to the matvec dispatch and extends the Linux branch of _build_cpp_engine to detect openblas via pkg-config, defining USE_OPENBLAS and appending the cflags/libs when found. When either pkg-config or libopenblas-dev is unavailable, the build falls back silently to the scalar loop, so existing Linux builds without libopenblas behave identically.
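The detection step can be sketched as a standalone helper. `detect_openblas` and its return shape are hypothetical illustrations of the approach described above (the real change lives inside `_build_cpp_engine`), but the pkg-config invocations are the ones named in the test plan:

```python
import shutil
import subprocess

def detect_openblas():
    """Return (extra_cflags, extra_ldflags) for OpenBLAS, or ([], [])
    when pkg-config or the openblas .pc file is unavailable -- the
    silent scalar-fallback case."""
    if shutil.which("pkg-config") is None:
        return [], []  # no pkg-config at all: scalar fallback
    try:
        cflags = subprocess.run(
            ["pkg-config", "--cflags", "openblas"],
            capture_output=True, text=True, check=True,
        ).stdout.split()
        libs = subprocess.run(
            ["pkg-config", "--libs", "openblas"],
            capture_output=True, text=True, check=True,
        ).stdout.split()
    except subprocess.CalledProcessError:
        return [], []  # pkg-config returned non-zero: scalar fallback
    # Found: gate the BLAS path on and pass the flags through to the compiler.
    return ["-DUSE_OPENBLAS"] + cflags, libs
```

On a machine without libopenblas-dev this returns two empty lists, so the compile command is unchanged from before the PR.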

Why

I forked the repo to build a comparison artifact for an unrelated experiment, and noticed the dense BLAS path was never exercised on my Linux sandbox even with libopenblas-dev installed. Without it, "dense BLAS vs sparse" comparisons on Linux are misleading — the "BLAS" path is actually scalar.

Scope

Two files, 13 insertions, 1 deletion. macOS Accelerate path unchanged.

 transformer_vm/model/transformer.cpp |  4 +++-
 transformer_vm/runner.py             | 10 ++++++++++

Test plan

  • Linux without libopenblas-dev: pkg-config returns non-zero, no -DUSE_OPENBLAS added, scalar fallback used (verified in sandbox).
  • Linux with libopenblas-dev: pkg-config --cflags --libs openblas returns flags, -DUSE_OPENBLAS defined, cblas_dgemv linked.
  • macOS path is byte-identical (the __APPLE__ branch in the preprocessor and the Darwin branch in runner.py are untouched).
  • Token output is bit-identical between the scalar and openblas paths on hello, addition, collatz, fibonacci, min_cost_matching (no FP-rounding divergence at these scales).
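The bit-identity in the last bullet is an empirical observation at these scales, not a guarantee: a BLAS dgemv may accumulate partial sums in a different order than the scalar loop (blocking, SIMD), and double addition is not associative. A minimal Python illustration of order-dependent rounding, unrelated to the repo's code:

```python
# Summing the same four doubles in two different orders.
vals = [1e16, 1.0, -1e16, 1.0]

left_to_right = 0.0
for v in vals:
    left_to_right += v      # 1e16 + 1.0 rounds back to 1e16, losing the 1.0

reordered = (1e16 + -1e16) + (1.0 + 1.0)  # cancellation first, then the small terms

print(left_to_right, reordered)  # 1.0 2.0
```

At the scales of the listed programs the accumulations evidently stay within exactly-representable territory, which is why the outputs match bit-for-bit.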

Notes

This is intentionally minimal and orthogonal to the larger perf work in #1 and #2. Happy to fold into either if you'd prefer; filed standalone since it's a small correctness fix that's useful regardless of how those land.

@ryvn-technologies

Ryvn Preview

Creating preview prerelease-Percepta-Core-transformer-vm for this pull request.


This comment will be automatically updated with preview details.

