Add split buffer facility to support QNX PMEM and other buffering systems #72

Merged: dallison merged 39 commits into main from qnx-pmem-buffers on May 13, 2026

dallison commented May 8, 2026

Summary

  • Add generic split-buffer channels that keep message prefixes separate from payload buffers, with client-side test coverage available via --use_split_buffers.
  • Allow buffer memory to be supplied through configurable allocation callbacks, so split buffers can be backed by alternate memory managers instead of the default mapping path.
  • Preserve and propagate generic client-buffer metadata through server and shadow state so split-buffer channels can recover and clean up reliably.
  • Expand the C client API and tests for split buffers, checksum callbacks, message fields, channel introspection, callbacks, diagnostics, and invalid argument paths.
  • Add CI coverage for the C client in normal, split-buffer, ASAN, and Linux-only Valgrind modes.
  • Update Bazel, CMake, CI, Rust/C++ client behavior, and RPC request publishing paths for the split-buffer model.
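
To make the first two summary points concrete, here is a minimal sketch of a slot whose prefix is stored apart from its payload, with the payload memory supplied through allocation callbacks. Every name in it (`buffer_allocator`, `message_prefix`, `split_slot`, and the functions) is hypothetical and chosen for illustration; it is not the API added by this PR, and a real backend would replace `calloc`/`free` with its own memory manager.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; this is not the project's real API. */

/* Callbacks that supply and release payload memory. */
typedef struct {
  void *(*alloc)(size_t size, void *user);
  void (*release)(void *ptr, size_t size, void *user);
  void *user; /* opaque state handed back to both callbacks */
} buffer_allocator;

/* Small fixed-size header kept separate from the payload. */
typedef struct {
  uint64_t ordinal;
  uint64_t payload_size;
} message_prefix;

/* One slot: the prefix lives inline, the payload comes from the allocator. */
typedef struct {
  message_prefix prefix;
  void *payload;
  size_t capacity;
} split_slot;

/* Example allocator: heap-backed, counting live buffers through `user`. */
void *counting_alloc(size_t size, void *user) {
  size_t *live = user;
  void *p = calloc(1, size);
  if (p != NULL && live != NULL) ++*live;
  return p;
}

void counting_release(void *ptr, size_t size, void *user) {
  size_t *live = user;
  (void)size; /* a real backend might need the size to unmap */
  if (ptr != NULL && live != NULL) --*live;
  free(ptr);
}

/* Obtain the payload buffer from the supplied allocator. Returns 0 on success. */
int split_slot_init(split_slot *slot, const buffer_allocator *a, size_t capacity) {
  memset(&slot->prefix, 0, sizeof slot->prefix);
  slot->capacity = capacity;
  slot->payload = a->alloc(capacity, a->user);
  return slot->payload != NULL ? 0 : -1;
}

/* Copy the payload first, then fill in the prefix that describes it. */
int split_slot_publish(split_slot *slot, uint64_t ordinal, const void *data, size_t len) {
  if (len > slot->capacity) return -1;
  memcpy(slot->payload, data, len);
  slot->prefix.ordinal = ordinal;
  slot->prefix.payload_size = len;
  return 0;
}

/* Return the payload to the allocator that provided it. */
void split_slot_destroy(split_slot *slot, const buffer_allocator *a) {
  a->release(slot->payload, slot->capacity, a->user);
  slot->payload = NULL;
  slot->capacity = 0;
}
```

Because the slot only ever reaches payload memory through the two callbacks, swapping the heap for another memory manager would not change the publish path in this sketch.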

Test plan

  • bazelisk test //...
  • bazelisk test //client:client_test //client:latency_test //client:stress_test --test_arg=--use_split_buffers
  • bazelisk test --config=gcc //...
  • bazelisk test --config=clang //...
  • bazelisk test --config=asan //client:client_test --test_arg=--use_split_buffers
  • bazelisk test //client:client_test --run_under='valgrind --error-exitcode=1 --errors-for-leak-kinds=definite' --test_timeout=600
  • bazelisk test //c_client:client_test
  • bazelisk test //c_client:client_test --test_arg=--use_split_buffers
  • bazelisk test --config=asan //c_client:client_test
  • valgrind --tool=memcheck --leak-check=full --show-leak-kinds=definite,indirect --errors-for-leak-kinds=definite,indirect --error-exitcode=99 bazel-bin/c_client/client_test
  • cargo test in the Rust crates touched by this PR
  • cmake -S . -B build/cmake -DCMAKE_BUILD_TYPE=Debug
  • cmake --build build/cmake --parallel $(nproc)
  • ctest --test-dir build/cmake --output-on-failure

Notes

  • CI now includes the normal matrix, explicit split-buffer client tests, C client diagnostics, Linux Valgrind, and CMake jobs.

dallison added 12 commits May 8, 2026 12:57
Gate QNX persistent-memory buffers behind build-time flags and register metadata with the server so channel teardown can clean up pmem objects.
Track QNX PMEM buffer metadata through client registration, server cleanup, and shadow replication so servers can recover and release client-allocated PMEM safely.
Bring in the merged CMake build fixes and resolve the plugin/test wiring against the QNX PMEM branch.
Expose custom PMEM allocation and mapping hooks so embedders can manage QNX PMEM buffers while preserving metadata needed for subscribers.
Simplify split-buffer setup so allocator-specific behavior is supplied by callbacks instead of protocol and option fields. Also record Bazelisk as the repository test runner for future agent work.
Allow test binaries to opt into split-buffer publishers by default and fix split-buffer multi-publisher attachment and prefix handling so client_test can run under that mode.
Keep the default test dimensions unchanged while reducing split-buffer-only sweeps that exceed common per-process mapping limits.
dallison changed the title from "Add QNX PMEM buffer support" to "Add split buffer facility to support QNX PMEM and other buffering systems" on May 11, 2026
dallison added 14 commits May 11, 2026 11:08
Add explicit CI coverage for client_test, latency_test, and stress_test with split-buffer publishers enabled.
Avoid macOS file descriptor exhaustion when the split-buffer client test maps many publisher slot metadata files in one process.
Keep Linux split-buffer coverage unchanged while lowering macOS-only latency and stress shapes that exceed the runner file descriptor limit.
Reduce split-buffer latency message counts and further lower macOS mux dimensions so the new CI coverage fits runner limits.
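
The commits above handle file descriptor exhaustion by lowering the test dimensions. For context only, the sketch below shows how a process could query its soft descriptor limit with POSIX `getrlimit` and cap a requested mapping count accordingly. This is an alternative illustration, not what the PR does; `fd_budget` and `clamp_mapping_count` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: descriptors left after setting aside `reserved`
 * for stdio, sockets, and so on. Returns 0 if the limit cannot be read. */
size_t fd_budget(size_t reserved) {
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return 0;
  if (rl.rlim_cur == RLIM_INFINITY) return (size_t)-1; /* effectively unbounded */
  if (rl.rlim_cur <= reserved) return 0;
  return (size_t)(rl.rlim_cur - reserved);
}

/* Hypothetical helper: cap the number of mappings so that
 * `fds_per_mapping` descriptors each still fit in the budget. */
size_t clamp_mapping_count(size_t wanted, size_t fds_per_mapping, size_t reserved) {
  size_t max;
  if (fds_per_mapping == 0) return wanted; /* mappings that hold no descriptors */
  max = fd_budget(reserved) / fds_per_mapping;
  return wanted < max ? wanted : max;
}
```

A test harness could call `clamp_mapping_count` before opening slot metadata files, so the same test shape degrades gracefully on runners with a low soft limit.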

github-actions Bot commented May 12, 2026

Optimized Latency Report

Parsed 2116 latency records from optimized latency runs.

| Test | Metric | OS | Revision | Series | Latest Value (ns) |
|---|---|---|---|---|---|
| PubSubLatency | average | linux | baseline | publish_and_read | 2,691 |
| PubSubLatency | average | linux | pr | publish_and_read | 2,709 |
| PubSubLatency | average | macos | baseline | publish_and_read | 2,549 |
| PubSubLatency | average | macos | pr | publish_and_read | 2,694 |
| PublisherLatencyHistogram | average | linux | baseline | no_retirement | 222,001 |
| PublisherLatencyHistogram | average | linux | baseline | with_retirement | 2,363 |
| PublisherLatencyHistogram | average | linux | pr | no_retirement | 219,597 |
| PublisherLatencyHistogram | average | linux | pr | with_retirement | 2,349 |
| PublisherLatencyHistogram | average | macos | baseline | no_retirement | 554,572 |
| PublisherLatencyHistogram | average | macos | baseline | with_retirement | 1,203 |
| PublisherLatencyHistogram | average | macos | pr | no_retirement | 510,693 |
| PublisherLatencyHistogram | average | macos | pr | with_retirement | 1,137 |
| PublisherLatencyHistogram | max | linux | baseline | no_retirement | 398,999 |
| PublisherLatencyHistogram | max | linux | baseline | with_retirement | 73,027 |
| PublisherLatencyHistogram | max | linux | pr | no_retirement | 399,859 |
| PublisherLatencyHistogram | max | linux | pr | with_retirement | 72,366 |
| PublisherLatencyHistogram | max | macos | baseline | no_retirement | 40,328,333 |
| PublisherLatencyHistogram | max | macos | baseline | with_retirement | 177,917 |
| PublisherLatencyHistogram | max | macos | pr | no_retirement | 18,169,833 |
| PublisherLatencyHistogram | max | macos | pr | with_retirement | 429,583 |
| PublisherLatencyHistogram | median | linux | baseline | no_retirement | 220,935 |
| PublisherLatencyHistogram | median | linux | baseline | with_retirement | 2,114 |
| PublisherLatencyHistogram | median | linux | pr | no_retirement | 213,901 |
| PublisherLatencyHistogram | median | linux | pr | with_retirement | 2,084 |
| PublisherLatencyHistogram | median | macos | baseline | no_retirement | 443,958 |
| PublisherLatencyHistogram | median | macos | baseline | with_retirement | 875 |
| PublisherLatencyHistogram | median | macos | pr | no_retirement | 422,250 |
| PublisherLatencyHistogram | median | macos | pr | with_retirement | 875 |
| PublisherLatencyHistogram | min | linux | baseline | no_retirement | 208,381 |
| PublisherLatencyHistogram | min | linux | baseline | with_retirement | 1,492 |
| PublisherLatencyHistogram | min | linux | pr | no_retirement | 208,401 |
| PublisherLatencyHistogram | min | linux | pr | with_retirement | 1,533 |
| PublisherLatencyHistogram | min | macos | baseline | no_retirement | 220,666 |
| PublisherLatencyHistogram | min | macos | baseline | with_retirement | 625 |
| PublisherLatencyHistogram | min | macos | pr | no_retirement | 211,792 |
| PublisherLatencyHistogram | min | macos | pr | with_retirement | 666 |
| PublisherLatencyHistogram | p99 | linux | baseline | no_retirement | 275,757 |
| PublisherLatencyHistogram | p99 | linux | baseline | with_retirement | 6,052 |
| PublisherLatencyHistogram | p99 | linux | pr | no_retirement | 260,970 |
| PublisherLatencyHistogram | p99 | linux | pr | with_retirement | 6,031 |
| PublisherLatencyHistogram | p99 | macos | baseline | no_retirement | 2,037,541 |
| PublisherLatencyHistogram | p99 | macos | baseline | with_retirement | 7,625 |
| PublisherLatencyHistogram | p99 | macos | pr | no_retirement | 1,645,417 |
| PublisherLatencyHistogram | p99 | macos | pr | with_retirement | 4,291 |
| SubscriberLatency | average | linux | baseline | read_messages | 1,424 |
| SubscriberLatency | average | linux | pr | read_messages | 1,419 |
| SubscriberLatency | average | macos | baseline | read_messages | 1,907 |
| SubscriberLatency | average | macos | pr | read_messages | 1,619 |

Charts

SVG charts and raw JSONL are attached to the workflow run. The chart bundle is available as the latency-report artifact.

The old major-regression comparison table was removed because CI-hosted baseline comparisons were too noisy and no longer reflect the current CI policy. Current CI keeps the optimized latency smoke run, but does not publish or gate on the major-regression table.

dallison added 11 commits May 11, 2026 18:33
Keep the latency report tooling available for manual regression runs while avoiding noisy cloud benchmark comparisons in GitHub Actions.
Run the optimized latency test in CI without baseline comparison or PR regression reporting, keeping the check useful without noisy cloud deltas.
Add C API accessors for publisher/subscriber slot addresses and subscriber timeout waits so external clients can inspect mapped buffers directly.
Adds missing C client API wrappers and focused tests, including split-buffer, checksum callback, ASAN, and Valgrind CI coverage.
Mirrors generic split-buffer metadata through the shadow process and removes the obsolete PMEM-specific build surface.
Avoids a concurrent publisher attach race where mmap can see an object before the creator has finished sizing it.
Allows the lossy callback test to tolerate extra legitimate drops observed under macOS scheduling.
Avoids replacing shadow channel FDs during option updates and gives split-buffer tests unique shared-memory names under parallel runs.
Use mkstemp-backed socket paths for Rust server fixtures so repeated Bazel runs do not collide across parallel sandboxes.
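
One of the commits above mentions a race where an attaching publisher can `mmap` an object before its creator has finished sizing it. The commit message does not say how the PR closes that window. As a sketch of one common approach, the attacher can poll `fstat` until the object has reached the expected length and only then map it. The helper names `wait_for_size` and `map_when_sized`, the 1 ms pause, and the retry count are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll until `fd` is at least `want` bytes long.
 * Returns 0 once the size is reached, -1 on error or after `attempts` tries. */
int wait_for_size(int fd, off_t want, int attempts) {
  struct timespec pause = {0, 1000000}; /* 1 ms between polls */
  int i;
  for (i = 0; i < attempts; ++i) {
    struct stat st;
    if (fstat(fd, &st) != 0) return -1;
    if (st.st_size >= want) return 0;
    nanosleep(&pause, NULL);
  }
  return -1;
}

/* Hypothetical helper: map `size` bytes of `fd` only after the creator
 * has grown the object to that size. Returns NULL on failure. */
void *map_when_sized(int fd, size_t size, int attempts) {
  void *p;
  if (wait_for_size(fd, (off_t)size, attempts) != 0) return NULL;
  p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  return p == MAP_FAILED ? NULL : p;
}
```

Mapping a zero-length object can succeed and then fault on first access, so the size check before `mmap` is what makes the attach safe in this sketch. Alternatives include having the creator size the object under a temporary name and rename it into place.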
dallison merged commit a5def2e into main on May 13, 2026
30 checks passed