Skip to content

native module refinement#1875

Draft
jeff-hykin wants to merge 13 commits intodevfrom
jeff/fix/native_rebuild3
Draft

native module refinement#1875
jeff-hykin wants to merge 13 commits intodevfrom
jeff/fix/native_rebuild3

Conversation

@jeff-hykin
Copy link
Copy Markdown
Member

@jeff-hykin jeff-hykin commented Apr 17, 2026

Native module refinement (rebuild logic and logging) and keyboard interrupt improvement

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 17, 2026

Greptile Summary

This PR refines the native-module lifecycle with a stop-lock to prevent re-entrant shutdown races, orphan-process prevention via start_new_session=True + PR_SET_PDEATHSIG (Linux), a cli_name_override mapping, an auto_build / BUILD_NATIVE flag for on-demand rebuilds, and improved structured logging throughout. The worker process now ignores SIGINT (SIG_IGN) instead of catching KeyboardInterrupt, cleanly deferring Ctrl-C handling to the coordinator.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style/reliability suggestions that do not block correct operation.

Previously flagged P1 issues (TimeoutExpired after SIGKILL, re-entrant stop) are resolved. New findings are P2: a reduced shutdown timeout that may be too short for heavy-teardown processes, missing stderr in the build-failure exception, and a minor find_library edge case. None of these are blocking correctness bugs.

dimos/core/native_module.py — shutdown_timeout reduction and build-error message regression deserve a second look before shipping to users with slow-teardown native processes.

Important Files Changed

Filename Overview
dimos/core/native_module.py Major refactor: adds stop-lock for re-entrancy safety, start_new_session+PR_SET_PDEATHSIG for orphan prevention, cli_name_override, auto_build flag, and improved logging; shutdown_timeout silently reduced from 10 s → 2 s (DEFAULT_THREAD_JOIN_TIMEOUT).
dimos/core/coordination/python_worker.py Worker now ignores SIGINT via SIG_IGN and drops KeyboardInterrupt handlers, delegating interrupt handling to the parent; cleaner shutdown path.
dimos/core/global_config.py Adds build_native: bool config field (default False) sourced from DEFAULT_BUILD_NATIVE, allowing opt-in forced rebuilds via env var.
dimos/constants.py Adds DEFAULT_BUILD_NATIVE = False constant.
dimos/core/test_native_module.py Minor test cleanup: removes the explicit LogFormat.TEXT default (it is still the default) and drops the LogFormat import.

Sequence Diagram

sequenceDiagram
    participant Parent as NativeModule (parent)
    participant Proc as Native Process (child)
    participant WD as Watchdog Thread

    Parent->>Proc: Popen(start_new_session=True, preexec_fn=PR_SET_PDEATHSIG)
    Parent->>WD: Thread(target=_watch_process).start()
    WD->>Proc: proc.wait()

    alt Normal stop()
        Parent->>Parent: _stop_lock: _stopping=True
        Parent->>Proc: send_signal(SIGTERM)
        Proc-->>Parent: exits within shutdown_timeout
        Parent->>WD: watchdog.join(timeout)
        WD-->>Parent: joined
        Parent->>Parent: _stop_lock: _process=None, _watchdog=None
        Parent->>Parent: super().stop()
    else Process crashes
        Proc-->>WD: proc.wait() returns (unexpected exit)
        WD->>Parent: self.stop() [re-entrant, lock prevents double-stop]
    else Parent dies unexpectedly
        Proc->>Proc: SIGTERM via PR_SET_PDEATHSIG
    end
Loading

Reviews (3): Last reviewed commit: "-" | Re-trigger Greptile

Comment thread dimos/core/native_module.py Outdated
Comment thread dimos/core/native_module.py
Comment thread dimos/core/native_module.py
Comment thread dimos/core/native_module.py Outdated
Comment thread dimos/core/native_module.py Outdated
Comment thread dimos/core/native_module.py
Comment thread dimos/core/native_module.py
Comment thread dimos/core/native_module.py Outdated
Copy link
Copy Markdown
Contributor

@leshy leshy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some changes

Comment thread dimos/core/native_module.py Outdated
@jeff-hykin jeff-hykin mentioned this pull request Apr 17, 2026
6 tasks
jeff-hykin added a commit that referenced this pull request Apr 22, 2026
Document all unaddressed review comments across these PRs:
- PR #1875: 6 comments from leshy/paul-nechifor on native_module.py
- PR #1791: 8 comments from leshy on nav stack architecture/naming
- PR #1663: 3 comments from paul-nechifor on test_mcp_server.py
- PR #1485: 4 greptile bot suggestions (3 fixed, 1 minor typing nit)
jeff-hykin added a commit that referenced this pull request Apr 22, 2026
Keep both: dev's _WorkerState pattern and our branch's SIGINT ignore
for clean coordinator-orchestrated shutdown.
Also add pr_responses.yaml for PR #1875 review comments.
jeff-hykin and others added 4 commits April 22, 2026 17:37
Replace the ad-hoc PROD env var check with a proper GlobalConfig field,
readable via global_config.prod. pydantic-settings auto-reads from the
PROD env var.
Co-authored-by: Paul Nechifor <paul@nechifor.net>
- Restore LogFormat enum and JSON log parsing in _read_log_stream
- Add auto_build config to gate expensive nix rebuilds per-module
- Replace PROD env var with global_config.prod
- Handle TimeoutExpired after SIGKILL in stop()
- Use find_library("c") instead of hardcoded libc.so.6
- Simplify _child_preexec_linux docstring
- Document shell=True trusted-source requirement
@jeff-hykin jeff-hykin force-pushed the jeff/fix/native_rebuild3 branch from d7404f9 to 6b07823 Compare April 23, 2026 18:35
@jeff-hykin jeff-hykin enabled auto-merge (squash) April 24, 2026 06:33
@jeff-hykin jeff-hykin requested a review from leshy April 24, 2026 06:33
Comment thread dimos/core/global_config.py Outdated
@jeff-hykin jeff-hykin marked this pull request as draft April 28, 2026 13:48
auto-merge was automatically disabled April 28, 2026 13:48

Pull request was converted to draft

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants