Skip to content

net: adopt a shared_ptr<CNode> connection-lifetime model#3076

Open
PrestackI wants to merge 2 commits into
gridcoin-community:developmentfrom
PrestackI:net/shared-ptr-cnode
Open

net: adopt a shared_ptr<CNode> connection-lifetime model#3076
PrestackI wants to merge 2 commits into
gridcoin-community:developmentfrom
PrestackI:net/shared-ptr-cnode

Conversation

@PrestackI

Copy link
Copy Markdown
Contributor

Closes #3066.

Background

The #2558 stack moved the connection list into CConnman but deliberately kept the
inherited manual lifetime model: CNode lifetime was managed by AddRef()/Release()
on an atomic nRefCount, and the ThreadSocketHandler2 reaper moved disconnected
nodes into a local std::list<CNode*> and delete pnode once the refcount and the
per-node locks cleared. Now that CConnman owns the storage, this adopts ownership via
std::shared_ptr<CNode> and retires the by-convention refcount foot-gun.

Change

  • m_nodes becomes std::vector<std::shared_ptr<CNode>> — it holds the sole persistent
    owning reference.
  • ConnectNode returns std::shared_ptr<CNode> so OpenNetworkConnection and the addnode
    "onetry" RPC keep the node alive while they finish wiring it up, preserving the
    protection the old pre-publish AddRef() provided.
  • The socket-handler and message-handler service passes snapshot m_nodes by copying the
    shared_ptr vector; the copies keep nodes alive for the pass and are released when the
    snapshot leaves scope, replacing the AddRef/Release loops.
  • The reaper moves a disconnected node's shared_ptr into the disconnected pool and deletes
    it once use_count() == 1 (only the pool references it) and the per-node locks are free,
    replacing the explicit delete pnode.

Behavior-preserving note

The former empty-buffer auto-disconnect path is dropped: every connected node held a
permanent creation reference, so the GetRefCount() <= 0 gate that path required never
held, making it dead code. Disconnection has always been driven by fDisconnect alone,
which is preserved.

Risk / testing

MEDIUM — touches the hot connect/accept/reap path. Validated by a clean full local build
and all four fork cmake_* checks + Functional Tests green. The borrowing API
(FindNode/ForEachNode/etc.) keeps raw CNode* parameters; no caller stores a CNode*
past lock release and there are no CNode*-keyed maps, so the change stays contained to
src/net.h + src/net.cpp (and one ConnectNode caller in rpc/net.cpp).

CNode lifetime was managed by a manual AddRef/Release refcount (CNode::nRefCount)
with the ThreadSocketHandler2 reaper deleting a node only once its refcount hit
zero and its per-node locks cleared. Now that CConnman owns the node list, retire
that by-convention refcount in favour of ownership via std::shared_ptr.

- m_nodes becomes std::vector<std::shared_ptr<CNode>>; it holds the sole
  persistent owning reference.
- ConnectNode returns std::shared_ptr<CNode> so OpenNetworkConnection and the
  addnode "onetry" RPC keep the node alive while they finish wiring it up,
  preserving the protection the old pre-publish AddRef() gave.
- The socket-handler and message-handler service passes snapshot m_nodes by
  copying the shared_ptr vector; the copies keep nodes alive for the pass and
  are released when the snapshot leaves scope, replacing the AddRef/Release loops.
- The reaper moves a disconnected node's shared_ptr into vNodesDisconnected and
  deletes it once use_count() == 1 (only the disconnected pool references it) and
  the per-node locks are free, replacing the explicit `delete pnode`.

The former empty-buffer auto-disconnect path is dropped: every connected node
held a permanent creation reference, so the GetRefCount() <= 0 gate it required
never held, making it dead code. Disconnection has always been driven by
fDisconnect alone, which is preserved.

Closes gridcoin-community#3066.
@PrestackI PrestackI marked this pull request as ready for review June 17, 2026 02:41
When a ctest case fails in the distro/native build path, ctest buffers the
test's stdout into build/Testing/Temporary/LastTest.log and prints nothing
to the console, so functional-test flakes (e.g. rpc_multisig.py) show only
"1/3 Test #1: functional_tests ...***Failed" with no failing script name,
traceback, or combined node logs. test_runner.py already runs with
--ci --quiet --combinedlogslen=4000, so the detail exists — it's just
swallowed.

Add --output-on-failure to all four ctest invocations (native, depends,
win64, macos), mirroring what the dedicated functional job already does
(cmake_functional.yml). The flag is additive: no change on success, full
captured output on failure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

net: adopt shared_ptr<CNode> connection-lifetime model (follow-up to #2558)

1 participant