Skip to content

Fix CLOSE_WAIT socket leak in monitor's close_mysql() when pvio is NULL#5854

Open
atifceylan wants to merge 2 commits into
sysown:v3.0from
atifceylan:fix/monitor-close-wait-socket-leak
Open

Fix CLOSE_WAIT socket leak in monitor's close_mysql() when pvio is NULL#5854
atifceylan wants to merge 2 commits into
sysown:v3.0from
atifceylan:fix/monitor-close-wait-socket-leak

Conversation

@atifceylan

@atifceylan atifceylan commented Jun 13, 2026

Copy link
Copy Markdown

When a backend server is removed (e.g. an AWS Aurora read replica is deleted) the remote end sends a TCP FIN. The MariaDB connector processes this FIN during the next async I/O operation and clears net.pvio before ProxySQL gets a chance to call close_mysql().

close_mysql() only sends COM_QUIT (which triggers end_server() -> close(fd) inside mysql_close_no_command()) when net.pvio != NULL. With pvio already NULL, mysql_close_no_command() skips end_server() entirely, so the kernel-level TCP socket is never closed. The socket remains in CLOSE_WAIT indefinitely.

The monitor threads (ping, read_only, connect) retry every mysql-monitor_ping_interval ms. Each failed retry creates a new connection attempt; when that attempt also fails with pvio=NULL the same path is taken, leaving one more CLOSE_WAIT socket behind. Over time the count grows steadily until proxysql is restarted, at which point all file descriptors are released.

Fix: in close_mysql(), add an else-if branch that detects the pvio=NULL / net.fd>0 state and explicitly calls shutdown(SHUT_RDWR) + close(fd) before delegating to mysql_close_no_command(). This mirrors what end_server() would have done and guarantees the kernel socket is released regardless of whether the MariaDB connector has already cleared pvio.

Reproducer:

  1. Configure ProxySQL with AWS Aurora MySQL (writer + 2 read replicas).
  2. Monitor socket count: watch -n5 'grep "^TCP:" /proc/net/sockstat'
  3. Delete one read replica from the Aurora console.
  4. Observe TCP socket count growing by ~1 per monitor_ping_interval.
  5. proxysql restart -> count returns to baseline.

After this fix the count stays flat after replica deletion.

Changes in v2

  • Rebased onto v3.0
  • Extended the fix to MySQL_Connection::close_mysql() in lib/mysql_connection.cpp (session path covers client connections, addresses #2908-style reports)
  • Replaced fd = 0 with INVALID_SOCKET
  • Trimmed block comment to a single line per style feedback
  • Dropped redundant shutdown() call (socket already in CLOSE_WAIT; close() alone suffices)

TAP regression test is in progress.

Related issues: #2908, #3022

Summary by CodeRabbit

  • Bug Fixes
    • Fixed a potential socket file-descriptor leak during MySQL connection shutdown that could cause connections to accumulate in a CLOSE_WAIT state.
    • Improved teardown handling when the underlying I/O layer is already cleared, ensuring the remaining socket is fully closed.
  • Tests
    • Added a regression test to verify the shutdown behavior and prevent reintroducing the CLOSE_WAIT leak on Linux.

@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR closes and invalidates socket file descriptors in two close paths when pvio is already cleared, and adds a regression test that reproduces and checks the CLOSE_WAIT leak scenario.

Changes

Close-path FD leak prevention

Layer / File(s) Summary
MySQL_Monitor close path documentation and FD cleanup
lib/MySQL_Monitor.cpp
Adds comments for the non-SSL shutdown sequence and closes my->net.fd when pvio is cleared but the FD remains open.
MySQL_Connection close path FD cleanup
lib/mysql_connection.cpp
Adds a fallback branch in MySQL_Connection::close_mysql() that closes mysql->net.fd and marks it invalid when pvio has already been cleared.
CLOSE_WAIT regression test
test/tap/tests/reg_test_5854_close_wait_leak-t.cpp
Adds a TAP regression test that reproduces the cleared-pvio state, checks the unfixed leak, checks the fixed path, and verifies repeated closes do not grow CLOSE_WAIT sockets.

Estimated code review effort: 2 (Simple) | ~10 minutes

Poem

A rabbit hops through socket light,
Then shuts the FD just right.
When pvio fades, no leak remains,
No CLOSE_WAIT lingers in the lanes,
A tidy close, a cheerful sight. 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main fix: preventing a CLOSE_WAIT socket leak in close_mysql() when pvio is NULL.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces changes to prevent CLOSE_WAIT leaks in both lib/MySQL_Monitor.cpp and lib/mysql_connection.cpp by directly closing the file descriptor when the pvio structure has already been cleared. However, a critical bug was identified in lib/MySQL_Monitor.cpp where the else if condition lacks the !my->net.pvio check. This omission can trigger a double-close of the file descriptor when SSL is enabled, potentially causing silent file descriptor corruption in multi-threaded environments. It is recommended to update the condition to match the correct implementation in lib/mysql_connection.cpp.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread lib/MySQL_Monitor.cpp Outdated
#endif
fd+=wb; // dummy, to make compiler happy
fd-=wb; // dummy, to make compiler happy
} else if (my->net.fd > 0) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

In lib/MySQL_Monitor.cpp, the else if condition lacks the !my->net.pvio check. If my->net.pvio is valid but my->options.use_ssl is true, the if condition if (my->net.pvio && !my->options.use_ssl) evaluates to false, and the code falls into this else if block.

This causes the file descriptor to be closed directly via close(my->net.fd) while my->net.pvio is still active. Subsequently, mysql_close_no_command(my) is called, which triggers end_server(my) -> vio_delete(my->net.pvio). This leads to a double-close of the file descriptor, which is a critical bug in multi-threaded programs as it can silently close a newly opened file descriptor belonging to another thread.

Please update the condition to match the correct implementation in lib/mysql_connection.cpp by adding && !my->net.pvio.

	} else if (my->net.fd > 0 && !my->net.pvio) {

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed fd > 0 to fd != INVALID_SOCKET in both paths (covers fd 0 edge case, per CodeRabbit)

@renecannao

Copy link
Copy Markdown
Contributor

Thank you!
Please let me know if you need help with a regression test.
At time (like probably in this case), the testing is a lie more complex than the fix itself

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/MySQL_Monitor.cpp`:
- Around line 208-211: The condition that checks socket validity uses
"my->net.fd > 0" which misses descriptor 0; change the branch to compare against
INVALID_SOCKET (e.g., "if (my->net.fd != INVALID_SOCKET)") so the fallback path
that closes the FD runs for FD 0 as well, then call close(my->net.fd) and set
my->net.fd = INVALID_SOCKET (keeping the existing comment about pvio already
cleared).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cfadfdb1-b57b-414c-a858-4eaf5d4c4b76

📥 Commits

Reviewing files that changed from the base of the PR and between 6a33b85 and 2beb829.

📒 Files selected for processing (2)
  • lib/MySQL_Monitor.cpp
  • lib/mysql_connection.cpp
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Feature tiers are controlled via flags: PROXYSQL31=1 for v3.1.x features (FFTO, TSDB), PROXYSQL40=1 for v4.0.x features (plugin loader). PROXYSQL40=1 implies both PROXYSQL31=1 and PROXYSQLFFTO=1 and PROXYSQLTSDB=1. Use conditional compilation with #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE.
Class names use PascalCase with protocol prefixes: MySQL_, PgSQL_, or ProxySQL_ (e.g., MySQL_Protocol, PgSQL_Session).
Member variables use snake_case.
Constants and macros use UPPER_SNAKE_CASE.
Use C++17; conditional compilation for feature tiers via #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE.
Use RAII for resource management; use jemalloc for memory allocation.
Use pthread mutexes for synchronization; use std::atomic<> for counters.

Files:

  • lib/mysql_connection.cpp
  • lib/MySQL_Monitor.cpp
{lib,src}/**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

GenAI/MCP/RAG/LLM features live entirely in plugins/genai/ and load as a .so at runtime via dlopen. Do not guard with PROXYSQLGENAI in core code — that flag no longer guards any core code as of Step 7 of the GenAI plugin carve-out.

Files:

  • lib/mysql_connection.cpp
  • lib/MySQL_Monitor.cpp
🔇 Additional comments (1)
lib/mysql_connection.cpp (1)

3022-3026: LGTM!

Comment thread lib/MySQL_Monitor.cpp Outdated
@atifceylan atifceylan force-pushed the fix/monitor-close-wait-socket-leak branch from 2beb829 to 4c445dc Compare June 13, 2026 17:01
@atifceylan

Copy link
Copy Markdown
Author

Thank you! Please let me know if you need help with a regression test. At time (like probably in this case), the testing is a lie more complex than the fix itself

Thanks! I'm working on the regression test. Taking reg_test_unexp_ping_pkt-t.cpp as a reference for structure and conventions — the plan is to connect directly to the MySQL backend, simulate the pvio=NULL / net.fd>0 state by calling shutdown(SHUT_RD) from outside the connector to trigger an EOF (which causes the connector to clear pvio), then verify that the file descriptor count does not grow after close_mysql() is called. Will push it once it compiles and passes locally.

When a backend server is removed (e.g. an AWS Aurora read replica is deleted)
the remote end sends a TCP FIN. The MariaDB connector processes this FIN during
the next async I/O operation and clears net.pvio before ProxySQL gets a chance
to call close_mysql().

close_mysql() only sends COM_QUIT (which triggers end_server() -> close(fd)
inside mysql_close_no_command()) when net.pvio != NULL.  With pvio already NULL,
mysql_close_no_command() skips end_server() entirely, so the kernel-level TCP
socket is never closed.  The socket remains in CLOSE_WAIT indefinitely.

The monitor threads (ping, read_only, connect) retry every
mysql-monitor_ping_interval ms.  Each failed retry creates a new connection
attempt; when that attempt also fails with pvio=NULL the same path is taken,
leaving one more CLOSE_WAIT socket behind.  Over time the count grows steadily
until proxysql is restarted, at which point all file descriptors are released.

Fix: in close_mysql(), add an else-if branch that detects the pvio=NULL /
net.fd>0 state and explicitly calls shutdown(SHUT_RDWR) + close(fd) before
delegating to mysql_close_no_command().  This mirrors what end_server() would
have done and guarantees the kernel socket is released regardless of whether
the MariaDB connector has already cleared pvio.

Reproducer:
  1. Configure ProxySQL with AWS Aurora MySQL (writer + 2 read replicas).
  2. Monitor socket count: watch -n5 'grep "^TCP:" /proc/net/sockstat'
  3. Delete one read replica from the Aurora console.
  4. Observe TCP socket count growing by ~1 per monitor_ping_interval.
  5. proxysql restart -> count returns to baseline.

After this fix the count stays flat after replica deletion.

Related issues: sysown#2908, sysown#3022
@atifceylan atifceylan force-pushed the fix/monitor-close-wait-socket-leak branch from 4c445dc to 1c49c8d Compare June 13, 2026 19:57
Adds reg_test_5854_close_wait_leak-t.cpp covering the pvio=NULL / net.fd>0
state that the MariaDB connector reaches when a backend sends a TCP FIN
during async I/O.

- test_unfixed_path_leaks_fd: mysql_close_no_command() skips end_server()
  when pvio=NULL, so the fd is never released (fd count stays above
  baseline) — demonstrates the bug.
- test_fixed_path_closes_fd: the explicit close(fd) added in close_mysql()
  releases the fd (count returns to baseline) — demonstrates the fix.
- test_no_close_wait_growth (Linux only): repeated pvio=NULL closes do not
  accumulate CLOSE_WAIT sockets.

Connects directly over TCP (MYSQL_OPT_PROTOCOL) and reads /proc/self/fd and
/proc/net/tcp for measurement; fd/CLOSE_WAIT checks are guarded so the test
skips cleanly on non-Linux hosts.

Related issues: sysown#2908, sysown#3022

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/tap/tests/reg_test_5854_close_wait_leak-t.cpp (1)

1-232: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Register this TAP test in test/tap/groups/groups.json.
reg_test_5854_close_wait_leak-t.cpp is not listed, so it won’t be picked up by the TAP test groups.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/reg_test_5854_close_wait_leak-t.cpp` around lines 1 - 232, The
new TAP regression test is implemented in reg_test_5854_close_wait_leak-t.cpp
but is not included in the TAP group registry, so it will not run. Add this test
to the appropriate entry in groups.json so the TAP harness discovers it, using
the reg_test_5854_close_wait_leak-t.cpp filename as the identifier to locate it.

Source: Coding guidelines

🧹 Nitpick comments (2)
test/tap/tests/reg_test_5854_close_wait_leak-t.cpp (2)

186-213: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low value

Fixed usleep(100ms) per iteration is a flakiness risk on loaded CI hosts.

The reclaim delay is a magic constant with only a +1 slack for CLOSE_WAIT growth. On a busy/throttled CI runner, kernel socket teardown could still be pending after 100ms, causing sporadic false failures.

♻️ Possible refactor: poll with a timeout instead of a fixed sleep
-		mysql_close_no_command(my);
-		usleep(100000); // 100 ms — let kernel reclaim
+		mysql_close_no_command(my);
 	}
+	// Poll for CLOSE_WAIT count to settle instead of a single fixed sleep
+	for (int waited = 0; waited < 2000; waited += 100) {
+		if (count_close_wait() <= cw_before + 1) break;
+		usleep(100000);
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/reg_test_5854_close_wait_leak-t.cpp` around lines 186 - 213,
The fixed 100ms usleep in the close-wait leak test is too brittle for loaded CI
runners. Update the loop in reg_test_5854_close_wait_leak-t.cpp to poll
count_close_wait() with a bounded timeout until the CLOSE_WAIT count stabilizes
or a limit is reached, and use that result before asserting. Keep the existing
mysql_close_no_command and make_pvio_null_conn flow, but replace the magic sleep
with a retry/wait helper so the test in the fixed-path block is resilient to
slow kernel teardown.

139-177: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift

Test reimplements the fix rather than exercising the actual close_mysql() code paths.

Tests 2 and 3 manually duplicate the else-if (pvio==NULL && fd>0) { close(fd); fd=-1; } logic (lines 164-167, 198-201) instead of invoking the actual close_mysql() in lib/MySQL_Monitor.cpp or MySQL_Connection::close_mysql() in lib/mysql_connection.cpp. This validates the general strategy but doesn't guard against regressions if those specific implementations diverge (e.g., wrong field checked, ordering, an accidental omission) from what's duplicated here.

Given close_mysql() in MySQL_Monitor.cpp is static and MySQL_Connection::close_mysql() needs a full session/monitor context, directly invoking them from a standalone TAP test may not be straightforward — but consider whether either function could be made testable (e.g., non-static, or a minimal MySQL_Connection fixture) in a follow-up so the regression test exercises the real code instead of a re-implementation.

Also applies to: 180-214

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/reg_test_5854_close_wait_leak-t.cpp` around lines 139 - 177,
The test currently duplicates the fixed close logic instead of exercising the
real implementation, so update the TAP coverage to call the actual close path
used by the product code. In test_fixed_path_closes_fd and the corresponding
leak test, prefer invoking close_mysql() from MySQL_Monitor.cpp or
MySQL_Connection::close_mysql() from lib/mysql_connection.cpp through a minimal
testable fixture/context rather than manually closing fd and clearing net.fd. If
those symbols are not directly accessible, make the relevant close path testable
in a follow-up so this regression test verifies the real code path instead of a
reimplementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@test/tap/tests/reg_test_5854_close_wait_leak-t.cpp`:
- Around line 1-232: The new TAP regression test is implemented in
reg_test_5854_close_wait_leak-t.cpp but is not included in the TAP group
registry, so it will not run. Add this test to the appropriate entry in
groups.json so the TAP harness discovers it, using the
reg_test_5854_close_wait_leak-t.cpp filename as the identifier to locate it.

---

Nitpick comments:
In `@test/tap/tests/reg_test_5854_close_wait_leak-t.cpp`:
- Around line 186-213: The fixed 100ms usleep in the close-wait leak test is too
brittle for loaded CI runners. Update the loop in
reg_test_5854_close_wait_leak-t.cpp to poll count_close_wait() with a bounded
timeout until the CLOSE_WAIT count stabilizes or a limit is reached, and use
that result before asserting. Keep the existing mysql_close_no_command and
make_pvio_null_conn flow, but replace the magic sleep with a retry/wait helper
so the test in the fixed-path block is resilient to slow kernel teardown.
- Around line 139-177: The test currently duplicates the fixed close logic
instead of exercising the real implementation, so update the TAP coverage to
call the actual close path used by the product code. In
test_fixed_path_closes_fd and the corresponding leak test, prefer invoking
close_mysql() from MySQL_Monitor.cpp or MySQL_Connection::close_mysql() from
lib/mysql_connection.cpp through a minimal testable fixture/context rather than
manually closing fd and clearing net.fd. If those symbols are not directly
accessible, make the relevant close path testable in a follow-up so this
regression test verifies the real code path instead of a reimplementation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a1144bcb-9048-4374-844a-560392af2ab4

📥 Commits

Reviewing files that changed from the base of the PR and between 1c49c8d and 716a13d.

📒 Files selected for processing (1)
  • test/tap/tests/reg_test_5854_close_wait_leak-t.cpp
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Feature tiers are controlled via flags: PROXYSQL31=1 for v3.1.x features (FFTO, TSDB), PROXYSQL40=1 for v4.0.x features (plugin loader). PROXYSQL40=1 implies both PROXYSQL31=1 and PROXYSQLFFTO=1 and PROXYSQLTSDB=1. Use conditional compilation with #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE.
Class names use PascalCase with protocol prefixes: MySQL_, PgSQL_, or ProxySQL_ (e.g., MySQL_Protocol, PgSQL_Session).
Member variables use snake_case.
Constants and macros use UPPER_SNAKE_CASE.
Use C++17; conditional compilation for feature tiers via #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE.
Use RAII for resource management; use jemalloc for memory allocation.
Use pthread mutexes for synchronization; use std::atomic<> for counters.

Files:

  • test/tap/tests/reg_test_5854_close_wait_leak-t.cpp
test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/. Test binaries are built via pattern rule make <testname>-t which compiles <testname>-t.cpp into <testname>-t. Register new tests in groups.json.

Files:

  • test/tap/tests/reg_test_5854_close_wait_leak-t.cpp
🧠 Learnings (1)
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

  • test/tap/tests/reg_test_5854_close_wait_leak-t.cpp
🔇 Additional comments (2)
test/tap/tests/reg_test_5854_close_wait_leak-t.cpp (2)

34-47: LGTM!

Also applies to: 52-68


128-128: 🎯 Functional Correctness

mysql_close_no_command is available to this TAP test via <mysql.h>.
No change needed.

@atifceylan

Copy link
Copy Markdown
Author

Regression test

Test: test/tap/tests/reg_test_5854_close_wait_leak-t.cpp
Build: make -C test/tap/tests reg_test_5854_close_wait_leak-t
Run: TAP_MYSQLHOST=127.0.0.1 TAP_MYSQLPORT=3306 TAP_MYSQLUSERNAME=root TAP_MYSQLPASSWORD= ./reg_test_5854_close_wait_leak-t

1..3
# --- test_unfixed_path_leaks_fd ---
# fds baseline (before connect): 3
# fds after connect: 4
# fds after unfixed close: 4
ok 1 - unfixed path: fd not released — count stays above baseline after mysql_close_no_command on pvio=NULL conn (baseline=3 after=4)
# --- test_fixed_path_closes_fd ---
# fds baseline (before connect): 4
# fds after connect: 5
# fds after fixed close: 4
ok 2 - fixed path: fd released — count returns to baseline after explicit close(fd)+mysql_close_no_command on pvio=NULL conn (baseline=4 after=4)
# --- test_no_close_wait_growth ---
# CLOSE_WAIT before: 0
# CLOSE_WAIT after 5 iters: 0
ok 3 - fixed path: CLOSE_WAIT count must not grow after 5 pvio=NULL closes (before=0 after=0)
Test took 0.51 sec

The test reproduces the pvio=NULL / net.fd>0 state that the MariaDB connector reaches when a backend sends a TCP FIN during async I/O:

  • Test 1 (test_unfixed_path_leaks_fd) — with pvio=NULL, mysql_close_no_command() skips end_server(), so the fd is never released: count stays at 4, above the baseline of 3. Confirms the bug.
  • Test 2 (test_fixed_path_closes_fd) — the explicit close(fd) added in close_mysql() by this PR releases the fd: count returns to baseline (4). Confirms the fix.
  • Test 3 (test_no_close_wait_growth, Linux only) — 5 repeated pvio=NULL closes leave the CLOSE_WAIT count flat (0 → 0).

fd/CLOSE_WAIT measurement uses /proc/self/fd and /proc/net/tcp, guarded so the test skips cleanly on non-Linux hosts.

@sonarqubecloud

sonarqubecloud Bot commented Jul 2, 2026

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants