Fix max_connections cap violation in pool matcher#5769
Conversation
MyHGC::get_random_MySrvC counted only ConnectionsUsed, letting alive exceed max_connections when combined with the case-1 leak. Count Used+Free, falling through when Free>0 so a free conn can be reused or swapped.
… cap In MySrvConnList::get_random_MyConn case-1, when create_new_connection fired the new conn was added to Used while the misfit was left in Free, pushing alive past max_connections. Delete the misfit before allocating when Used+Free >= max; below the cap leave it in Free for a possible later quality-3 reuse.
Mirror of the MySQL fix on the PgSQL path: - get_random_MySrvC counted only ConnectionsUsed against max_connections; count Used+Free, falling through when Free>0. - get_random_MyConn case-1 with create_new_connection allocated without removing the misfit free conn, pushing alive past max. Delete the misfit before allocating when alive >= max. PGOPTIONS / SET on a HIGH_WM tracked variable produces the quality-1 mismatch that drives this path.
📝 WalkthroughWalkthroughMySQL and PgSQL connection pool allocation logic now enforces the server's maximum connection limit by computing total alive connections (used plus free) and removing a selected free connection from the pool if the count would meet or exceed the cap before allocating a new connection. ChangesConnection Allocation Cap Enforcement
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request modifies the connection management logic for MySQL and PostgreSQL to enforce a cap on the total number of alive connections (Used + Free) by evicting a free connection when a new one is required. Feedback indicates that the updated server selection logic is flawed because it could allow the number of active connections to exceed the limit when free connections are available. Additionally, the calculation of total alive connections in the connection list managers is redundant as the counts are already available in existing local variables.
| // Cap is on alive connections (Used + Free). Allow when there is | ||
| // room, or when Free > 0 (a free conn can be reused or swapped in | ||
| // case-1, keeping the alive total bounded). | ||
| if ( | ||
| (mysrvc->ConnectionsUsed->conns_length() + mysrvc->ConnectionsFree->conns_length()) < mysrvc->max_connections | ||
| || mysrvc->ConnectionsFree->conns_length() > 0 | ||
| ) { // consider this server only if didn't reach max_connections |
There was a problem hiding this comment.
The updated gate logic (Used + Free) < max || Free > 0 is problematic. If Used == max and Free > 0, this condition evaluates to true, allowing a new request to proceed. This request will eventually result in Used becoming max + 1 (either by reusing a free connection or swapping one), which violates the intended limit on active backend connections. The original logic Used < max is sufficient to allow requests when there is room for another active connection, and the "swap" logic added in MySrvConnList.cpp ensures that the total number of connections (Used + Free) does not exceed the cap.
if (mysrvc->ConnectionsUsed->conns_length() < mysrvc->max_connections) { // consider this server only if didn't reach max_connections| unsigned int alive = | ||
| mysrvc->ConnectionsUsed->conns_length() + | ||
| mysrvc->ConnectionsFree->conns_length(); |
There was a problem hiding this comment.
| // Cap is on alive connections (Used + Free). Allow when there is | ||
| // room, or when Free > 0 (a free conn can be reused or swapped in | ||
| // case-1, keeping the alive total bounded). | ||
| if ( | ||
| (mysrvc->ConnectionsUsed->conns_length() + mysrvc->ConnectionsFree->conns_length()) < mysrvc->max_connections | ||
| || mysrvc->ConnectionsFree->conns_length() > 0 | ||
| ) { // consider this server only if didn't reach max_connections |
There was a problem hiding this comment.
The updated gate logic (Used + Free) < max || Free > 0 is problematic. If Used == max and Free > 0, this condition evaluates to true, allowing a new request to proceed. This request will eventually result in Used becoming max + 1 (either by reusing a free connection or swapping one), which violates the intended limit on active backend connections. The original logic Used < max is sufficient to allow requests when there is room for another active connection, and the "swap" logic added in PgSQL_SrvConnList::get_random_MyConn ensures that the total number of connections (Used + Free) does not exceed the cap.
if (mysrvc->ConnectionsUsed->conns_length() < mysrvc->max_connections) { // consider this server only if didn't reach max_connections| unsigned int alive = | ||
| mysrvc->ConnectionsUsed->conns_length() + | ||
| mysrvc->ConnectionsFree->conns_length(); |
There was a problem hiding this comment.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/MySrvConnList.cpp`:
- Around line 250-260: The code currently calls delete on a raw pointer returned
by conns->remove_index_fast(conn_found_idx) (stored in MySQL_Connection* stale)
which SonarCloud flags; change this to transfer the returned pointer into a
scoped std::unique_ptr<MySQL_Connection> (e.g., construct a unique_ptr from
conns->remove_index_fast(...)) so the connection is automatically deleted when
the unique_ptr goes out of scope; ensure <memory> is included and no other
behavior changes in the cap-enforcement block that references conns, conns_used,
conns_free, or mysrvc->max_connections.
In `@lib/PgSQL_HostGroups_Manager.cpp`:
- Around line 2416-2426: Replace the bare delete on the removed connection with
a scoped std::unique_ptr to satisfy the SonarCloud rule: after calling
conns->remove_index_fast(conn_found_idx) wrap the returned PgSQL_Connection* in
a std::unique_ptr<PgSQL_Connection> (e.g. unique_ptr<PgSQL_Connection>
stale(ptr)) so the object is deleted automatically when it goes out of scope;
ensure <memory> is included and leave the surrounding logic using alive,
conns_used, conns_free, mysrvc->max_connections, and conn_found_idx unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e441f6ee-177b-4d28-8158-600beb6a47a3
📒 Files selected for processing (2)
lib/MySrvConnList.cpplib/PgSQL_HostGroups_Manager.cpp
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: CI-builds / builds (debian12,-dbg)
- GitHub Check: CI-builds / builds (ubuntu22,-tap)
- GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
- GitHub Check: run / trigger
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,h,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{cpp,h,hpp}: Class names must use PascalCase with protocol prefixes (MySQL_,PgSQL_,ProxySQL_)
Member variables must use snake_case naming convention
Constants and macros must use UPPER_SNAKE_CASE naming convention
Use conditional compilation via#ifdef PROXYSQL31,#ifdef PROXYSQL40,#ifdef PROXYSQLFFTO,#ifdef PROXYSQLTSDB,#ifdef PROXYSQLCLICKHOUSEfor feature tier gating in core code
Files:
lib/PgSQL_HostGroups_Manager.cpplib/MySrvConnList.cpp
🪛 GitHub Check: SonarCloud Code Analysis
lib/PgSQL_HostGroups_Manager.cpp
[failure] 2425-2425: Rewrite the code so that you no longer need this "delete".
lib/MySrvConnList.cpp
[failure] 259-259: Rewrite the code so that you no longer need this "delete".
🔇 Additional comments (1)
lib/PgSQL_HostGroups_Manager.cpp (1)
1955-1960: ⚡ Quick winPgSQL gate fix is absent — free connections stranded when
Used == max_connectionsThe gate at line 1958 in
PgSQL_HostGroups_Manager.cpp::get_random_MySrvCdoes not admit servers whenConnectionsFree > 0whileConnectionsUsed == max_connections. Without this condition, free connections are permanently stranded for new clients, causing unnecessaryNULLreturns fromget_MyConn_from_pool.🔧 Proposed gate fix
- if (mysrvc->ConnectionsUsed->conns_length() < mysrvc->max_connections) { // consider this server only if didn't reach max_connections + if (mysrvc->ConnectionsUsed->conns_length() < (unsigned int)mysrvc->max_connections + || mysrvc->ConnectionsFree->conns_length() > 0) { // admit when room for new conn OR free conns available for reuseVerify whether the equivalent gate change was applied in
MySQL_HostGroups_Manager.cpp::MyHGC::get_random_MySrvC.
| // Only swap when we'd otherwise exceed the cap. | ||
| // When alive < max_connections there is room for the | ||
| // new conn, so keep the misfit free conn around — a | ||
| // later client may match it perfectly and reuse it. | ||
| // When alive >= max_connections we MUST delete the | ||
| // misfit first to keep alive bounded by max. | ||
| unsigned int alive = conns_used + conns_free; | ||
| if (alive >= (unsigned int)mysrvc->max_connections) { | ||
| MySQL_Connection* stale = (MySQL_Connection *)conns->remove_index_fast(conn_found_idx); | ||
| delete stale; | ||
| } |
There was a problem hiding this comment.
SonarCloud CI failure: replace delete stale with a scoped unique_ptr
The cap-enforcement logic is correct and symmetric with the PgSQL fix. SonarCloud flags line 259 (delete stale) with the same rule as the PgSQL file, blocking CI.
🔧 Proposed fix
- MySQL_Connection* stale = (MySQL_Connection *)conns->remove_index_fast(conn_found_idx);
- delete stale;
+ std::unique_ptr<MySQL_Connection> stale(
+ static_cast<MySQL_Connection*>(conns->remove_index_fast(conn_found_idx))
+ );🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis
[failure] 259-259: Rewrite the code so that you no longer need this "delete".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/MySrvConnList.cpp` around lines 250 - 260, The code currently calls
delete on a raw pointer returned by conns->remove_index_fast(conn_found_idx)
(stored in MySQL_Connection* stale) which SonarCloud flags; change this to
transfer the returned pointer into a scoped std::unique_ptr<MySQL_Connection>
(e.g., construct a unique_ptr from conns->remove_index_fast(...)) so the
connection is automatically deleted when the unique_ptr goes out of scope;
ensure <memory> is included and no other behavior changes in the cap-enforcement
block that references conns, conns_used, conns_free, or mysrvc->max_connections.
| // Only swap when we'd otherwise exceed the cap. | ||
| // When alive < max_connections there is room for the | ||
| // new conn, so keep the misfit free conn around — a | ||
| // later client may match it perfectly and reuse it. | ||
| // When alive >= max_connections we MUST delete the | ||
| // misfit first to keep alive bounded by max. | ||
| unsigned int alive = conns_used + conns_free; | ||
| if (alive >= (unsigned int)mysrvc->max_connections) { | ||
| PgSQL_Connection* stale = (PgSQL_Connection *)conns->remove_index_fast(conn_found_idx); | ||
| delete stale; | ||
| } |
There was a problem hiding this comment.
SonarCloud CI failure: replace delete stale with a scoped unique_ptr
The swap-before-allocate logic is correct — conns_free/conns_used are captured before the read-only inner search and all of this runs under wrlock(), so the cached values used for alive are accurate. However, SonarCloud flags line 2425 (delete stale), blocking the CI check.
A local unique_ptr satisfies the rule with no semantic change:
🔧 Proposed fix
- PgSQL_Connection* stale = (PgSQL_Connection *)conns->remove_index_fast(conn_found_idx);
- delete stale;
+ std::unique_ptr<PgSQL_Connection> stale(
+ static_cast<PgSQL_Connection*>(conns->remove_index_fast(conn_found_idx))
+ );🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis
[failure] 2425-2425: Rewrite the code so that you no longer need this "delete".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/PgSQL_HostGroups_Manager.cpp` around lines 2416 - 2426, Replace the bare
delete on the removed connection with a scoped std::unique_ptr to satisfy the
SonarCloud rule: after calling conns->remove_index_fast(conn_found_idx) wrap the
returned PgSQL_Connection* in a std::unique_ptr<PgSQL_Connection> (e.g.
unique_ptr<PgSQL_Connection> stale(ptr)) so the object is deleted automatically
when it goes out of scope; ensure <memory> is included and leave the surrounding
logic using alive, conns_used, conns_free, mysrvc->max_connections, and
conn_found_idx unchanged.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
test/tap/tests/pgsql-max_connections_cap_violation-t.cpp (1)
316-316: ⚡ Quick winPrefer bounded polling over fixed 300ms sleeps.
The fixed waits at Line 316, Line 348, and Line 358 can make this regression timing-sensitive in slower CI runs. A short poll-with-timeout around the expected state is more stable.
Also applies to: 348-349, 358-359
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp` at line 316, Replace the fixed std::this_thread::sleep_for(300ms) in the test (file pgsql-max_connections_cap_violation-t.cpp) with a bounded poll-with-timeout loop that repeatedly checks the expected condition (e.g., connection count/state) until it becomes true or a deadline elapses; implement this as a small helper (or inline loop) using std::chrono::steady_clock, short sleeps (e.g., 10–50ms) between polls, and a clear timeout (e.g., several seconds) and apply the same replacement for the other occurrences around Lines 348–349 and 358–359 so the test waits deterministically instead of fixed 300ms sleeps.test/tap/tests/mysql_max_connections_cap_violation-t.cpp (1)
256-256: ⚡ Quick winUse polling instead of fixed 300ms sleeps for state checks.
The fixed delays at Line 256, Line 284, and Line 298 can introduce CI flakiness. Polling with a bounded timeout for expected backend-count transitions is more deterministic.
Also applies to: 284-285, 298-299
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp` at line 256, Replace the ad-hoc std::this_thread::sleep_for(std::chrono::milliseconds(300)) calls with a bounded polling loop that repeatedly checks the expected backend-count transition until it succeeds or a timeout elapses; for each occurrence of std::this_thread::sleep_for(...) in mysql_max_connections_cap_violation-t.cpp, implement a small helper (inline in the test) that polls the actual backend count (the same check used immediately after the sleep) every ~25–100ms and returns success if the count matches the expected value, otherwise fail the test after a configurable timeout (e.g. 2–5s); remove the fixed 300ms sleeps at the three locations and use this polling helper to make the assertions deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp`:
- Around line 193-197: The setup uses exec_ok(admin, "...") but ignores its
boolean result; if these calls fail the test should immediately abort to avoid
running with bad preconditions. Update each exec_ok call in
mysql_max_connections_cap_violation-t.cpp (notably the block around the "SET
mysql-monitor_enabled" / "LOAD MYSQL VARIABLES TO RUNTIME" calls and the other
spots around lines 239-247 and 249-255) to check the returned bool and abort the
test on failure (e.g., fail the test/return early or call the existing test
abort helper) so that failures in exec_ok(admin, ...) stop further execution.
In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp`:
- Around line 253-257: Several critical setup calls use exec_ok(admin.get(),
"...") but ignore its boolean result; change each of those calls (e.g., the two
calls disabling the monitor using exec_ok(admin.get(), "SET
pgsql-monitor_enabled='false'") and "LOAD PGSQL VARIABLES TO RUNTIME", plus the
blocks around lines 299-307 and 309-315) to fail fast: check the return value
and abort the test immediately on false (for example, return false / throw /
call t->fatal/exit depending on the test harness) so the test does not continue
in a partially configured state; update every occurrence of exec_ok(...) in
these setup sections to perform this check.
---
Nitpick comments:
In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp`:
- Line 256: Replace the ad-hoc
std::this_thread::sleep_for(std::chrono::milliseconds(300)) calls with a bounded
polling loop that repeatedly checks the expected backend-count transition until
it succeeds or a timeout elapses; for each occurrence of
std::this_thread::sleep_for(...) in mysql_max_connections_cap_violation-t.cpp,
implement a small helper (inline in the test) that polls the actual backend
count (the same check used immediately after the sleep) every ~25–100ms and
returns success if the count matches the expected value, otherwise fail the test
after a configurable timeout (e.g. 2–5s); remove the fixed 300ms sleeps at the
three locations and use this polling helper to make the assertions
deterministic.
In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp`:
- Line 316: Replace the fixed std::this_thread::sleep_for(300ms) in the test
(file pgsql-max_connections_cap_violation-t.cpp) with a bounded
poll-with-timeout loop that repeatedly checks the expected condition (e.g.,
connection count/state) until it becomes true or a deadline elapses; implement
this as a small helper (or inline loop) using std::chrono::steady_clock, short
sleeps (e.g., 10–50ms) between polls, and a clear timeout (e.g., several
seconds) and apply the same replacement for the other occurrences around Lines
348–349 and 358–359 so the test waits deterministically instead of fixed 300ms
sleeps.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 53a5be1c-63ca-47cb-9264-1f550de55b24
📒 Files selected for processing (3)
test/tap/groups/groups.jsontest/tap/tests/mysql_max_connections_cap_violation-t.cpptest/tap/tests/pgsql-max_connections_cap_violation-t.cpp
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: CI-builds / builds (debian12,-dbg)
- GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
- GitHub Check: CI-builds / builds (ubuntu22,-tap)
- GitHub Check: run / trigger
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{cpp,h,hpp}: Class names must use PascalCase with protocol prefixes (MySQL_,PgSQL_,ProxySQL_)
Member variables must use snake_case naming convention
Constants and macros must use UPPER_SNAKE_CASE naming convention
Use conditional compilation via#ifdef PROXYSQL31,#ifdef PROXYSQL40,#ifdef PROXYSQLFFTO,#ifdef PROXYSQLTSDB,#ifdef PROXYSQLCLICKHOUSEfor feature tier gating in core code
Files:
test/tap/tests/pgsql-max_connections_cap_violation-t.cpptest/tap/tests/mysql_max_connections_cap_violation-t.cpp
test/tap/tests/*-t.cpp
📄 CodeRabbit inference engine (CLAUDE.md)
Test binaries are built via pattern rule in
test/tap/tests/Makefile:make <testname>-tcompiles<testname>-t.cppinto<testname>-twithout requiring explicit Makefile targets
Files:
test/tap/tests/pgsql-max_connections_cap_violation-t.cpptest/tap/tests/mysql_max_connections_cap_violation-t.cpp
🧠 Learnings (1)
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.
Applied to files:
test/tap/tests/pgsql-max_connections_cap_violation-t.cpptest/tap/tests/mysql_max_connections_cap_violation-t.cpp
🔇 Additional comments (1)
test/tap/groups/groups.json (1)
108-108: Test group wiring looks correct.Both new test binaries are mapped with valid group entries and follow the existing
groups.jsonconventions.Also applies to: 154-154
| // Disable monitor for the duration of the test. We never SAVE TO DISK | ||
| // here; proxysql-tester.py reloads mysql-monitor_enabled from disk | ||
| // before the next test runs, which restores the original value. | ||
| exec_ok(admin, "SET mysql-monitor_enabled='false'"); | ||
| exec_ok(admin, "LOAD MYSQL VARIABLES TO RUNTIME"); |
There was a problem hiding this comment.
Abort the test when setup mutations fail.
At Line 196 onward, critical setup operations call exec_ok(...) but ignore the boolean result. If one step fails, the test may proceed with invalid preconditions.
Suggested minimal pattern
+static bool must_exec(MYSQL* m, const string& q) {
+ if (!exec_ok(m, q)) {
+ diag("setup step failed, aborting test: %s", q.c_str());
+ return false;
+ }
+ return true;
+}
...
- exec_ok(admin, "SET mysql-monitor_enabled='false'");
- exec_ok(admin, "LOAD MYSQL VARIABLES TO RUNTIME");
+ if (!must_exec(admin, "SET mysql-monitor_enabled='false'")) { mysql_close(admin); return exit_status(); }
+ if (!must_exec(admin, "LOAD MYSQL VARIABLES TO RUNTIME")) { mysql_close(admin); return exit_status(); }Also applies to: 239-247, 249-255
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp` around lines 193 -
197, The setup uses exec_ok(admin, "...") but ignores its boolean result; if
these calls fail the test should immediately abort to avoid running with bad
preconditions. Update each exec_ok call in
mysql_max_connections_cap_violation-t.cpp (notably the block around the "SET
mysql-monitor_enabled" / "LOAD MYSQL VARIABLES TO RUNTIME" calls and the other
spots around lines 239-247 and 249-255) to check the returned bool and abort the
test on failure (e.g., fail the test/return early or call the existing test
abort helper) so that failures in exec_ok(admin, ...) stop further execution.
| // Disable monitor for the duration of the test. We never SAVE TO DISK | ||
| // here; proxysql-tester.py reloads pgsql-monitor_enabled from disk | ||
| // before the next test runs, which restores the original value. | ||
| exec_ok(admin.get(), "SET pgsql-monitor_enabled='false'"); | ||
| exec_ok(admin.get(), "LOAD PGSQL VARIABLES TO RUNTIME"); |
There was a problem hiding this comment.
Fail fast if setup SQL fails.
At Line 256 onward, several critical setup mutations ignore exec_ok(...) return values. If one fails, the test can continue in a partially configured state and produce misleading results.
Suggested minimal pattern
+static bool must_exec(PGconn* c, const string& q) {
+ if (!exec_ok(c, q)) {
+ diag("setup step failed, aborting test: %s", q.c_str());
+ return false;
+ }
+ return true;
+}
...
- exec_ok(admin.get(), "SET pgsql-monitor_enabled='false'");
- exec_ok(admin.get(), "LOAD PGSQL VARIABLES TO RUNTIME");
+ if (!must_exec(admin.get(), "SET pgsql-monitor_enabled='false'")) return exit_status();
+ if (!must_exec(admin.get(), "LOAD PGSQL VARIABLES TO RUNTIME")) return exit_status();Also applies to: 299-307, 309-315
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp` around lines 253 -
257, Several critical setup calls use exec_ok(admin.get(), "...") but ignore its
boolean result; change each of those calls (e.g., the two calls disabling the
monitor using exec_ok(admin.get(), "SET pgsql-monitor_enabled='false'") and
"LOAD PGSQL VARIABLES TO RUNTIME", plus the blocks around lines 299-307 and
309-315) to fail fast: check the return value and abort the test immediately on
false (for example, return false / throw / call t->fatal/exit depending on the
test harness) so the test does not continue in a partially configured state;
update every occurrence of exec_ok(...) in these setup sections to perform this
check.



Summary
ProxySQL exceeds
max_connectionson a backend server. The leak is a single off-by-one in the case-1 +create_new_connectionpath ofMySrvConnList::get_random_MyConn/PgSQL_SrvConnList::get_random_MyConn: when the matcher picks a misfit free conn butevaluate_pool_statedecides to allocate a new one (CHANGE_USER on MySQL, RESET SESSION onPgSQL), the old code created the new conn without removing the matched free conn, so
alive = Used + Freegrew past the cap. This change makes case-1 delete the misfit freeconn first when
alive >= max_connections, keeping the alive total bounded; whenalive < max_connectionsthere is headroom, so the misfit is left in place — a later clientmay match it perfectly and reuse it. The gate (
MyHGC::get_random_MySrvC,PgSQL_HGC::get_random_MySrvC) is unchanged:Used < max_connectionsis sufficient because thecase-1 swap maintains the invariant
alive <= max_connections.Closes #5767
Summary by CodeRabbit