Skip to content

Fix max_connections cap violation in pool matcher#5769

Open
rahim-kanji wants to merge 5 commits into
v3.0from
v3.0_cap_violation_5767
Open

Fix max_connections cap violation in pool matcher#5769
rahim-kanji wants to merge 5 commits into
v3.0from
v3.0_cap_violation_5767

Conversation

@rahim-kanji

@rahim-kanji rahim-kanji commented May 8, 2026

Copy link
Copy Markdown
Collaborator

Summary

ProxySQL exceeds max_connections on a backend server. The leak is a single off-by-one in the case-1 + create_new_connection path of MySrvConnList::get_random_MyConn /
PgSQL_SrvConnList::get_random_MyConn: when the matcher picks a misfit free conn but evaluate_pool_state decides to allocate a new one (CHANGE_USER on MySQL, RESET SESSION on
PgSQL), the old code created the new conn without removing the matched free conn, so alive = Used + Free grew past the cap. This change makes case-1 delete the misfit free
conn first when alive >= max_connections, keeping the alive total bounded; when alive < max_connections there is headroom, so the misfit is left in place — a later client
may match it perfectly and reuse it. The gate (MyHGC::get_random_MySrvC, PgSQL_HGC::get_random_MySrvC) is unchanged: Used < max_connections is sufficient because the
case-1 swap maintains the invariant alive <= max_connections.

Closes #5767

Summary by CodeRabbit

  • Bug Fixes
    • Connection pool now enforces per-server max connection limits for MySQL and PostgreSQL paths, preventing over-allocation and ensuring free connections are reclaimed when capacity is reached.
  • Tests
    • Added regression tests validating max-connections behavior for MySQL and PgSQL scenarios.
  • Chores
    • Added test-group mappings to include new max-connections regression tests.

MyHGC::get_random_MySrvC counted only ConnectionsUsed, letting alive
exceed max_connections when combined with the case-1 leak. Count
Used+Free, falling through when Free>0 so a free conn can be reused
or swapped.
… cap

In MySrvConnList::get_random_MyConn case-1, when create_new_connection
fired the new conn was added to Used while the misfit was left in Free,
pushing alive past max_connections. Delete the misfit before allocating
when Used+Free >= max; below the cap leave it in Free for a possible
later quality-3 reuse.
Mirror of the MySQL fix on the PgSQL path:
- get_random_MySrvC counted only ConnectionsUsed against
  max_connections; count Used+Free, falling through when Free>0.
- get_random_MyConn case-1 with create_new_connection allocated
  without removing the misfit free conn, pushing alive past max.
  Delete the misfit before allocating when alive >= max.

PGOPTIONS / SET on a HIGH_WM tracked variable produces the quality-1
mismatch that drives this path.
@coderabbitai

coderabbitai Bot commented May 8, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

MySQL and PgSQL connection pool allocation logic now enforces the server's maximum connection limit by computing total alive connections (used plus free) and removing a selected free connection from the pool if the count would meet or exceed the cap before allocating a new connection.

Changes

Connection Allocation Cap Enforcement

Layer / File(s) Summary
Pre-Allocation Max Connections Cap
lib/MySrvConnList.cpp, lib/PgSQL_HostGroups_Manager.cpp
When connection_quality_level == 1 and decision.create_new_connection is true, compute alive = conns_used + conns_free. If alive >= mysrvc->max_connections, remove and delete the previously-selected free connection at conn_found_idx before allocating the new connection.
Test Groups
test/tap/groups/groups.json
Add mysql_max_connections_cap_violation-t and pgsql-max_connections_cap_violation-t group mappings.
MySQL Test Helpers & Discovery
test/tap/tests/mysql_max_connections_cap_violation-t.cpp
Add connection helpers, exec/query helpers, server loader, backend-count helper, admin startup and hostgroup discovery.
MySQL Test Flow & Assertions
test/tap/tests/mysql_max_connections_cap_violation-t.cpp
Drain-and-reinsert target server (max_connections=3), create pinned and short-lived/victim clients, observe backend session counts while victim is open, assert sessions <= MAX_CONN, cleanup.
PgSQL Test Helpers & Discovery
test/tap/tests/pgsql-max_connections_cap_violation-t.cpp
Add libpq connection helper, exec/query helpers, server loader, backend-count helper, MAX_CONN and types.
PgSQL Test Flow & Assertions
test/tap/tests/pgsql-max_connections_cap_violation-t.cpp
Drain-and-reinsert target server (max_connections=3), create client sequences to pollute free connections and trigger allocation, observe pg_stat_activity via BACKEND_DIRECT, assert sessions <= MAX_CONN, cleanup.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related issues

Possibly related PRs

  • sysown/proxysql#5502: Overlaps with changes to get_random_MyConn and eviction/create_new_connection behavior; related refactor/logic adjustments.

Suggested reviewers

  • renecannao

🐰 I nudged one free and let one go,
So the pool stays neat and numbers low.
Caps held steady, no extra fuss,
A rabbit cheers for orderly us!
🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 52.94% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly summarizes the main fix: addressing a max_connections cap violation bug in the pool matcher logic across both MySQL and PgSQL paths.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch v3.0_cap_violation_5767

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the connection management logic for MySQL and PostgreSQL to enforce a cap on the total number of alive connections (Used + Free) by evicting a free connection when a new one is required. Feedback indicates that the updated server selection logic is flawed because it could allow the number of active connections to exceed the limit when free connections are available. Additionally, the calculation of total alive connections in the connection list managers is redundant as the counts are already available in existing local variables.

Comment thread lib/MyHGC.cpp Outdated
Comment on lines +58 to +64
// Cap is on alive connections (Used + Free). Allow when there is
// room, or when Free > 0 (a free conn can be reused or swapped in
// case-1, keeping the alive total bounded).
if (
(mysrvc->ConnectionsUsed->conns_length() + mysrvc->ConnectionsFree->conns_length()) < mysrvc->max_connections
|| mysrvc->ConnectionsFree->conns_length() > 0
) { // consider this server only if didn't reach max_connections

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The updated gate logic (Used + Free) < max || Free > 0 is problematic. If Used == max and Free > 0, this condition evaluates to true, allowing a new request to proceed. This request will eventually result in Used becoming max + 1 (either by reusing a free connection or swapping one), which violates the intended limit on active backend connections. The original logic Used < max is sufficient to allow requests when there is room for another active connection, and the "swap" logic added in MySrvConnList.cpp ensures that the total number of connections (Used + Free) does not exceed the cap.

					if (mysrvc->ConnectionsUsed->conns_length() < mysrvc->max_connections) { // consider this server only if didn't reach max_connections

Comment thread lib/MySrvConnList.cpp Outdated
Comment on lines +256 to +258
unsigned int alive =
mysrvc->ConnectionsUsed->conns_length() +
mysrvc->ConnectionsFree->conns_length();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant calculation of connection counts. The variables conns_free and conns_used were already initialized at lines 194-195 and remain valid here as no modifications to the connection lists have occurred in this execution path.

						unsigned int alive = conns_used + conns_free;

Comment thread lib/PgSQL_HostGroups_Manager.cpp Outdated
Comment on lines +1958 to +1964
// Cap is on alive connections (Used + Free). Allow when there is
// room, or when Free > 0 (a free conn can be reused or swapped in
// case-1, keeping the alive total bounded).
if (
(mysrvc->ConnectionsUsed->conns_length() + mysrvc->ConnectionsFree->conns_length()) < mysrvc->max_connections
|| mysrvc->ConnectionsFree->conns_length() > 0
) { // consider this server only if didn't reach max_connections

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The updated gate logic (Used + Free) < max || Free > 0 is problematic. If Used == max and Free > 0, this condition evaluates to true, allowing a new request to proceed. This request will eventually result in Used becoming max + 1 (either by reusing a free connection or swapping one), which violates the intended limit on active backend connections. The original logic Used < max is sufficient to allow requests when there is room for another active connection, and the "swap" logic added in PgSQL_SrvConnList::get_random_MyConn ensures that the total number of connections (Used + Free) does not exceed the cap.

				if (mysrvc->ConnectionsUsed->conns_length() < mysrvc->max_connections) { // consider this server only if didn't reach max_connections

Comment thread lib/PgSQL_HostGroups_Manager.cpp Outdated
Comment on lines +2428 to +2430
unsigned int alive =
mysrvc->ConnectionsUsed->conns_length() +
mysrvc->ConnectionsFree->conns_length();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant calculation of connection counts. The variables conns_free and conns_used were already initialized at lines 2367-2368 and remain valid here as no modifications to the connection lists have occurred in this execution path.

						unsigned int alive = conns_used + conns_free;

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/MySrvConnList.cpp`:
- Around line 250-260: The code currently calls delete on a raw pointer returned
by conns->remove_index_fast(conn_found_idx) (stored in MySQL_Connection* stale)
which SonarCloud flags; change this to transfer the returned pointer into a
scoped std::unique_ptr<MySQL_Connection> (e.g., construct a unique_ptr from
conns->remove_index_fast(...)) so the connection is automatically deleted when
the unique_ptr goes out of scope; ensure <memory> is included and no other
behavior changes in the cap-enforcement block that references conns, conns_used,
conns_free, or mysrvc->max_connections.

In `@lib/PgSQL_HostGroups_Manager.cpp`:
- Around line 2416-2426: Replace the bare delete on the removed connection with
a scoped std::unique_ptr to satisfy the SonarCloud rule: after calling
conns->remove_index_fast(conn_found_idx) wrap the returned PgSQL_Connection* in
a std::unique_ptr<PgSQL_Connection> (e.g. unique_ptr<PgSQL_Connection>
stale(ptr)) so the object is deleted automatically when it goes out of scope;
ensure <memory> is included and leave the surrounding logic using alive,
conns_used, conns_free, mysrvc->max_connections, and conn_found_idx unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e441f6ee-177b-4d28-8158-600beb6a47a3

📥 Commits

Reviewing files that changed from the base of the PR and between 16f2a23 and 08aca05.

📒 Files selected for processing (2)
  • lib/MySrvConnList.cpp
  • lib/PgSQL_HostGroups_Manager.cpp
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: CI-builds / builds (debian12,-dbg)
  • GitHub Check: CI-builds / builds (ubuntu22,-tap)
  • GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
  • GitHub Check: run / trigger
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names must use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables must use snake_case naming convention
Constants and macros must use UPPER_SNAKE_CASE naming convention
Use conditional compilation via #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE for feature tier gating in core code

Files:

  • lib/PgSQL_HostGroups_Manager.cpp
  • lib/MySrvConnList.cpp
🪛 GitHub Check: SonarCloud Code Analysis
lib/PgSQL_HostGroups_Manager.cpp

[failure] 2425-2425: Rewrite the code so that you no longer need this "delete".

See more on https://sonarcloud.io/project/issues?id=sysown_proxysql&issues=AZ4HxvHUAn4pVuhofrTz&open=AZ4HxvHUAn4pVuhofrTz&pullRequest=5769

lib/MySrvConnList.cpp

[failure] 259-259: Rewrite the code so that you no longer need this "delete".

See more on https://sonarcloud.io/project/issues?id=sysown_proxysql&issues=AZ4HxvUWAn4pVuhofrT1&open=AZ4HxvUWAn4pVuhofrT1&pullRequest=5769

🔇 Additional comments (1)
lib/PgSQL_HostGroups_Manager.cpp (1)

1955-1960: ⚡ Quick win

PgSQL gate fix is absent — free connections stranded when Used == max_connections

The gate at line 1958 in PgSQL_HostGroups_Manager.cpp::get_random_MySrvC does not admit servers when ConnectionsFree > 0 while ConnectionsUsed == max_connections. Without this condition, free connections are permanently stranded for new clients, causing unnecessary NULL returns from get_MyConn_from_pool.

🔧 Proposed gate fix
-				if (mysrvc->ConnectionsUsed->conns_length() < mysrvc->max_connections) { // consider this server only if didn't reach max_connections
+				if (mysrvc->ConnectionsUsed->conns_length() < (unsigned int)mysrvc->max_connections
+				    || mysrvc->ConnectionsFree->conns_length() > 0) { // admit when room for new conn OR free conns available for reuse

Verify whether the equivalent gate change was applied in MySQL_HostGroups_Manager.cpp::MyHGC::get_random_MySrvC.

Comment thread lib/MySrvConnList.cpp
Comment on lines +250 to +260
// Only swap when we'd otherwise exceed the cap.
// When alive < max_connections there is room for the
// new conn, so keep the misfit free conn around — a
// later client may match it perfectly and reuse it.
// When alive >= max_connections we MUST delete the
// misfit first to keep alive bounded by max.
unsigned int alive = conns_used + conns_free;
if (alive >= (unsigned int)mysrvc->max_connections) {
MySQL_Connection* stale = (MySQL_Connection *)conns->remove_index_fast(conn_found_idx);
delete stale;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

SonarCloud CI failure: replace delete stale with a scoped unique_ptr

The cap-enforcement logic is correct and symmetric with the PgSQL fix. SonarCloud flags line 259 (delete stale) with the same rule as the PgSQL file, blocking CI.

🔧 Proposed fix
-						MySQL_Connection* stale = (MySQL_Connection *)conns->remove_index_fast(conn_found_idx);
-						delete stale;
+						std::unique_ptr<MySQL_Connection> stale(
+							static_cast<MySQL_Connection*>(conns->remove_index_fast(conn_found_idx))
+						);
🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis

[failure] 259-259: Rewrite the code so that you no longer need this "delete".

See more on https://sonarcloud.io/project/issues?id=sysown_proxysql&issues=AZ4HxvUWAn4pVuhofrT1&open=AZ4HxvUWAn4pVuhofrT1&pullRequest=5769

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/MySrvConnList.cpp` around lines 250 - 260, The code currently calls
delete on a raw pointer returned by conns->remove_index_fast(conn_found_idx)
(stored in MySQL_Connection* stale) which SonarCloud flags; change this to
transfer the returned pointer into a scoped std::unique_ptr<MySQL_Connection>
(e.g., construct a unique_ptr from conns->remove_index_fast(...)) so the
connection is automatically deleted when the unique_ptr goes out of scope;
ensure <memory> is included and no other behavior changes in the cap-enforcement
block that references conns, conns_used, conns_free, or mysrvc->max_connections.

Comment on lines +2416 to +2426
// Only swap when we'd otherwise exceed the cap.
// When alive < max_connections there is room for the
// new conn, so keep the misfit free conn around — a
// later client may match it perfectly and reuse it.
// When alive >= max_connections we MUST delete the
// misfit first to keep alive bounded by max.
unsigned int alive = conns_used + conns_free;
if (alive >= (unsigned int)mysrvc->max_connections) {
PgSQL_Connection* stale = (PgSQL_Connection *)conns->remove_index_fast(conn_found_idx);
delete stale;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

SonarCloud CI failure: replace delete stale with a scoped unique_ptr

The swap-before-allocate logic is correct — conns_free/conns_used are captured before the read-only inner search and all of this runs under wrlock(), so the cached values used for alive are accurate. However, SonarCloud flags line 2425 (delete stale), blocking the CI check.

A local unique_ptr satisfies the rule with no semantic change:

🔧 Proposed fix
-						PgSQL_Connection* stale = (PgSQL_Connection *)conns->remove_index_fast(conn_found_idx);
-						delete stale;
+						std::unique_ptr<PgSQL_Connection> stale(
+							static_cast<PgSQL_Connection*>(conns->remove_index_fast(conn_found_idx))
+						);
🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis

[failure] 2425-2425: Rewrite the code so that you no longer need this "delete".

See more on https://sonarcloud.io/project/issues?id=sysown_proxysql&issues=AZ4HxvHUAn4pVuhofrTz&open=AZ4HxvHUAn4pVuhofrTz&pullRequest=5769

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/PgSQL_HostGroups_Manager.cpp` around lines 2416 - 2426, Replace the bare
delete on the removed connection with a scoped std::unique_ptr to satisfy the
SonarCloud rule: after calling conns->remove_index_fast(conn_found_idx) wrap the
returned PgSQL_Connection* in a std::unique_ptr<PgSQL_Connection> (e.g.
unique_ptr<PgSQL_Connection> stale(ptr)) so the object is deleted automatically
when it goes out of scope; ensure <memory> is included and leave the surrounding
logic using alive, conns_used, conns_free, mysrvc->max_connections, and
conn_found_idx unchanged.

@sonarqubecloud

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
test/tap/tests/pgsql-max_connections_cap_violation-t.cpp (1)

316-316: ⚡ Quick win

Prefer bounded polling over fixed 300ms sleeps.

The fixed waits at Line 316, Line 348, and Line 358 can make this regression timing-sensitive in slower CI runs. A short poll-with-timeout around the expected state is more stable.

Also applies to: 348-349, 358-359

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp` at line 316,
Replace the fixed std::this_thread::sleep_for(300ms) in the test (file
pgsql-max_connections_cap_violation-t.cpp) with a bounded poll-with-timeout loop
that repeatedly checks the expected condition (e.g., connection count/state)
until it becomes true or a deadline elapses; implement this as a small helper
(or inline loop) using std::chrono::steady_clock, short sleeps (e.g., 10–50ms)
between polls, and a clear timeout (e.g., several seconds) and apply the same
replacement for the other occurrences around Lines 348–349 and 358–359 so the
test waits deterministically instead of fixed 300ms sleeps.
test/tap/tests/mysql_max_connections_cap_violation-t.cpp (1)

256-256: ⚡ Quick win

Use polling instead of fixed 300ms sleeps for state checks.

The fixed delays at Line 256, Line 284, and Line 298 can introduce CI flakiness. Polling with a bounded timeout for expected backend-count transitions is more deterministic.

Also applies to: 284-285, 298-299

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp` at line 256,
Replace the ad-hoc std::this_thread::sleep_for(std::chrono::milliseconds(300))
calls with a bounded polling loop that repeatedly checks the expected
backend-count transition until it succeeds or a timeout elapses; for each
occurrence of std::this_thread::sleep_for(...) in
mysql_max_connections_cap_violation-t.cpp, implement a small helper (inline in
the test) that polls the actual backend count (the same check used immediately
after the sleep) every ~25–100ms and returns success if the count matches the
expected value, otherwise fail the test after a configurable timeout (e.g.
2–5s); remove the fixed 300ms sleeps at the three locations and use this polling
helper to make the assertions deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp`:
- Around line 193-197: The setup uses exec_ok(admin, "...") but ignores its
boolean result; if these calls fail the test should immediately abort to avoid
running with bad preconditions. Update each exec_ok call in
mysql_max_connections_cap_violation-t.cpp (notably the block around the "SET
mysql-monitor_enabled" / "LOAD MYSQL VARIABLES TO RUNTIME" calls and the other
spots around lines 239-247 and 249-255) to check the returned bool and abort the
test on failure (e.g., fail the test/return early or call the existing test
abort helper) so that failures in exec_ok(admin, ...) stop further execution.

In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp`:
- Around line 253-257: Several critical setup calls use exec_ok(admin.get(),
"...") but ignore its boolean result; change each of those calls (e.g., the two
calls disabling the monitor using exec_ok(admin.get(), "SET
pgsql-monitor_enabled='false'") and "LOAD PGSQL VARIABLES TO RUNTIME", plus the
blocks around lines 299-307 and 309-315) to fail fast: check the return value
and abort the test immediately on false (for example, return false / throw /
call t->fatal/exit depending on the test harness) so the test does not continue
in a partially configured state; update every occurrence of exec_ok(...) in
these setup sections to perform this check.

---

Nitpick comments:
In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp`:
- Line 256: Replace the ad-hoc
std::this_thread::sleep_for(std::chrono::milliseconds(300)) calls with a bounded
polling loop that repeatedly checks the expected backend-count transition until
it succeeds or a timeout elapses; for each occurrence of
std::this_thread::sleep_for(...) in mysql_max_connections_cap_violation-t.cpp,
implement a small helper (inline in the test) that polls the actual backend
count (the same check used immediately after the sleep) every ~25–100ms and
returns success if the count matches the expected value, otherwise fail the test
after a configurable timeout (e.g. 2–5s); remove the fixed 300ms sleeps at the
three locations and use this polling helper to make the assertions
deterministic.

In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp`:
- Line 316: Replace the fixed std::this_thread::sleep_for(300ms) in the test
(file pgsql-max_connections_cap_violation-t.cpp) with a bounded
poll-with-timeout loop that repeatedly checks the expected condition (e.g.,
connection count/state) until it becomes true or a deadline elapses; implement
this as a small helper (or inline loop) using std::chrono::steady_clock, short
sleeps (e.g., 10–50ms) between polls, and a clear timeout (e.g., several
seconds) and apply the same replacement for the other occurrences around Lines
348–349 and 358–359 so the test waits deterministically instead of fixed 300ms
sleeps.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 53a5be1c-63ca-47cb-9264-1f550de55b24

📥 Commits

Reviewing files that changed from the base of the PR and between 08aca05 and b1faec5.

📒 Files selected for processing (3)
  • test/tap/groups/groups.json
  • test/tap/tests/mysql_max_connections_cap_violation-t.cpp
  • test/tap/tests/pgsql-max_connections_cap_violation-t.cpp
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: CI-builds / builds (debian12,-dbg)
  • GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
  • GitHub Check: CI-builds / builds (ubuntu22,-tap)
  • GitHub Check: run / trigger
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names must use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables must use snake_case naming convention
Constants and macros must use UPPER_SNAKE_CASE naming convention
Use conditional compilation via #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE for feature tier gating in core code

Files:

  • test/tap/tests/pgsql-max_connections_cap_violation-t.cpp
  • test/tap/tests/mysql_max_connections_cap_violation-t.cpp
test/tap/tests/*-t.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test binaries are built via pattern rule in test/tap/tests/Makefile: make <testname>-t compiles <testname>-t.cpp into <testname>-t without requiring explicit Makefile targets

Files:

  • test/tap/tests/pgsql-max_connections_cap_violation-t.cpp
  • test/tap/tests/mysql_max_connections_cap_violation-t.cpp
🧠 Learnings (1)
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

  • test/tap/tests/pgsql-max_connections_cap_violation-t.cpp
  • test/tap/tests/mysql_max_connections_cap_violation-t.cpp
🔇 Additional comments (1)
test/tap/groups/groups.json (1)

108-108: Test group wiring looks correct.

Both new test binaries are mapped with valid group entries and follow the existing groups.json conventions.

Also applies to: 154-154

Comment on lines +193 to +197
// Disable monitor for the duration of the test. We never SAVE TO DISK
// here; proxysql-tester.py reloads mysql-monitor_enabled from disk
// before the next test runs, which restores the original value.
exec_ok(admin, "SET mysql-monitor_enabled='false'");
exec_ok(admin, "LOAD MYSQL VARIABLES TO RUNTIME");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Abort the test when setup mutations fail.

At Line 196 onward, critical setup operations call exec_ok(...) but ignore the boolean result. If one step fails, the test may proceed with invalid preconditions.

Suggested minimal pattern
+static bool must_exec(MYSQL* m, const string& q) {
+	if (!exec_ok(m, q)) {
+		diag("setup step failed, aborting test: %s", q.c_str());
+		return false;
+	}
+	return true;
+}
...
-	exec_ok(admin, "SET mysql-monitor_enabled='false'");
-	exec_ok(admin, "LOAD MYSQL VARIABLES TO RUNTIME");
+	if (!must_exec(admin, "SET mysql-monitor_enabled='false'")) { mysql_close(admin); return exit_status(); }
+	if (!must_exec(admin, "LOAD MYSQL VARIABLES TO RUNTIME")) { mysql_close(admin); return exit_status(); }

Also applies to: 239-247, 249-255

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/mysql_max_connections_cap_violation-t.cpp` around lines 193 -
197, The setup uses exec_ok(admin, "...") but ignores its boolean result; if
these calls fail the test should immediately abort to avoid running with bad
preconditions. Update each exec_ok call in
mysql_max_connections_cap_violation-t.cpp (notably the block around the "SET
mysql-monitor_enabled" / "LOAD MYSQL VARIABLES TO RUNTIME" calls and the other
spots around lines 239-247 and 249-255) to check the returned bool and abort the
test on failure (e.g., fail the test/return early or call the existing test
abort helper) so that failures in exec_ok(admin, ...) stop further execution.

Comment on lines +253 to +257
// Disable monitor for the duration of the test. We never SAVE TO DISK
// here; proxysql-tester.py reloads pgsql-monitor_enabled from disk
// before the next test runs, which restores the original value.
exec_ok(admin.get(), "SET pgsql-monitor_enabled='false'");
exec_ok(admin.get(), "LOAD PGSQL VARIABLES TO RUNTIME");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fail fast if setup SQL fails.

At Line 256 onward, several critical setup mutations ignore exec_ok(...) return values. If one fails, the test can continue in a partially configured state and produce misleading results.

Suggested minimal pattern
+static bool must_exec(PGconn* c, const string& q) {
+	if (!exec_ok(c, q)) {
+		diag("setup step failed, aborting test: %s", q.c_str());
+		return false;
+	}
+	return true;
+}
...
-	exec_ok(admin.get(), "SET pgsql-monitor_enabled='false'");
-	exec_ok(admin.get(), "LOAD PGSQL VARIABLES TO RUNTIME");
+	if (!must_exec(admin.get(), "SET pgsql-monitor_enabled='false'")) return exit_status();
+	if (!must_exec(admin.get(), "LOAD PGSQL VARIABLES TO RUNTIME")) return exit_status();

Also applies to: 299-307, 309-315

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/pgsql-max_connections_cap_violation-t.cpp` around lines 253 -
257, Several critical setup calls use exec_ok(admin.get(), "...") but ignore its
boolean result; change each of those calls (e.g., the two calls disabling the
monitor using exec_ok(admin.get(), "SET pgsql-monitor_enabled='false'") and
"LOAD PGSQL VARIABLES TO RUNTIME", plus the blocks around lines 299-307 and
309-315) to fail fast: check the return value and abort the test immediately on
false (for example, return false / throw / call t->fatal/exit depending on the
test harness) so the test does not continue in a partially configured state;
update every occurrence of exec_ok(...) in these setup sections to perform this
check.

@rahim-kanji rahim-kanji marked this pull request as ready for review May 10, 2026 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Backend connection exceeds pgsql_servers/mysql_servers max_connections

1 participant