
Fix agent work with sql skill#11536

Merged
ea-rus merged 14 commits into develop from agent-sql-fix
Sep 16, 2025

Conversation

@ea-rus ea-rus commented Sep 8, 2025

Contributor

Description

Updates:

  • Updated the instructions to force the LLM to use MySQL quoting style (backticks for identifiers)
    • also removed the mention of the Postgres database from the main prompt for the same reason
  • Disabled sql_parser_tool. @dusvyat, do we have any benchmarks we could use to check that this change doesn't degrade quality?
    • it adds an extra step for the LLM before executing a query (which increases response time).
    • its benefit is also unclear: it just parses the query and renders it again, and db_query also reports an error when parsing fails
  • Fixed an error with PostgreSQL when column names are in mixed case (commit)
  • Removed an unused line from the prompt in the main template (Here is the user's question:)
  • Updated the error message shown when a wrong database name is used (instead of Table name should contain only one part)
  • Accept wrong quoting for 3-item identifiers (handle cases like db.schema.table)
  • Fixed getting table info for 3-item identifiers (db.schema.table)
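
For illustration, the MySQL quoting style from the first item could be sketched as below. The helper names `quote_mysql` and `render_identifier` are hypothetical and are not taken from the MindsDB code base; the sketch only assumes that each part of an identifier is quoted separately and that an embedded backtick is escaped by doubling it.

```python
from typing import Sequence


def quote_mysql(part: str) -> str:
    """Wrap one identifier part in backticks, doubling any embedded backtick."""
    return "`" + part.replace("`", "``") + "`"


def render_identifier(parts: Sequence[str]) -> str:
    """Join identifier parts (e.g. database, schema, table), quoting each part on its own."""
    return ".".join(quote_mysql(part) for part in parts)


# Hypothetical 3-item identifier: database, schema, table.
print(render_identifier(["my_db", "public", "Orders"]))
```

Quoting each part separately keeps the dots outside the backticks, so a 3-item name is still read as database, schema and table rather than as one long name.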

Tested with queries from issues below:

-- for CONN-1442
select * from financial_agent where question = "What stocks are in the portfolio?";
select * from financial_agent where question = "What's the percentage breakdown of investments by category in the portfolio";
select * from financial_agent where question = "What category has the most exposure?";
select * from financial_agent where question = "Leadership has mandated that we limit our exposure to crypto to 10%. Is the portfolio compliant and if not, what must I do to make it compliant?";

-- for CONN-1403
select * from supplychain_agent where question ="What is the current stock level of a specific product: Radiance Ritual?";
select * from supplychain_agent where question ="Which products are low in stock and need reordering?";
select * from supplychain_agent where question ="What are the recent shipment details for a SKU2?";
select * from supplychain_agent where question ="How long does it typically take for a SKU2 to be delivered?";
select * from supplychain_agent where question ="What are the details of SKU5, such as its price and description?";
select * from supplychain_agent where question ="Can you list all the products from a cosmetics category?";
select * from supplychain_agent where question ="Who are the suppliers for SKU5?";
select * from supplychain_agent where question ="What are the contact details of supplier of SKU5?";
select * from supplychain_agent where question ="Can you provide a summary of the inventory status?";
select * from supplychain_agent where question ="What are the top-selling products?";

-- for CONN-1500
select * from my_sf_agent where question = 'show me the top 10 customers by total order value';
select * from my_sf_agent where question = 'which suppliers provide parts to a specific region';
select * from my_sf_agent where question = 'what the most frequient ordered parts';
select * from my_sf_agent where question = 'what are the columns of table customers?';

-- for CONN-1459
select * from my_agent where question = 'does O_ORDERPRIORITY column has a null values?';

Fixes https://linear.app/mindsdb/issue/CONN-1442/bug-agentsintermittent-issues-in-executing-sql-query (#11457)

  • updated the instructions to use the MySQL dialect and fixed getting sample rows
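
As background for the mixed-case fix: PostgreSQL folds unquoted identifiers to lower case, so a mixed-case column name has to be double-quoted to be found. A minimal sketch of that rule, with a hypothetical helper `pg_quote_if_needed` that is not part of this PR and does not handle reserved words:

```python
import re

# Names made only of lower-case letters, digits and underscores survive
# PostgreSQL's case folding unchanged, so they can stay unquoted.
_FOLD_SAFE = re.compile(r"[a-z_][a-z0-9_]*")


def pg_quote_if_needed(name: str) -> str:
    """Double-quote a column name unless case folding would leave it unchanged."""
    if _FOLD_SAFE.fullmatch(name):
        return name
    # Embedded double quotes are escaped by doubling them.
    return '"' + name.replace('"', '""') + '"'


# Hypothetical column names.
print(pg_quote_if_needed("order_id"))
print(pg_quote_if_needed("OrderPriority"))
```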

Fixes https://linear.app/mindsdb/issue/CONN-1403/agents-unable-to-retrieve-data-from-certain-tables-it-has-access-to

  • fixed the error message shown when a wrong database is used
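
The kind of message this item is about could look like the sketch below. The function `resolve_table`, its arguments and the message wording are all assumptions made for illustration; the actual check and text in MindsDB may differ.

```python
from typing import Iterable, List, Tuple


def resolve_table(parts: List[str], known_databases: Iterable[str]) -> Tuple[str, List[str]]:
    """Split identifier parts into (database, remaining path) or raise a descriptive error."""
    full_name = ".".join(parts)
    if len(parts) < 2:
        raise ValueError(f"Table name '{full_name}' must include a database, e.g. 'my_db.my_table'")
    database, rest = parts[0], parts[1:]
    known = sorted(known_databases)
    if database not in known:
        # Name the offending database and list the valid ones, so the LLM can retry.
        raise ValueError(
            f"Database '{database}' not found in '{full_name}'. "
            f"Available databases: {', '.join(known)}"
        )
    return database, rest


# Hypothetical database names.
try:
    resolve_table(["wrong_db", "orders"], {"my_db", "files"})
except ValueError as error:
    print(error)
```

A message that names the wrong database gives the agent something it can act on in the next step, which a generic complaint about the number of name parts does not.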

Fixes https://linear.app/mindsdb/issue/CONN-1500/bug-table-not-found-for-snowflake-tables (#11529)
Fixes https://linear.app/mindsdb/issue/CONN-1459/bug-frequently-getting-table-tbl-not-found (#11394)

  • fixes for 3-item identifier
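
A rough sketch of what tolerating wrong quoting for a 3-item identifier can mean, assuming the problematic input is a dotted name wrapped in a single pair of quotes. `split_identifier` is a hypothetical helper, not the function changed in this PR, and it deliberately treats a single quoted name containing dots as a mis-quoted path, which would be wrong for a table whose name really contains a dot.

```python
import re
from typing import List

# One identifier part: backtick-quoted, double-quoted, or a bare run of characters.
_PART = re.compile(r'`([^`]*)`|"([^"]*)"|([^.`"\s]+)')


def split_identifier(raw: str) -> List[str]:
    """Split a possibly mis-quoted dotted identifier into its parts."""
    parts = [
        next(group for group in match.groups() if group is not None)
        for match in _PART.finditer(raw)
    ]
    # `db.schema.table` quoted as a whole: split the single part on dots.
    if len(parts) == 1 and "." in parts[0]:
        parts = parts[0].split(".")
    return parts


print(split_identifier("`db.schema.table`"))
print(split_identifier("`db`.`schema`.`table`"))
```

Both calls yield the same three parts, so a later table lookup can compare identifiers part by part regardless of how the LLM quoted them.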

Type of change

  • 🐛 Bug fix (non-breaking change which fixes an issue)

Verification Process

To ensure the changes are working as expected:

  • Test Location: Specify the URL or path for testing.
  • Verification Steps: Outline the steps or queries needed to validate the change. Include any data, configurations, or actions required to reproduce or see the new functionality.

Additional Media:

  • I have attached a brief loom video or screenshots showcasing the new functionality or change.

Checklist:

  • My code follows the style guidelines(PEP 8) of MindsDB.
  • I have appropriately commented on my code, especially in complex areas.
  • Necessary documentation updates are either made or tracked in issues.
  • Relevant unit and integration tests are updated or added.

@ea-rus ea-rus requested review from StpMax and dusvyat September 8, 2025 11:18
@entelligence-ai-pr-reviews

Contributor

🔒 Entelligence AI Vulnerability Scanner

No security vulnerabilities found!

Your code passed our comprehensive security analysis.

📊 Files Analyzed: 3 files


@entelligence-ai-pr-reviews

Contributor

Review Summary

🏷️ Draft Comments (2)

Skipped posting 2 draft comments that were valid but scored below your review threshold (>=13/15). Feel free to update them here.

mindsdb/interfaces/skills/custom/text2sql/mindsdb_sql_toolkit.py (1)

21-102: get_tools constructs large multiline string SQL instructions on every call, causing repeated CPU and memory overhead for each agent/toolkit instantiation.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 4/5
  • Urgency Impact: 2/5
  • Total Score: 9/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor mindsdb/interfaces/skills/custom/text2sql/mindsdb_sql_toolkit.py lines 21-102: Move the large multiline SQL instruction string templates (especially for query_sql_database_tool_description) to class-level constants or static variables, and use .format() to inject dynamic values as needed. This avoids reconstructing large strings on every get_tools() call, reducing repeated CPU and memory overhead for each agent/toolkit instantiation. Preserve all logic and formatting.
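
The refactor suggested in this draft comment might look roughly like the sketch below. The class `SqlToolkitSketch`, the template text and the tool name are placeholders invented for illustration and do not reproduce the real `mindsdb_sql_toolkit.py`.

```python
from typing import Dict, Iterable, List


class SqlToolkitSketch:
    # Defined once when the class is created and shared by all instances.
    QUERY_TOOL_DESCRIPTION_TEMPLATE = (
        "Input: a SQL query in MySQL dialect.\n"
        "Quote identifiers with backticks.\n"
        "Available tables: {tables}\n"
    )

    def __init__(self, tables: Iterable[str]) -> None:
        self.tables = list(tables)

    def get_tools(self) -> List[Dict[str, str]]:
        # Only the dynamic values are injected on each call.
        description = self.QUERY_TOOL_DESCRIPTION_TEMPLATE.format(
            tables=", ".join(self.tables)
        )
        return [{"name": "sql_db_query", "description": description}]


toolkit = SqlToolkitSketch(["my_db.orders", "my_db.customers"])
print(toolkit.get_tools()[0]["description"])
```

Plain string literals inside a function body are already stored as constants in the compiled code, and `.format()` still runs on every call, so the runtime saving is probably small; the main gain is likely that the prompt text lives in one place.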

mindsdb/interfaces/skills/sql_agent.py (1)

588-598: _get_sample_rows now uses select * instead of selecting only the provided fields, which can cause column order mismatches and incorrect CSV headers if the table has more columns than fields.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In mindsdb/interfaces/skills/sql_agent.py, lines 588-598, the `_get_sample_rows` method was changed to use `select *` instead of selecting only the provided `fields`. This can cause the returned sample rows to have columns in a different order or extra columns, leading to mismatches with the CSV header and potentially incorrect data shown to users. Please change the SQL command back to `select {', '.join(fields)} from {table} limit {self._sample_rows_in_table_info};` to ensure the sample rows match the expected fields.
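
The mismatch described in this draft comment can be reproduced in miniature with the standard-library `sqlite3` module standing in for the real data source. `sample_rows_csv` is a hypothetical helper, not the actual `_get_sample_rows`, and it does not escape quotes embedded in names.

```python
import csv
import io
import sqlite3
from typing import List


def sample_rows_csv(conn: sqlite3.Connection, table: str, fields: List[str], limit: int = 3) -> str:
    """Return sample rows as CSV whose header matches the selected columns."""
    # Selecting the fields explicitly keeps the column order equal to the header order.
    columns = ", ".join(f'"{name}"' for name in fields)
    cursor = conn.execute(f'SELECT {columns} FROM "{table}" LIMIT ?', (limit,))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(cursor.fetchall())
    return buffer.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE orders (id INTEGER, "Customer" TEXT, total REAL)')
conn.execute("INSERT INTO orders VALUES (1, 'Ann', 9.5)")
print(sample_rows_csv(conn, "orders", ["total", "id"]))
```

With `select *` the same row would come back as `id`, `Customer`, `total`, which no longer lines up with a two-column header. If `select *` is kept, taking the header from `cursor.description` instead of `fields` is another way to keep the two in sync.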

@ea-rus ea-rus changed the title Fix agent access to sql fixes sql fix Fix agent work with sql tables Sep 8, 2025
@ea-rus ea-rus changed the title Fix agent work with sql tables Fix agent work with sql skill Sep 8, 2025
ea-rus commented Sep 12, 2025

Contributor Author

@StpMax, @dusvyat, the PR is ready for review; I'm not going to add anything else to it.

dusvyat previously approved these changes Sep 12, 2025
@ea-rus ea-rus merged commit c2b4da0 into develop Sep 16, 2025
18 checks passed
@ea-rus ea-rus deleted the agent-sql-fix branch September 16, 2025 04:23
@github-actions github-actions Bot locked and limited conversation to collaborators Sep 16, 2025
3 participants