Skip to content

Fix all failing fixture tests and improve formatting#4

Merged
gmr merged 4 commits intomainfrom
fix/failing-tests
Mar 30, 2026
Merged

Fix all failing fixture tests and improve formatting#4
gmr merged 4 commits intomainfrom
fix/failing-tests

Conversation

@gmr
Copy link
Copy Markdown
Owner

@gmr gmr commented Mar 30, 2026

Summary

  • Remove KNOWN_FAILING mechanism — all 65 fixture pairs now run with real assertions, no silent pass-on-failure
  • Fix expression formatting: :: typecast spacing, ARRAY[...], CAST():: conversion, COALESCE/function-like formatting, RANK() OVER() window functions, ANY/ALL/SOME spacing, redundant paren stripping, bare alias AS insertion
  • Fix statement formatting: CREATE FUNCTION dollar-quoting, CREATE TABLE WITH duplication and constraint flattening, CREATE VIEW trailing space, CREATE FOREIGN TABLE full implementation, TIMESTAMP WITH TIME ZONE type handling
  • Fix SELECT/CTE formatting: DISTINCT river alignment, CTE body width inheritance, nested WITH river alignment
  • Add wrap_case_else style config (AWeber/mattmc3 wrap CASE+ELSE, others don't)
  • Change default style from river to AWeber
  • Link all styles to source documentation in README
  • Add dump_tree and format_test development examples

Test plan

  • All 65 fixture pairs pass (cargo test all_fixture_pairs)
  • All 56 individual fixture tests pass
  • All 6 smoke tests pass
  • All 3 paren tests pass
  • All 6 doc-tests pass
  • cargo clippy -- -D warnings clean
  • cargo fmt --check clean

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • AWeber is now the default formatting style.
    • Improved CASE expression wrapping and river alignment for SELECT/CTE layouts.
  • Bug Fixes

    • Better handling of arrays, typecasts, window functions, DISTINCT placement, and various CREATE statements formatting.
  • Documentation

    • Updated style table with linked style names and adjusted style examples.
  • Examples

    • Added command-line example tools for dumping parse trees and formatting tests.
  • Tests

    • Test runner no longer treats known failures as expected; failures are reported uniformly.

gmr and others added 2 commits March 30, 2026 11:13
Remove the KNOWN_FAILING list that silently passed failing tests. All 65
fixture pairs now run with real assertions.

Expression fixes:
- Handle any_name nodes (fixes domain name truncation)
- Typecast :: no longer adds spaces in a_expr and a_expr_prec
- ARRAY[...] expressions formatted correctly with keyword merging
- CAST(expr AS type) converted to expr::type notation
- COALESCE/GREATEST/LEAST formatted as KEYWORD(args) not KEYWORD ( args )
- Decimal fragment merging for tree-sitter ERROR nodes (0 + .00 → 0.00)
- RANK() OVER() window functions: attach over_clause from func_expr level
- Window specification traversal fixed to use window_specification children
- ANY/ALL/SOME get space before opening paren
- VALUE pseudo-variable always lowercased in domain CHECK constraints
- Strip redundant parens around single simple expressions
- kw_numeric added to recognized type keywords
- Prevent leading space when base type is empty with extra_keywords
- Always add AS keyword for bare table aliases

Statement fixes:
- CREATE FUNCTION: AS $$ on same line, body indented, $$ on own line
- CREATE TABLE: fix WITH WITH duplication, flatten nested ColQualList
- CREATE VIEW: remove trailing space after AS
- CREATE FOREIGN TABLE: full implementation with column alignment
- Handle opt_timezone for WITH/WITHOUT TIME ZONE types

SELECT/CTE fixes:
- DISTINCT separated from SELECT keyword for proper river alignment
- CTE bodies inherit parent river width as minimum
- Nested WITH keyword river-aligned inside CTE bodies
- Set operations (UNION) don't inherit parent min_width

Style config:
- Add wrap_case_else flag (aweber/mattmc3 wrap CASE+ELSE, others don't)

Documentation:
- Link all styles to their source documentation in README
- Add dump_tree and format_test examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

Warning

Rate limit exceeded

@gmr has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 0 minutes and 7 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 0 minutes and 7 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ca0d2ee-452b-47b9-a809-19221a0c6516

📥 Commits

Reviewing files that changed from the base of the PR and between b793169 and 06f2b26.

📒 Files selected for processing (1)
  • src/formatter/stmt.rs
📝 Walkthrough

Walkthrough

This PR changes the default style to AWeber, refactors many expression/statement formatter paths (typecasts, arrays, CASE wrapping, window/OVER, CREATE/FUNCTION/FOREIGN TABLE formatting), threads a min-width for river-mode SELECT/CTE alignment, adds two example CLI programs, and updates README and tests/fixtures.

Changes

Cohort / File(s) Summary
Documentation & Crate Docs
README.md, src/lib.rs
Rewrote the "Supported 7 formatting styles" table with external links; removed "(default)" from River and moved default mentions to AWeber in crate docs.
Examples
examples/dump_tree.rs, examples/format_test.rs
Added two example binaries: an AST dump tool using tree-sitter and a CLI formatter that reads a file and formats using a chosen style (defaults to aweber).
Style & Config
src/style.rs, src/formatter/mod.rs
Changed Style::default() from River → AWeber; added wrap_case_else: bool to StyleConfig and set style-specific defaults (true for Aweber and Mattmc3).
Expression Formatting
src/formatter/expr.rs
Major updates to expression dispatch and formatting: merged b_expr → a_expr path, added array_expr handling, implemented pending_cast to render ::type without extra space, improved function/aggregate/OVER handling, ANY/ALL/SOME spacing, CASE inline vs. multiline wrapping based on config/length, qualified-name attrs, alias AS behavior, numeric decimal merging, and removal of some redundant parentheses.
SELECT / River Alignment
src/formatter/select.rs
Threaded a min-width parameter through river formatting APIs, adjusted river keyword width logic, special-cased set-ops to avoid min-width propagation, refactored WITH/CTE river alignment and DISTINCT handling to prepend DISTINCT to targets.
CREATE / Statement Formatting
src/formatter/stmt.rs
Refactored CREATE TABLE WITH emission and ColQualList traversal, improved CREATE VIEW spacing, added robust CREATE FUNCTION dollar-quote parsing/reindent logic, implemented structured CREATE FOREIGN TABLE formatting with river-aligned column layout and OPTIONS handling via new helpers.
Tests / Fixtures
tests/all_fixtures.rs, tests/fixtures/river/create_matview.expected
Removed KNOWN_FAILING allowlist and unified panic handling to always record failures; updated expected fixture for CREATE MATERIALIZED VIEW to match new compact JOIN formatting and WHERE wrapping.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 I nibble tokens, line by line,
AWeber leads the formatting vine.
CASE folds neat, casts hug tight (::),
Joins hop in, river finds its bite,
Two tiny tools to help you dine. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main objective of the PR: fixing failing fixture tests and improving formatting across the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/failing-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
examples/dump_tree.rs (1)

26-30: Note: Semicolon handling doesn't cover line-comment edge case.

The library's format() function (in src/lib.rs) handles an edge case where input ends with a line comment (--), appending the semicolon on a new line. This example's simpler logic would produce invalid SQL in that scenario (e.g., SELECT 1 -- comment; where the semicolon becomes part of the comment).

For a debugging utility this is likely acceptable, but worth noting if you want parity with the library behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/dump_tree.rs` around lines 26 - 30, The current semicolon handling
for the local example simply appends ";" which can produce invalid SQL when the
input ends with a line comment; update the logic that builds the input string
(the code using variables sql and input) to mirror the library's format()
behavior: detect if the last non-empty line starts with a line-comment marker
("--") and, in that case, append the semicolon on a new line (e.g., "\n;")
instead of inline; otherwise keep the existing inline-semicolon behavior.
examples/format_test.rs (1)

9-12: Consider setting a non-zero exit code on formatting errors.

When format() returns an error, the program prints to stderr but exits with code 0, which can mislead shell scripts or CI pipelines that check exit status.

🔧 Proposed fix
     match format(sql.trim(), style) {
-        Ok(f) => print!("{f}\n"),
-        Err(e) => eprintln!("Error: {e}"),
+        Ok(f) => println!("{f}"),
+        Err(e) => {
+            eprintln!("Error: {e}");
+            std::process::exit(1);
+        }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/format_test.rs` around lines 9 - 12, The Err arm of the match on
format(sql.trim(), style) only prints to stderr but leaves the process exit code
as 0; update the Err(e) branch to exit with a non-zero status (e.g., call
std::process::exit(1) or return a Result from main and propagate the error) so
CI/shell detects failure — modify the Err(e) branch that currently does
eprintln!("Error: {e}") to both log the error and terminate with a non-zero
exit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 101-107: The README examples incorrectly label "River" as the
default; update the example headings so they match the actual default returned
by Style::default() in src/style.rs: remove "(default)" from the "River" heading
and add "(default)" to the "AWeber" (or "AWeber"/"Aweber" exact heading used)
heading in the examples section so the README and Style::default() are
consistent.

In `@src/formatter/expr.rs`:
- Around line 1070-1082: The opt_timezone handling currently pushes
kw_with/kw_without/kw_time/kw_zone into extra_keywords which causes timezone
qualifiers to appear before modifiers; change the code in the opt_timezone match
to push those tokens into a new timezone_keywords Vec (or similar) instead of
extra_keywords, then when building the final type string (where modifiers and
extra_keywords are assembled) append timezone_keywords after modifiers so the
output becomes TIMESTAMP(6) WITH TIME ZONE; update references to extra_keywords,
modifiers and opt_timezone in the formatter/expr.rs logic accordingly.
- Around line 661-670: The CAST-to-:: rewrite in the block that detects kw_cast
(using subexpr.has_child("kw_cast") with
subexpr.find_child_any(&["a_expr","c_expr"]) and subexpr.find_child("Typename"))
must wrap the formatted operand in parentheses to preserve semantics; change the
returned string built from self.format_expr(expr) and
self.format_typename(typename) so the operand is parenthesized (e.g., "(" +
self.format_expr(expr) + ")::" + self.format_typename(typename)) ensuring
expressions like CAST(a + b AS int) become (a + b)::int instead of a + b::int.

In `@src/formatter/stmt.rs`:
- Around line 769-772: The current code calls normalize_whitespace(body) inside
the dollar-quoted branch (in parse_dollar_quoted handling) which collapses
newlines and breaks function semantics; instead do not normalize the
dollar-quoted body — preserve its original line breaks and either re-indent each
line to match the surrounding AS block or, if a LANGUAGE directive exists for
the function, dispatch the body to the language-aware formatter (e.g.
tree-sitter-postgres) before pushing to parts (replace the
normalize_whitespace(body) call in the parse_dollar_quoted branch and keep the
parts.push(format!("{} {tag}\n {body}\n{tag}", self.kw("AS"))) structure but
with the preserved or language-formatted body).
- Around line 839-867: The river-mode branch currently filters only
TableElementKind::Column in classify_table_element and thus drops table-level
CHECK constraints; update the river branch in the block guarded by
self.config.river to also collect and render CHECK constraints (or other allowed
table-level checks) instead of ignoring non-Column results: when iterating
elements, keep non-column TableElementKind variants that represent CHECKs
(identify by the enum variant returned by classify_table_element), push them
into col_elements or a separate list, and render them using the same formatting
path used by the non-river branch (e.g., reuse format_table_element() or
replicate its CHECK formatting) so lines.push(...) includes CHECK constraints
alongside columns while preserving existing padding and comma logic.

---

Nitpick comments:
In `@examples/dump_tree.rs`:
- Around line 26-30: The current semicolon handling for the local example simply
appends ";" which can produce invalid SQL when the input ends with a line
comment; update the logic that builds the input string (the code using variables
sql and input) to mirror the library's format() behavior: detect if the last
non-empty line starts with a line-comment marker ("--") and, in that case,
append the semicolon on a new line (e.g., "\n;") instead of inline; otherwise
keep the existing inline-semicolon behavior.

In `@examples/format_test.rs`:
- Around line 9-12: The Err arm of the match on format(sql.trim(), style) only
prints to stderr but leaves the process exit code as 0; update the Err(e) branch
to exit with a non-zero status (e.g., call std::process::exit(1) or return a
Result from main and propagate the error) so CI/shell detects failure — modify
the Err(e) branch that currently does eprintln!("Error: {e}") to both log the
error and terminate with a non-zero exit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bf9c840a-4c7a-4269-ad77-6cbd8a0d7f00

📥 Commits

Reviewing files that changed from the base of the PR and between 43eb74d and 2e8ecce.

📒 Files selected for processing (11)
  • README.md
  • examples/dump_tree.rs
  • examples/format_test.rs
  • src/formatter/expr.rs
  • src/formatter/mod.rs
  • src/formatter/select.rs
  • src/formatter/stmt.rs
  • src/lib.rs
  • src/style.rs
  • tests/all_fixtures.rs
  • tests/fixtures/river/create_matview.expected

README.md Outdated
Comment on lines 101 to 107
**River** (default):

```sql
SELECT file_hash
FROM file_system
WHERE file_name = '.vimrc';
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Inconsistency: "River (default)" should be updated.

Line 101 still labels River as the default style, but the actual Style::default() now returns Aweber (per changes in src/style.rs). The table at lines 8-16 correctly marks AWeber as the default.

📝 Proposed fix
-**River** (default):
+**River**:

And add "(default)" to the AWeber heading if you want to highlight it in the examples section as well.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
**River** (default):
```sql
SELECT file_hash
FROM file_system
WHERE file_name = '.vimrc';
```
**River**:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 101 - 107, The README examples incorrectly label
"River" as the default; update the example headings so they match the actual
default returned by Style::default() in src/style.rs: remove "(default)" from
the "River" heading and add "(default)" to the "AWeber" (or "AWeber"/"Aweber"
exact heading used) heading in the examples section so the README and
Style::default() are consistent.

Comment on lines +661 to +670
// CAST(expr AS type) → expr::type
if subexpr.has_child("kw_cast")
&& let Some(expr) = subexpr.find_child_any(&["a_expr", "c_expr"])
&& let Some(typename) = subexpr.find_child("Typename")
{
return format!(
"{}::{}",
self.format_expr(expr),
self.format_typename(typename)
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In PostgreSQL, is CAST(a + b AS integer)equivalent toa + b::integeror(a + b)::integer? What is the precedence of the ::typecast operator relative to+?

💡 Result:

CAST(a + b AS integer) is equivalent to (a + b)::integer, not to a + b::integer.

Reason: in PostgreSQL, the :: (typecast) operator has higher precedence than +, so:

  • a + b::integer parses as a + (b::integer) (cast applies only to b) [1]
  • (a + b)::integer (and CAST(a + b AS integer)) casts the result of the addition [2]

Precedence: :: binds tighter than + (addition/subtraction) [1].

Sources:
[1] PostgreSQL docs, Operator Precedence table (:: above +) https://www.postgresql.org/docs/12/sql-syntax-lexical.html
[2] PostgreSQL docs, Type cast syntaxes (CAST(...), :: for arbitrary expressions) https://www.postgresql.org/docs/18/sql-syntax-lexical.html


Wrap the CAST operand in parentheses when rewriting to ::.

The :: typecast operator has higher precedence than + in PostgreSQL. Rewriting CAST(a + b AS int) as a + b::int changes the semantics—it becomes a + (b::int) instead of (a + b)::int. The operand must always be parenthesized:

Fix
-            return format!(
-                "{}::{}",
-                self.format_expr(expr),
-                self.format_typename(typename)
-            );
+            let expr_text = self.format_expr(expr);
+            return format!("({expr_text})::{}", self.format_typename(typename));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// CAST(expr AS type) → expr::type
if subexpr.has_child("kw_cast")
&& let Some(expr) = subexpr.find_child_any(&["a_expr", "c_expr"])
&& let Some(typename) = subexpr.find_child("Typename")
{
return format!(
"{}::{}",
self.format_expr(expr),
self.format_typename(typename)
);
// CAST(expr AS type) → expr::type
if subexpr.has_child("kw_cast")
&& let Some(expr) = subexpr.find_child_any(&["a_expr", "c_expr"])
&& let Some(typename) = subexpr.find_child("Typename")
{
let expr_text = self.format_expr(expr);
return format!("({expr_text})::{}", self.format_typename(typename));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatter/expr.rs` around lines 661 - 670, The CAST-to-:: rewrite in the
block that detects kw_cast (using subexpr.has_child("kw_cast") with
subexpr.find_child_any(&["a_expr","c_expr"]) and subexpr.find_child("Typename"))
must wrap the formatted operand in parentheses to preserve semantics; change the
returned string built from self.format_expr(expr) and
self.format_typename(typename) so the operand is parenthesized (e.g., "(" +
self.format_expr(expr) + ")::" + self.format_typename(typename)) ensuring
expressions like CAST(a + b AS int) become (a + b)::int instead of a + b::int.

- Fix README default style label: River was incorrectly marked as default,
  but Style::default() returns Aweber
- Parenthesize CAST operands when rewriting to :: syntax to preserve
  operator precedence (:: binds tighter than arithmetic operators)
- Move timezone qualifiers (WITH/WITHOUT TIME ZONE) after type modifiers
  so TIMESTAMP(6) WITH TIME ZONE is produced correctly
- Preserve line breaks in dollar-quoted function bodies instead of
  collapsing whitespace, which could break -- comment semantics
- Include CHECK constraints in river-mode foreign table formatting
  instead of silently dropping non-column elements
- Exit with non-zero status on formatting errors in format_test example

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/formatter/stmt.rs (1)

951-962: Consider simplifying the match statement.

All three match arms perform the identical operation. Unless you're planning to add type-specific formatting later, the match can be replaced with a direct call:

♻️ Suggested simplification
     fn format_generic_option(&self, node: Node<'a>) -> String {
         let mut parts = Vec::new();
         let mut cursor = node.walk();
         for child in node.named_children(&mut cursor) {
-            match child.kind() {
-                "generic_option_name" => parts.push(self.format_expr(child)),
-                "generic_option_arg" => parts.push(self.format_expr(child)),
-                _ => parts.push(self.format_expr(child)),
-            }
+            parts.push(self.format_expr(child));
         }
         parts.join(" ")
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatter/stmt.rs` around lines 951 - 962, In format_generic_option, the
match on child.kind() (matching "generic_option_name", "generic_option_arg", and
default) all call the same self.format_expr(child); simplify by removing the
match and directly pushing self.format_expr(child) for each child in the
named_children loop (preserving parts, cursor, and the call to parts.join(" ")),
so future type-specific handling can be added later if needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/formatter/stmt.rs`:
- Around line 951-962: In format_generic_option, the match on child.kind()
(matching "generic_option_name", "generic_option_arg", and default) all call the
same self.format_expr(child); simplify by removing the match and directly
pushing self.format_expr(child) for each child in the named_children loop
(preserving parts, cursor, and the call to parts.join(" ")), so future
type-specific handling can be added later if needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 194976f0-1f83-4a88-87b3-a0d2008b2a62

📥 Commits

Reviewing files that changed from the base of the PR and between 2e8ecce and b793169.

📒 Files selected for processing (4)
  • README.md
  • examples/format_test.rs
  • src/formatter/expr.rs
  • src/formatter/stmt.rs
✅ Files skipped from review due to trivial changes (2)
  • examples/format_test.rs
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/formatter/expr.rs

All match arms performed the identical operation, so remove the
redundant match and call format_expr directly for each child.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gmr gmr merged commit 1561dd4 into main Mar 30, 2026
3 checks passed
@gmr gmr deleted the fix/failing-tests branch March 30, 2026 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant