Skip to content

Better handle trailing commas with comments#126

Open
willbeason wants to merge 7 commits intogoogle:mainfrom
willbeason:trailing-commas
Open

Better handle trailing commas with comments#126
willbeason wants to merge 7 commits intogoogle:mainfrom
willbeason:trailing-commas

Conversation

@willbeason
Copy link
Contributor

@willbeason willbeason commented Mar 19, 2026

Issue: #33

For lines/sequences of blocks which appear to be a comma-delimited list, do not consider comments when performing this check. Also, insert commas before comments to avoid mangling list entries.

I put the match/replace logic in lineGroup, as hasSuffix/trimSuffix were there.

This approach does have a few weaknesses. It doesn't properly handle sorting entries with quoted comment characters, e.g.:

  "c",
  "b, //",
  "a, #"

It is better at handling the desired cases, and the above counterexamples where it fails feel more like less-common edge cases. Importantly though - with the preexisting logic, the above would be handled correctly, but now would not be.

Lastly, it always tries to match all comment types. So even if "#" isn't a valid comment character in the context, it will still try to use it. I attempted an approach that used the opt.commentMarker, but that is often not set in tests. I could go and add it to existing tests where it matters, and then we could assume from how the blocks are defined which comment markers are relevant. This would get closer to handling the above well.

Fully "correctly" handling the above would need a bit more work - likely parsing the lines themselves and adding quote_delimiter and/or comment_delimiter options.

@willbeason
Copy link
Contributor Author

Leaving it as open for discussion about whether/how to detect the correct comment characters.

3 # three
keep-sorted-test end

Trailing comma except for last, last has a comment
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the inconsistent spacing here. I played with an approach that made spacing consistently one space after the comma, but it was a bit more complex. I have this behavior in the PR since the implementation is simpler.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, yeah, that's weird. I'd somewhat prefer if we kept the spacing the same as what the user wrote originally. Oh I see, the .in file had two spaces to keep the comments aligned, and then we added a comma which bumped it out by one.

Maybe we should replace a single space with a comma, but if the comment only has a single leading space, preserve that space?

@willbeason willbeason marked this pull request as ready for review March 19, 2026 22:13
For lines/sequences of blocks which appear to be a comma-delimited list,
do not consider comments when performing this check. Also, insert commas
before comments to avoid mangling list.

I kept the match/replace logic in lineGroup, as hasSuffix/trimSuffix were
there.

This approach does have a few weaknesses. It doesn't properly handle
sorting what appear to be quoted comments, e.g.:

- "a, #",
- "b, //",
- c

It also doesn't properly handle commented commas:

- a, # some comment, # more comment
- b, # another comment, # a comment

It is better at handling the desired cases, and the above counterexamples where
it fails feel more like less-common edge cases.

Lastly, it assumes "#" and "//" are always comments, and doesn't allow for
alternatives.

Handling the above would need quite a bit more work - likely parsing the lines
themselves and adding list_delimiter and/or comment_delimiter options.
Comment spacing in lists is now handled consistently. Comments begin one
space character after the comma delimiter, except for the terminal element,
which has no comma and instead has two spaces before any comment.
endsInComma was unnecessary since it's handled by a lineGroup method.
Undo changes to comments made by renaming via the IDE.
Remove global regexp variables and move them into the function directly.

Also change to "allMatchSuffix" to be closer to original code.
Simplify implementation to not worry about spaces consistency. Instead, just
preserve spaces before comments as-is.

Also add examples that show when comments still aren't handled consistently.
@JeffFaer JeffFaer self-requested a review March 23, 2026 19:13

if n := len(dataGroups); n > 1 && allHaveSuffix(dataGroups[0:n-1], ",") && !dataGroups[n-1].hasSuffix(",") {
dataGroups[n-1].append(",")
knownCommentMarkersList := knownCommentMarkers()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

block.metadata.opts records which comment marker is used in the keep-sorted start directive. That's what we use for sticky_comments and that can be expanded with sticky_prefixes (and that commentMarker is automatically added to the list of sticky_prefixes assuming sticky_comments is true)

What do you think about using block.metadata.opts.StickyPrefixes for this instead of knownCommentMarkers()? That gives users a lever they can use to configure this behavior

lineEnd := regexp.MustCompile(fmt.Sprintf("(\\s*(%s).*)?$", knownCommentMarkersStr))

if n := len(dataGroups); n > 1 && allMatchSuffix(dataGroups[0:n-1], commaLineEnd) && !dataGroups[n-1].matchesSuffix(commaLineEnd) {
dataGroups[n-1].replaceSuffix(lineEnd, ",$1")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you happen to know how nested capturing groups behave in go? Is $1 referencing (\\s*(%s).*) or is it referencing just (%s)?

Either way, we only seem to be using $1, so consider using a non-capturing group for the other set of parens so that you don't need to answer this question ;)

}
knownCommentMarkersStr := strings.Join(knownCommentMarkersList, "|")

commaLineEnd := regexp.MustCompile(fmt.Sprintf(",(\\s*(%s).*)?$", knownCommentMarkersStr))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use grave ticks so you don't need to escape the backslash

for i := len(lgs) - 1; i >= 0; i-- {
if len(lgs[i].lines) > 0 {
lgs[i].trimSuffix(",")
lgs[i].replaceSuffix(commaLineEnd, "$1")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this why the spacing seems weird? Should we replace this with " $1" so that the comma is essentially replaced with a space?

3 # three
keep-sorted-test end

Trailing comma except for last, last has a comment
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, yeah, that's weird. I'd somewhat prefer if we kept the spacing the same as what the user wrote originally. Oh I see, the .in file had two spaces to keep the comments aligned, and then we added a comma which bumped it out by one.

Maybe we should replace a single space with a comma, but if the comment only has a single leading space, preserve that space?

TODO: https://github.com/google/keep-sorted/issues/33 - Fix this
keep-sorted-test start
"c",
"b, //",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

block=yes has some logic that looks at quotes and tries to understand whether we're in a quoted section or not:

if cb.expectedQuote == "" {

I wonder if we could leverage that here to exclude comment leaders that are actually part of a string constant


Per my other comment, that code should probably use StickyPrefixes instead of just commentMarker, but that's again me just thinking out loud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants