Fix UTF-32 encoding in global state #4063

Open
vinistock wants to merge 1 commit into rubydex_adoption_feature_branch from vs_fix_utf_32_encoding

@vinistock
Member

Motivation

Encoding::UTF_32 doesn't actually encode strings correctly, so we were calculating code points with the wrong encoding. It's UTF_32LE that encodes the string and gives us the right locations back.

Implementation

Started using Encoding::UTF_32LE, which applies the correct encoding to the string. It's easy to verify that it was previously wrong by doing this:

# The length should be based on codepoints (24)

# Using the wrong encoding
"class Foo; end\n\"🙂\"; Foo\n".encode(Encoding::UTF_32).length
# => 100

# Using the right encoding
"class Foo; end\n\"🙂\"; Foo\n".encode(Encoding::UTF_32LE).length
# => 24
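
For reviewers who want to see why the two encodings disagree, here is a minimal sketch. The `encoding_report` helper and the `SOURCE` constant are hypothetical names used only for illustration and are not part of this PR. It relies on `Encoding#dummy?`: Ruby reports `Encoding::UTF_32` as a dummy encoding, so `String#length` ends up counting bytes (including the 4-byte BOM added by `encode`) instead of code points.

```ruby
# Hypothetical helper (not from the PR): summarizes how Ruby measures a
# string after transcoding it to the given encoding.
def encoding_report(source, encoding)
  encoded = source.encode(encoding)

  {
    dummy: encoding.dummy?,      # true when Ruby has no real character handling for it
    length: encoded.length,      # what the code point calculations rely on
    bytesize: encoded.bytesize,  # raw byte count, for comparison
  }
end

# Same document as in the snippet above; \u{1F642} is the 🙂 emoji.
SOURCE = "class Foo; end\n\"\u{1F642}\"; Foo\n"

utf32 = encoding_report(SOURCE, Encoding::UTF_32)
utf32le = encoding_report(SOURCE, Encoding::UTF_32LE)

puts "UTF_32:   #{utf32}"
puts "UTF_32LE: #{utf32le}"
```

With UTF_32 the length and the byte size are the same number (100, which is 24 code points plus the BOM at 4 bytes each), while with UTF_32LE the length is 24 and the byte size is 96.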

Automated Tests

Added a test.

vinistock self-assigned this Apr 16, 2026
vinistock requested a review from a team as a code owner April 16, 2026 21:26
vinistock added the bugfix (This PR will fix an existing bug) and server (This pull request should be included in the server gem's release notes) labels Apr 16, 2026
vinistock requested review from alexcrocha and st0012 April 16, 2026 21:26