Fixes #10192 - fix(comm): improve stdout handling and add test for lossy UTF-8 output #10206
base: main
Conversation
GNU testsuite comparison:
Please run `cargo fmt`.
The current CI failure occurs during package installation on ubuntu-latest, prior to running any project-specific steps.
As a side note for other reviewers: the way the output is written felt repetitive to me, and I thought we should have a helper function for it. It turns out we do not have one yet, and many utilities follow the same format. A good follow-up task would be to turn that format into a helper and clean up all the utilities that match the pattern.
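To make the follow-up idea concrete, here is a minimal sketch of what such a helper could look like. It is only an illustration: `write_with_context` is a hypothetical name, it uses only the standard library, and it is not the `map_err_context` method the PR description mentions.

```rust
use std::io::{self, Write};

/// Hypothetical helper (not part of the project): write `bytes` to `out`
/// and, on failure, prefix the I/O error with a short context message.
fn write_with_context<W: Write>(out: &mut W, bytes: &[u8], context: &str) -> io::Result<()> {
    out.write_all(bytes)
        // Keep the original error kind; only the message is extended.
        .map_err(|e| io::Error::new(e.kind(), format!("{context}: {e}")))
}
```

A call site would then shrink to something like `write_with_context(&mut out, line, "write error")?`, which is the kind of repetition the comment above suggests factoring out.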
Wrap stdout in BufWriter to improve performance and avoid duplicate error messages, matching GNU comm behavior.
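As a rough illustration of the approach described in this commit message, the sketch below locks stdout once, wraps it in a `BufWriter`, and flushes explicitly so that a write failure surfaces as a single returned error. The function names `write_lines` and `print_lines` are assumptions made for this example and are not taken from `comm.rs`.

```rust
use std::io::{self, BufWriter, Write};

/// Writes each line followed by a newline to `out`, stopping at the first I/O error.
fn write_lines<W: Write>(out: &mut W, lines: &[&[u8]]) -> io::Result<()> {
    for line in lines {
        out.write_all(line)?;
        out.write_all(b"\n")?;
    }
    // Flush explicitly: a BufWriter that is merely dropped ignores flush errors,
    // so this is where a failing stdout gets reported to the caller.
    out.flush()
}

/// Hypothetical entry point: lock stdout once and buffer all writes to it.
fn print_lines(lines: &[&[u8]]) -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = BufWriter::new(stdout.lock());
    write_lines(&mut out, lines)
}
```

Keeping the writer generic over `Write` means the same code path can be exercised against an in-memory `Vec<u8>` in tests.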
CodSpeed Performance Report: merging this PR will improve performance by 8.42%.
This pull request refactors the output handling in the `comm` utility to write output directly to a locked `stdout` handle, improving efficiency and error handling, especially for non-UTF-8 input. It also adds a new test to ensure correct output when processing files with invalid UTF-8 sequences.
Output handling improvements:
- Switched from `print!` macros to writing directly to a locked `stdout` handle via `stdout.write_all`, allowing for more efficient and robust output, particularly for non-UTF-8 data. All output operations now handle errors using `map_err_context`. (src/uu/comm/src/comm.rs)
Testing for non-UTF-8 input:
- Added a new test, `test_output_lossy_utf8`, to verify that the utility correctly handles and outputs files containing invalid UTF-8 bytes, matching GNU `comm`'s behavior. (tests/by-util/test_comm.rs)
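For readers unfamiliar with why the output path matters for non-UTF-8 data, the following sketch contrasts writing raw bytes with writing a lossily decoded string. It uses only the standard library; `write_raw` and `write_lossy` are hypothetical names, and the sketch does not claim which of the two outputs the PR's test asserts.

```rust
use std::io::{self, Write};

/// Writes `line` unchanged, byte for byte, even if it is not valid UTF-8.
fn write_raw<W: Write>(out: &mut W, line: &[u8]) -> io::Result<()> {
    out.write_all(line)
}

/// Writes `line` after lossy decoding: each invalid sequence is replaced
/// by U+FFFD (the replacement character), so the bytes on output differ.
fn write_lossy<W: Write>(out: &mut W, line: &[u8]) -> io::Result<()> {
    out.write_all(String::from_utf8_lossy(line).as_bytes())
}
```

With the input `b"a\xffb"`, `write_raw` emits the same three bytes, while `write_lossy` emits `a`, the three-byte UTF-8 encoding of U+FFFD, and `b`.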