@svlv svlv commented Jan 14, 2026

Line-buffered stdout causes partial write and read operations in dd,
which is an issue when writing binary data to stdout. Partial writes can
lead to data loss and require passing iflag=fullblock to ensure that the
exact number of bytes is read.
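To illustrate what a full-block read has to do on the consumer side, here is a minimal sketch. It is not dd's actual `iflag=fullblock` implementation; `read_full_block` and `split_source` are hypothetical names, and the 600 + 424 split is arbitrary. A single `read` may return fewer bytes than requested, so the reader loops until the block is full or EOF is reached.

```rust
use std::io::{self, Cursor, ErrorKind, Read};

/// Hypothetical helper: keeps reading until `buf` is full or the reader
/// reports EOF, and returns the number of bytes placed in `buf`.
fn read_full_block<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF: return whatever was collected
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical source that delivers 1024 bytes as two pieces (600 + 424),
/// standing in for a pipe whose writer emitted a partial write.
fn split_source() -> impl Read {
    Cursor::new(vec![0u8; 600]).chain(Cursor::new(vec![0u8; 424]))
}

fn main() -> io::Result<()> {
    let mut buf = [0u8; 1024];
    // One plain read stops at the first piece.
    let single = split_source().read(&mut buf)?;
    // The looping read collects the whole block.
    let full = read_full_block(&mut split_source(), &mut buf)?;
    println!("single read: {single} bytes, full-block read: {full} bytes");
    Ok(())
}
```

The loop only hides partial writes from the reader; it does not remove them, which is why fixing the writer side is still worthwhile.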

This PR fixes the partial writes mentioned in PR #8840 and issue #9119.

Partial writes can be reproduced with the following calls:

dd bs=1024 if=/dev/random count=1 | dd bs=1024 of=/dev/null count=1
1+0 records in
1+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000333775 s, 1.0 MB/s
0+1 records in
0+1 records out
678 bytes copied, 0.00151917 s, 678 kB/s

dd bs=1024 if=/dev/random count=1 | dd bs=1024 of=/dev/null count=1
1+0 records in
1+0 records out
0+1 records in
0+1 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000438309 s, 1.0 MB/s
677 bytes copied, 0.00130174 s, 677 kB/s

dd bs=1024 if=/dev/random count=1 | dd bs=1024 of=/dev/null count=1
0+1 records in
0+1 records out
904 bytes copied, 0.00121703 s, 904 kB/s
1+0 records in
1+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000428462 s, 1.0 MB/s
Error flushing stdout: Broken pipe (os error 32)

As you can see, each call copies a different number of bytes, and in some cases a broken pipe occurs. The broken pipe happens when the second dd exits and closes its end of the pipe before the first dd has finished writing.
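The varying byte counts can be illustrated with a small sketch of how `std::io::LineWriter` (which the comments in this PR say is used for Stdout by default) splits a binary block. `Recorder`, `block_with_newline_at`, and `chunk_sizes_through_line_writer` are hypothetical names for illustration only, and the newline position 677 is an assumption chosen so the first chunk is 678 bytes, like the first run above. The block is forwarded up to and including its last newline, and the rest is held back until the next flush.

```rust
use std::io::{self, LineWriter, Write};

/// Hypothetical sink that records the size of every write it receives,
/// standing in for the pipe behind stdout.
#[derive(Default)]
struct Recorder {
    sizes: Vec<usize>,
}

impl Write for Recorder {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.sizes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Builds a 1024-byte block of zeros with a single b'\n' at `idx`.
fn block_with_newline_at(idx: usize) -> Vec<u8> {
    let mut block = vec![0u8; 1024];
    block[idx] = b'\n';
    block
}

/// Pushes one block through a LineWriter and returns the sizes of the
/// writes that actually reached the underlying sink.
fn chunk_sizes_through_line_writer(block: &[u8]) -> io::Result<Vec<usize>> {
    let mut out = LineWriter::new(Recorder::default());
    out.write_all(block)?;
    out.flush()?;
    Ok(out.get_ref().sizes.clone())
}

fn main() -> io::Result<()> {
    let sizes = chunk_sizes_through_line_writer(&block_with_newline_at(677))?;
    println!("write sizes seen by the sink: {sizes:?}");
    Ok(())
}
```

With random input the position of the last newline differs on every run, which would explain why each call above reports a different partial record size.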

The issue with line-buffered stdout is well-known:
https://ericswpark.com/blog/2025/2025-01-23-buffering-by-block-in-rust/

For the fix I used the workaround shared in rust-lang/rust#58326 (comment).

This is almost my first snippet in Rust, so please don’t judge too strictly.

svlv and others added 2 commits January 14, 2026 13:40
@sylvestre sylvestre requested a review from ChrisDryden January 14, 2026 22:33
@ChrisDryden (Collaborator)

I believe we have a shared library type that handles this: OwnedFileDescriptorOrHandle. My understanding is that treating the fd as a File, even when it is stdout, also bypasses the LineWriter and so solves the issue.

There is a performance tradeoff from cloning the fd, but it means we can avoid the unsafe block, and it won't close stdout on drop.

I checked whether OwnedFileDescriptorOrHandle works, and with the examples you provided it seems to solve the issue you are describing.
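For readers unfamiliar with the idea, here is a rough std-only sketch of the cloning approach. It is not the OwnedFileDescriptorOrHandle implementation; `unbuffered_file_from` is a hypothetical helper, and the code assumes a Unix target. It duplicates the descriptor and wraps the duplicate in a `File`, so writes skip the LineWriter, no unsafe is needed, and dropping the `File` closes only the duplicate.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::fd::AsFd;

/// Hypothetical helper: duplicates `fd` and wraps the duplicate in a File.
/// Writes through the File go straight to the descriptor, and dropping
/// the File closes only the duplicate, never the original.
fn unbuffered_file_from<F: AsFd>(fd: &F) -> io::Result<File> {
    let owned = fd.as_fd().try_clone_to_owned()?;
    Ok(File::from(owned))
}

fn main() -> io::Result<()> {
    let mut raw_out = unbuffered_file_from(&io::stdout())?;
    // Each write_all call is handed to the descriptor as given,
    // regardless of any newlines inside the data.
    raw_out.write_all(b"first\nsecond")?;
    drop(raw_out); // closes the duplicate only

    // The process-wide stdout handle is still usable afterwards.
    println!();
    println!("stdout is still open");
    Ok(())
}
```

The cost is one extra descriptor duplication per open, which is the performance tradeoff mentioned above.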

not(target_os = "freebsd"),
feature = "printf"
))]
fn test_no_dropped_writes() {
Collaborator

Mind using the test macros that we have to make this less verbose:

      const BLK_SIZE: usize = 0x4000;
      const COUNT: usize = 1000;
      const NUM_BYTES: usize = BLK_SIZE * COUNT;

      let result = new_ucmd!()
          .args(&[
              "if=/dev/urandom",
              &format!("bs={BLK_SIZE}"),
              &format!("count={COUNT}"),
          ])
          .succeeds();

      assert_eq!(result.stdout().len(), NUM_BYTES);
      assert!(result.stderr_str().contains(&format!("{NUM_BYTES} bytes")));

unix,
not(target_os = "macos"),
not(target_os = "freebsd"),
feature = "printf"
Collaborator

I don't see why this is here. Is it an artifact from another test?

let raw_fd = io::stdout().as_raw_fd();
unsafe { File::from_raw_fd(raw_fd) }
};
let mut dst = Dest::Stdout(f);
Collaborator

To give a more concrete example of what I had in mind, this is how the shared library type can be used to bypass the LineWriter:

let fx = OwnedFileDescriptorOrHandle::from(io::stdout())?;
let mut dst = Dest::Stdout(fx.into_file());

OwnedFileDescriptorOrHandle can be used to bypass the LineWriter
that is used by default for Stdout.

svlv commented Jan 15, 2026

Thanks for the good points. I’ve applied the comments.

codspeed-hq bot commented Jan 15, 2026

Merging this PR will degrade performance by 27.39%

⚡ 1 improved benchmark
❌ 7 regressed benchmarks
✅ 274 untouched benchmarks
⏩ 38 skipped benchmarks [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode    Benchmark                     BASE       HEAD       Efficiency
Memory  sort_unique_locale[500000]    33.6 MB    39.8 MB    -15.5%
Memory  sort_long_line[160000]        712.6 KB   981.4 KB   -27.39%
Memory  sort_key_field[500000]        47.8 MB    51.8 MB    -7.62%
Memory  sort_mixed_data[500000]       22.9 MB    27.1 MB    -15.71%
Memory  sort_numeric[500000]          75.5 MB    79.2 MB    -4.67%
Memory  sort_accented_data[500000]    22.1 MB    28.3 MB    -21.79%
Memory  sort_ascii_utf8_locale        6.7 MB     6.2 MB     +7.93%
Memory  sort_ascii_only[500000]       22.2 MB    28.3 MB    -21.77%

Comparing svlv:dd-get-rid-of-line-buffered-stdout (2a60959) with main (2c75e71)

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@ChrisDryden (Collaborator)

Those memory benchmarks were only added today, so the regressions can be ignored.
