Fix audio buffer memory leak: use in-place del instead of slice-copy#228
Fix audio buffer memory leak: use in-place del instead of slice-copy#228
Conversation
bytearray slice assignment (self._buffer = self._buffer[n:]) creates a new bytearray every 20ms. At 48kHz mono s16, that's 50 allocations per second, each copying the remaining buffer. pymalloc retains these in arena pools and never returns the memory to the OS. del self._buffer[:n] removes data in-place without allocating a new object. Memray profiling showed this single line accounted for 900MB of allocations per call.
📝 WalkthroughWalkthroughReplaces buffer re-slicing with in-place deletion of consumed or overflow bytes in the audio track implementation; adds tests verifying that Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Assert that recv() and overflow trim modify the same bytearray object rather than creating a new one via slice-copy.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test_audio_stream_track.py (1)
306-355: Use fixtures forAudioStreamTracksetup in these new tests.Both new tests duplicate track setup and direct object construction. Consider a fixture for the mono/s16 track variants used here.
As per coding guidelines, "Use fixtures to inject objects in tests; test client fixtures can use the Stream API client".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_audio_stream_track.py` around lines 306 - 355, The two tests test_recv_does_not_reallocate_buffer and test_buffer_overflow_does_not_reallocate duplicate AudioStreamTrack construction; extract a pytest fixture (e.g., mono_s16_track) that yields an AudioStreamTrack configured for mono s16 (sample_rate=48000, channels=1, format="s16"), and update test_recv_does_not_reallocate_buffer to accept that fixture instead of constructing track locally; for test_buffer_overflow_does_not_reallocate add a separate fixture or a parameterized fixture (e.g., mono_s16_track_with_buffer_size or use request param) to supply audio_buffer_size_ms=100 while keeping the same fixture pattern, then remove direct constructions and use the fixtures in both tests. Ensure fixture names match the test parameters and yield a ready-to-use track for write/recv operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_audio_stream_track.py`:
- Around line 330-358: The test test_buffer_overflow_does_not_reallocate only
checks identity of track._buffer, which can pass even if trimming didn't occur;
after writing pcm_large, also assert the buffer length equals the 100ms cap
(compute as sample_rate * channels * audio_buffer_size_ms / 1000) to prove the
overflow path executed — e.g., for sample_rate=48000, channels=1,
audio_buffer_size_ms=100 the expected length is 4800 samples; add this length
assertion immediately after the second await track.write(pcm_large) alongside
the existing id(track._buffer) check.
---
Nitpick comments:
In `@tests/test_audio_stream_track.py`:
- Around line 306-355: The two tests test_recv_does_not_reallocate_buffer and
test_buffer_overflow_does_not_reallocate duplicate AudioStreamTrack
construction; extract a pytest fixture (e.g., mono_s16_track) that yields an
AudioStreamTrack configured for mono s16 (sample_rate=48000, channels=1,
format="s16"), and update test_recv_does_not_reallocate_buffer to accept that
fixture instead of constructing track locally; for
test_buffer_overflow_does_not_reallocate add a separate fixture or a
parameterized fixture (e.g., mono_s16_track_with_buffer_size or use request
param) to supply audio_buffer_size_ms=100 while keeping the same fixture
pattern, then remove direct constructions and use the fixtures in both tests.
Ensure fixture names match the test parameters and yield a ready-to-use track
for write/recv operations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 86bd7fce-5ac8-4b9c-b080-e74f924e7e53
📒 Files selected for processing (1)
tests/test_audio_stream_track.py
| async def test_buffer_overflow_does_not_reallocate(self): | ||
| """Test that buffer overflow trims in-place without creating a new buffer object.""" | ||
| track = AudioStreamTrack( | ||
| sample_rate=48000, channels=1, format="s16", audio_buffer_size_ms=100 | ||
| ) | ||
|
|
||
| # Write 50ms of data first to get a buffer reference | ||
| samples_50ms = np.zeros(2400, dtype=np.int16) | ||
| pcm = PcmData( | ||
| samples=samples_50ms, | ||
| sample_rate=48000, | ||
| format=AudioFormat.S16, | ||
| channels=1, | ||
| ) | ||
| await track.write(pcm) | ||
| buffer_id = id(track._buffer) | ||
|
|
||
| # Write 200ms of data (exceeds 100ms limit, triggers overflow trim) | ||
| samples_200ms = np.zeros(9600, dtype=np.int16) | ||
| pcm_large = PcmData( | ||
| samples=samples_200ms, | ||
| sample_rate=48000, | ||
| format=AudioFormat.S16, | ||
| channels=1, | ||
| ) | ||
| await track.write(pcm_large) | ||
|
|
||
| assert id(track._buffer) == buffer_id, "overflow trim should modify buffer in-place, not create a new one" | ||
|
|
There was a problem hiding this comment.
Assert trimmed length to prove overflow path executed.
At Line 357, identity-only validation can pass even if overflow trimming fails but _buffer is still the same object. Add a length assertion for the 100ms cap.
Suggested test hardening
await track.write(pcm_large)
assert id(track._buffer) == buffer_id, "overflow trim should modify buffer in-place, not create a new one"
+ expected_max_bytes = int(0.1 * 48000) * 2 # 100ms * sample_rate * bytes_per_sample
+ assert len(track._buffer) == expected_max_bytes, "buffer should be trimmed to configured max size"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_audio_stream_track.py` around lines 330 - 358, The test
test_buffer_overflow_does_not_reallocate only checks identity of track._buffer,
which can pass even if trimming didn't occur; after writing pcm_large, also
assert the buffer length equals the 100ms cap (compute as sample_rate * channels
* audio_buffer_size_ms / 1000) to prove the overflow path executed — e.g., for
sample_rate=48000, channels=1, audio_buffer_size_ms=100 the expected length is
4800 samples; add this length assertion immediately after the second await
track.write(pcm_large) alongside the existing id(track._buffer) check.
Why
pymalloc allocates memory for small objects in large blocks (1MB arenas) to reduce syscalls. An arena can only be returned to the OS when all objects inside it are dead.
AudioStreamTrack.recv()runs every 20ms (50x/sec) and slices the buffer:But "freed" means freed inside pymalloc, not returned to the OS. The old object lived in an arena alongside other objects. If any of those neighbors are still alive, the entire arena (1MB) stays reserved.
At 50 allocations/sec, arenas become fragmented: each one has a mix of live and dead blocks, so none of them can be fully released. RSS grows with every call and never comes back.
The fix uses in-place deletion:
This calls
memmoveto shift data within the same buffer. No new object, no malloc, no GC, no arena fragmentation.Changes
self._buffer = self._buffer[n:]withdel self._buffer[:n]in two places (line 133 and 195)Memray profiling results (single call, 48kHz mono s16)
Summary by CodeRabbit
Refactor
Tests