metal: prevent command buffer exhaustion deadlocks #8574
- Track outstanding Metal command buffers per queue and gate begin_encoding against the hard MAX_COMMAND_BUFFERS, returning device-lost with an actionable warning instead of letting new_command_buffer hang when encoders are leaked.
- Share the counter across queue/encoders and decrement on submit or discard after clearing the raw command buffer so drop happens before the bookkeeping update.

Fixes gfx-rs#3084, gfx-rs#8047.
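
A minimal sketch of the counting scheme described above, not the actual wgpu-hal code. `Queue`, `CommandEncoder`, `RawCommandBuffer`, `DeviceError`, and the limit of 4 are hypothetical stand-ins chosen for illustration; only the idea of a shared atomic counter gated against `MAX_COMMAND_BUFFERS` comes from the description.

```rust
use std::sync::{
    atomic::{AtomicU64, Ordering},
    Arc,
};

// Assumed value for illustration only; the real backend constant may differ.
const MAX_COMMAND_BUFFERS: u64 = 4;

#[derive(Debug, PartialEq)]
enum DeviceError {
    Lost,
}

// Hypothetical stand-in for a raw Metal command buffer.
struct RawCommandBuffer;

struct Queue {
    // Shared with every encoder created from this queue.
    outstanding: Arc<AtomicU64>,
}

struct CommandEncoder {
    outstanding: Arc<AtomicU64>,
    raw: Option<RawCommandBuffer>,
}

impl Queue {
    fn new() -> Self {
        Self { outstanding: Arc::new(AtomicU64::new(0)) }
    }

    fn create_encoder(&self) -> CommandEncoder {
        CommandEncoder { outstanding: Arc::clone(&self.outstanding), raw: None }
    }
}

impl CommandEncoder {
    fn begin_encoding(&mut self) -> Result<(), DeviceError> {
        // Atomically reserve a slot, refusing once the limit is reached,
        // instead of blocking inside command buffer creation.
        let reserved = self.outstanding.fetch_update(
            Ordering::AcqRel,
            Ordering::Acquire,
            |n| (n < MAX_COMMAND_BUFFERS).then_some(n + 1),
        );
        if reserved.is_err() {
            eprintln!("warning: command buffer limit reached; submit or drop leaked encoders");
            return Err(DeviceError::Lost);
        }
        self.raw = Some(RawCommandBuffer);
        Ok(())
    }

    fn discard_encoding(&mut self) {
        // Drop the raw command buffer first, then update the bookkeeping.
        if self.raw.take().is_some() {
            self.outstanding.fetch_sub(1, Ordering::AcqRel);
        }
    }
}
```

In this sketch, once `MAX_COMMAND_BUFFERS` encoders have begun encoding, a further `begin_encoding` returns `Err(DeviceError::Lost)` until one of them is discarded. A submit path would release its slot the same way as `discard_encoding`.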
Using the test from #8047 with the minimal changes to compile against The example for the problem now shows this instead of deadlocking: So you don't get a clear indication of what happened unless you have the logs on...
andyleiserson
left a comment
Thanks for this, Firefox has the same concern.
Another test case is cargo xtask cts 'webgpu:api,validation,resource_usages,texture,in_render_common:*'.
    // Tracks command buffers created via `CommandEncoder::begin_encoding` that
    // have not yet been submitted or discarded. Used to proactively fail
    // before hitting Metal's `maxCommandBufferCount`.
    command_buffer_created_not_submitted: Arc<atomic::AtomicU64>,
An observation, but not important: it might be possible to use Arc<()> and rely on Arc::strong_count to indicate the number of outstanding command buffers. Although there is an issue of the difference between when the command encoder is created and when begin_encoding is called.
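
For illustration, the `Arc<()>` idea might look roughly like the sketch below. `Queue`, `CommandBufferSlot`, `acquire_slot`, and `outstanding` are hypothetical names, not wgpu APIs. The caveat from the comment applies: the clone would have to be taken in `begin_encoding` rather than at encoder creation for the count to mean "command buffers outstanding". Note also that checking `Arc::strong_count` and then cloning is not one atomic step, so this form is better suited to reporting than to strict gating.

```rust
use std::sync::Arc;

struct Queue {
    // The queue holds one reference; every outstanding command buffer holds another.
    token: Arc<()>,
}

// Hypothetical handle owned by an encoder between begin_encoding and submit/discard.
struct CommandBufferSlot {
    _token: Arc<()>,
}

impl Queue {
    fn new() -> Self {
        Self { token: Arc::new(()) }
    }

    // Number of live slots, excluding the queue's own reference.
    fn outstanding(&self) -> usize {
        Arc::strong_count(&self.token) - 1
    }

    fn acquire_slot(&self) -> CommandBufferSlot {
        CommandBufferSlot { _token: Arc::clone(&self.token) }
    }
}
```

Dropping a `CommandBufferSlot` releases its reference automatically, so no explicit decrement is needed on submit or discard.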
Connections
Fixes #3084, #8047.
Description
Testing
Tests pass. I've run the test from #8047, and the new check triggers instead of deadlocking.
Squash or Rebase?
Squash
Checklist
- cargo fmt.
- taplo format.
- cargo clippy --tests. If applicable, add:
  - --target wasm32-unknown-unknown
- cargo xtask test to run tests.
- CHANGELOG.md entry.