Skip to content

ioroDevice issue on multigpu system#135

Open
jpola-amd wants to merge 1 commit intoGPUOpen-LibrariesAndSDKs:mainfrom
jpola-amd:dev/ioroDevice-type-punning
Open

ioroDevice issue on multigpu system#135
jpola-amd wants to merge 1 commit intoGPUOpen-LibrariesAndSDKs:mainfrom
jpola-amd:dev/ioroDevice-type-punning

Conversation

@jpola-amd
Copy link
Copy Markdown

Fix UB type-punning in ioroDevice and packed device handle boundaries

  • Replace the ioroDevice bitfield struct with a single oroU32 m_value representation. The old constructor wrote through an int* alias over bitfield storage, which is undefined behaviour under the strict aliasing rule and also relied on unspecified bitfield layout.

  • The old oroSetRawDevice additionally misinterpreted a raw backend device index as an already-packed Orochi handle, causing incorrect results for any device index with bits overlapping the API field (e.g. oroDevice = 1). This appears when testing HIPRT with HIP_VISIBLE_DEVICES="1".

Changes:

  • ioroDevice now stores a single oroU32; API and device index are packed/unpacked with explicit masks and shifts. API field widened from 4 to 5 bits to cover the full oroApi enum including ORO_API_CUDARTC = 1 << 4.
  • oroSetRawDevice packs from zero, so the raw device index is written into the correct field rather than being misread as packed data.
  • All oroDevice boundary sites (oroDeviceGet, oroGetRawDevice, oroGetDeviceProperties, oroDeviceGetName, oroDeviceGetAttribute, oroCtxCreate) replace *(ioroDevice*)device = d and *(oroDevice*)&d casts with plain (oroDevice)d.packed() / ioroDevice((oroU32)dev) conversions.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes undefined behavior and incorrect packing/unpacking of oroDevice handles on multi-GPU systems by replacing the previous bitfield + type-punning approach with an explicit 32-bit packed representation.

Changes:

  • Replaced ioroDevice bitfields with a single oroU32 m_value plus explicit masks/shifts for API + device index packing.
  • Fixed oroSetRawDevice to pack a raw backend device index into the correct bit range (instead of treating it as already-packed data).
  • Removed pointer reinterpret-casts at oroDevice boundaries and replaced them with value-based conversions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants