Bug
get_device_context() builds a new torch.tensor from self.heap_bases.tolist() on every call (see #466). Once #466 is fixed by precomputing the tensor in __init__, the context tensor will hold a snapshot of heap_bases at construction time.
If heap_bases were to change after init (e.g., via refresh_peer_access() after a new shmem.allocate() or as_symmetric() call with a future allocator), the precomputed context tensor would contain stale base addresses. Kernels using DeviceContext would translate pointers using wrong bases, causing silent data corruption or hangs.
Today this is not a bug — both the torch and vmem allocators produce stable heap_bases after the first refresh_peer_access(). But it will become one if an allocator ever remaps peer VA ranges.
Fix
After precomputing self._device_context in __init__, add an in-place update in refresh_peer_access():
self._device_context[2:2+self.num_ranks] = self.heap_bases
No allocation, CUDAGraph safe, one line.
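The fix above can be sketched as follows. This is a minimal stand-in class, not the real Iris implementation; the layout assumption (slot 0 = current rank, slot 1 = rank count, bases at offsets [2, 2 + num_ranks)) is hypothetical and only mirrors the slice used in the one-line fix:

```python
import torch

class HeapSketch:
    """Hypothetical sketch of the proposed fix: build the context tensor
    once in __init__, then patch it in place in refresh_peer_access()."""

    def __init__(self, cur_rank: int, num_ranks: int, heap_bases: torch.Tensor):
        self.cur_rank = cur_rank
        self.num_ranks = num_ranks
        self.heap_bases = heap_bases  # int64 tensor of peer heap base addresses
        # Precompute once (the #466 fix). Assumed layout:
        # [cur_rank, num_ranks, base_0, ..., base_{num_ranks-1}]
        self._device_context = torch.empty(2 + num_ranks, dtype=torch.int64)
        self._device_context[0] = cur_rank
        self._device_context[1] = num_ranks
        self._device_context[2:2 + num_ranks] = heap_bases

    def refresh_peer_access(self):
        # The proposed one-liner: an in-place slice copy. No new tensor is
        # allocated, so the storage address a captured CUDA graph (or any
        # kernel already holding the tensor) sees stays valid.
        self._device_context[2:2 + self.num_ranks] = self.heap_bases

    def get_device_context(self) -> torch.Tensor:
        return self._device_context
```

Because the update is in place, a caller that fetched the context tensor before the refresh observes the new bases through the same storage, which is what makes the fix CUDAGraph safe.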
Component
iris/iris.py, iris/symmetric_heap.py