Fix indexedarray adjoint by aneangel · Pull Request #1504 · NVIDIA/warp

aneangel · 2026-06-02T01:20:44Z

Description

wp.indexedarray inputs were not differentiable: wp.Tape raised AttributeError
(no .grad), a manual adjoint launch segfaulted on CPU, and on CUDA it silently
produced zero gradients. This adds gradient support for indexed-array inputs so the
adjoint follows the gather indirection and accumulates into the base array's gradient.

Right now this handles only 1-D arrays; multi-dimensional support still needs to be implemented.

closes #1479

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

Test plan

uv run warp/tests/test_indexedarray.py
uv run warp/tests/test_indexedarray.py -k grad_1d

Verified gradients are correct on CPU and CUDA via both wp.Tape and a manual
adjoint launch (expected [0, 0.25, 0, 0.5, 0, 1.0]).
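
The expected values above can be reproduced without Warp. The following is a minimal NumPy sketch, assuming the same base array, weights, and gather indices as the example in this description; it is a reference calculation, not the code under test. The adjoint of a gather is a scatter-add, which `np.add.at` models here.

```python
import numpy as np

# Same data as the PR example: base = [1..6], gather indices [1, 3, 5].
base = np.linspace(1.0, 6.0, 6, dtype=np.float32)
weights = np.array([0.25, 0.5, 1.0], dtype=np.float32)
indices = np.array([1, 3, 5])

# Forward: total = sum_i base[indices[i]] * weights[i]
total = float(np.sum(base[indices] * weights))

# Backward: d(total)/d(base[indices[i]]) = weights[i]; every other slot stays zero.
# np.add.at accumulates, so repeated indices would sum their contributions.
grad = np.zeros_like(base)
np.add.at(grad, indices, weights)

print(total)          # 8.5
print(grad.tolist())  # [0.0, 0.25, 0.0, 0.5, 0.0, 1.0]
```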

Bug fix

import warp as wp
import numpy as np

@wp.kernel
def weighted_sample_sum(samples: wp.indexedarray[float], weights: wp.array[float], total: wp.array[float]):
    i = wp.tid()
    wp.atomic_add(total, 0, samples[i] * weights[i])

base = wp.array(np.linspace(1, 6, 6, dtype=np.float32), dtype=float, requires_grad=True)
weights = wp.array([0.25, 0.5, 1.0], dtype=float)
samples = wp.indexedarray1d(base, [wp.array([1, 3, 5], dtype=wp.int32)])
total = wp.zeros(1, dtype=float, requires_grad=True)

tape = wp.Tape()
with tape:
    wp.launch(weighted_sample_sum, dim=samples.size, inputs=[samples, weights], outputs=[total])
tape.backward(loss=total)
print(base.grad.numpy())   # without this PR: AttributeError / zeros / segfault

Summary by CodeRabbit

New Features
- Improved gradient support for indexed arrays in automatic differentiation; gradients now properly propagate through indexed array operations.
Tests
- Added tests validating gradient computation through indexed array inputs.

copy-pr-bot · 2026-06-02T01:20:47Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-02T01:20:56Z

📝 Walkthrough

This PR fixes adjoint gradient computation through indexedarray inputs by adding gradient view support to indexed arrays, implementing index-remapping adjoint address resolution in the native layer, updating kernel argument validation in adjoint mode, and adding comprehensive gradient propagation tests.

Changes

Indexedarray Adjoint Gradients

Layer / File(s)	Summary
Indexedarray gradient view support `warp/_src/types.py`	`indexedarray` class gains a `grad` property that returns an indexed view into `self.data.grad` using the same indices (up to `self.ndim`), or `None` if the base data or its gradient is missing.
Adjoint address resolution for indexed arrays `warp/native/array.h`	Two `adj_address` template overloads added for `indexedarray_t<T>` that resolve first-dimension index indirection via `buf.indices[0]` (handling negative indices), then route gradient accumulation to either an indexed adjoint buffer or the base array's embedded gradient.
Adjoint mode array validation `warp/_src/context.py`	`pack_arg()` adjoint-mode validation relaxed to accept both `isinstance(value, warp.array)` and values whose concrete type exactly matches `concrete_array_type(arg_type)`, enabling indexed gradient views to pass kernel argument checks.
Gradient propagation test `warp/tests/test_indexedarray.py`	New kernel `kernel_indexedarray_grad_1d` and test `test_indexedarray_grad_1d` verify that gradients correctly propagate through gather-weighted reduction via `indexedarray1d`, with validation that backward pass accumulates gradients into the base array at gathered indices.
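
The `grad` behaviour described in the first row of the table can be modelled in a few lines. This is a sketch under stated assumptions: `Buffer` and `IndexedView` are hypothetical stand-ins written for illustration, not the classes in `warp/_src/types.py`, and only the 1-D case is modelled.

```python
import numpy as np


class Buffer:
    """Hypothetical stand-in for a base array that may carry a gradient buffer."""

    def __init__(self, values, grad=None):
        self.values = np.asarray(values, dtype=np.float32)
        self.grad = grad  # another Buffer, or None when gradients are disabled


class IndexedView:
    """Hypothetical stand-in for a 1-D indexed (gather) view over a Buffer."""

    def __init__(self, data, indices):
        self.data = data
        self.indices = np.asarray(indices)

    @property
    def grad(self):
        # Mirrors the summary: None if the base or its gradient is missing,
        # otherwise a view over the base gradient that reuses the same indices.
        if self.data is None or self.data.grad is None:
            return None
        return IndexedView(self.data.grad, self.indices)

    def accumulate(self, i, value):
        # Writes go through the index indirection into the underlying buffer.
        self.data.values[self.indices[i]] += value


base = Buffer(np.linspace(1, 6, 6), grad=Buffer(np.zeros(6)))
samples = IndexedView(base, [1, 3, 5])

samples.grad.accumulate(0, 0.25)  # lands in base.grad at slot indices[0] == 1
base_grad_after = base.grad.values.tolist()

no_grad_view = IndexedView(Buffer(np.zeros(3)), [0]).grad  # base has no grad
```

Because the gradient view shares the base gradient's storage, anything accumulated through it is visible on `base.grad` directly, which is what lets a tape read `samples.grad` while the result ends up in the base array.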

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: implementing gradient/adjoint support for indexedarray, which is the primary objective of the PR.
Linked Issues check	✅ Passed	The PR fully implements the coding requirements from issue `#1479`: relaxed adjoint mode validation, added grad property to indexedarray, implemented adjoint address resolution for indexedarray, and provided tests validating correct gradient propagation.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to fixing indexedarray adjoint behavior: validation relaxation, grad property implementation, adjoint address overloads, and corresponding tests.

coderabbitai

🧹 Nitpick comments (1)

warp/native/array.h (1)
1321-1324: ⚡ Quick win

Consider adding a bounds check after index remapping.

After remapping the index through buf.indices[0], the forward pass verifies the result is within bounds (line 619). The adjoint pass should include a similar check to guard against invalid indices arrays:
if (buf.indices[0])
    i = buf.indices[0][i];
assert(i >= 0 && i < buf.arr.shape[0]);
While existing adjoint functions omit assertions for performance, validating remapped indices would catch external data corruption early.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@warp/native/array.h` around lines 1321 - 1324, Add a bounds check after
remapping the index via buf.indices[0] in the adjoint pass: after assigning i =
buf.indices[0][i] verify that i is within [0, buf.arr.shape[0]) and handle or
assert on failure. Specifically update the block that currently does "if
(buf.indices[0]) i = buf.indices[0][i];" to perform the bounds validation using
buf.arr.shape[0] (and buf.indices[0] and i) so corrupted or out-of-range index
mappings are caught early.
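
The suggested check can be illustrated with a small Python model of the index resolution. This is a sketch only: `resolve_base_index` is a hypothetical helper, the real logic lives in C++ in `warp/native/array.h`, and raising `IndexError` stands in for the proposed `assert`.

```python
def resolve_base_index(i, view_len, indices, base_len):
    """Map a view index to a base-array index, then validate the result.

    view_len models buf.shape[0], indices models buf.indices[0] (None when the
    dimension is not indexed), and base_len models buf.arr.shape[0].
    """
    if i < 0:
        i += view_len           # negative view indices wrap around
    if indices is not None:
        i = int(indices[i])     # follow the gather indirection
    if not 0 <= i < base_len:   # the bounds check proposed in the review
        raise IndexError(f"remapped index {i} is outside [0, {base_len})")
    return i


print(resolve_base_index(2, 3, [1, 3, 5], 6))   # 5
print(resolve_base_index(-1, 3, [1, 3, 5], 6))  # 5
```

With a corrupted index array such as `[1, 3, 9]`, the same call for `i = 2` raises instead of writing past the end of the gradient buffer.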

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Enterprise

Run ID: b2df894d-50f5-4f0f-b39b-6910f9539deb

📥 Commits

Reviewing files that changed from the base of the PR and between 81f95f8 and 6807f3a.

📒 Files selected for processing (4)

warp/_src/context.py
warp/_src/types.py
warp/native/array.h
warp/tests/test_indexedarray.py

greptile-apps · 2026-06-02T01:30:41Z

Greptile Summary

This PR fixes a long-standing bug (GH-1479) where wp.indexedarray inputs were not differentiable: wp.Tape raised AttributeError (no .grad), manual adjoint launches segfaulted on CPU, and CUDA silently produced zero gradients. The fix adds a grad property to indexedarray that returns an indexed view over the base array's gradient, updates pack_arg to accept either a plain warp.array or an indexedarray as an adjoint argument, and adds two new C++ adj_address overloads that resolve the gather indirection before accumulating into the base gradient.

Confidence Score: 4/5

Safe to merge for 1-D indexed array differentiation; multi-dim will fail at C++ compile time rather than produce wrong gradients.

The 1-D implementation is logically consistent across Python and C++: the grad property, the pack_arg relaxation, and both adj_address overloads correctly follow gather indirection before accumulating into base.grad. The main gap is that indexedarray.grad is accessible for ndim > 1 but C++ templates are absent, so multi-dim differentiation fails at kernel compilation with an unhelpful error. Test coverage is good for the happy path but misses tape.zero() and negative-index scenarios.

warp/_src/types.py — the grad property should guard against ndim > 1 to prevent opaque C++ template failures.

Important Files Changed

Filename	Overview
warp/_src/types.py	Adds `grad` property to `indexedarray`; correct for 1D, but no guard prevents multi-dim usage that lacks C++ support.
warp/native/array.h	Two new `adj_address` overloads for `indexedarray_t` correctly resolve 1D gather indirection; only 1D is supported (multi-dim overloads absent by design).
warp/_src/context.py	Minimal, correct relaxation of adjoint type-matching to accept `indexedarray` alongside plain `warp.array`.
warp/tests/test_indexedarray.py	New `test_indexedarray_grad_1d` validates forward values and gradient at indexed positions; lacks `tape.zero()` verification and negative-index coverage.

Reviews (1): Last reviewed commit: "1-D unit test as example"

greptile-apps · 2026-06-02T01:30:45Z

+    def grad(self):
+        if self.data is None or self.data.grad is None:
+            return None
+        return indexedarray(self.data.grad, self.indices[: self.ndim], ndim=self.ndim)


grad property silently permits ndim > 1 without C++ support

The property creates a valid-looking indexedarray for any ndim, but the C++ adj_address overloads added in this PR only exist for the 1-D signature. When a 2-D+ indexed array is differentiated, the generated adjoint kernel calls adj_address(indexedarray_t, i, j, ...), which has no matching template, producing an opaque C++ compilation error at kernel-launch time. Adding a guard here would surface a clear, actionable Python error instead.
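
One possible shape for such a guard is sketched below. It is an illustration, not the PR's code: `indexed_grad_or_error` is a hypothetical free function, the returned tuple stands in for the indexed gradient view, and the choice of `NotImplementedError` is an assumption.

```python
def indexed_grad_or_error(ndim, base_grad, indices):
    """Return a stand-in gradient view, failing early for unsupported ranks."""
    if base_grad is None:
        return None
    if ndim > 1:
        # Fail in Python with an actionable message instead of letting the
        # adjoint kernel hit a missing C++ overload at compile time.
        raise NotImplementedError(
            f"gradients through indexed arrays are only supported for ndim == 1, got ndim == {ndim}"
        )
    return (base_grad, indices[:ndim])


view = indexed_grad_or_error(1, [0.0] * 6, [[1, 3, 5]])

try:
    indexed_grad_or_error(2, [[0.0] * 3] * 2, [[0, 1], [0, 2]])
    guard_message = None
except NotImplementedError as exc:
    guard_message = str(exc)
```

Raising at the property keeps the failure close to the user's call site, whereas a template lookup failure only appears once the adjoint kernel is built.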

greptile-apps · 2026-06-02T01:30:46Z

+// indexedarray with a regular-array adjoint (as passed by the CUDA codegen): resolve the
+// index indirection, then accumulate into the base grad or the base array's embedded grad
+template <typename T>
+inline CUDA_CALLABLE void
+adj_address(const indexedarray_t<T>& buf, int i, const array_t<T>& adj_buf, int adj_i, const T& adj_output)
+{
+    if (i < 0)
+        i += buf.shape[0];
+    if (buf.indices[0])
+        i = buf.indices[0][i];
+
+    if (adj_buf.data)
+        adj_atomic_add(&index(adj_buf, i), adj_output);
+    else if (buf.arr.grad)
+        adj_atomic_add(&index_grad(buf.arr, i), adj_output);
+}


Second overload's triggering path is underdocumented

The comment says "as passed by the CUDA codegen", but after this PR the tape backward always passes an indexedarray_t (not array_t) as the adjoint via the new grad property. The array_t adjoint overload is actually needed for the manual adjoint-launch use case (e.g., wp.launch(..., adj_inputs=[plain_array], adjoint=True)), which was the path that segfaulted on CPU. Clarifying this in the comment would prevent confusion about which code path actually exercises this overload.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
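
The routing in the quoted overload can be modelled in NumPy to show the two destinations an adjoint contribution may take. This is a single-threaded sketch, assuming a base-sized plain array as the explicit adjoint; `adj_address_model` is a hypothetical name and a plain `+=` stands in for `adj_atomic_add`.

```python
import numpy as np


def adj_address_model(view_len, indices, base_grad, adj_buf, i, adj_output):
    """Model of the overload: resolve the index, then pick a destination."""
    if i < 0:
        i += view_len
    if indices is not None:
        i = int(indices[i])
    if adj_buf is not None:
        adj_buf[i] += adj_output    # explicit plain-array adjoint was supplied
    elif base_grad is not None:
        base_grad[i] += adj_output  # fall back to the base array's own grad


indices = np.array([1, 3, 5])
weights = [0.25, 0.5, 1.0]

# Path 1: a plain adjoint array is passed explicitly (manual adjoint launch).
explicit_adj = np.zeros(6, dtype=np.float32)
for tid, w in enumerate(weights):
    adj_address_model(3, indices, None, explicit_adj, tid, w)

# Path 2: no explicit adjoint, so contributions land in the base gradient.
base_grad = np.zeros(6, dtype=np.float32)
for tid, w in enumerate(weights):
    adj_address_model(3, indices, base_grad, None, tid, w)
```

Both paths produce the same gradient layout; they differ only in which buffer receives it.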

greptile-apps · 2026-06-02T01:30:47Z

+def test_indexedarray_grad_1d(test, device):
+    # gradients must flow back through a differentiable indexedarray input (GH-1479):
+    # the adjoint follows the gather indirection and accumulates into the base array's grad
+    base = wp.array(np.linspace(1.0, 6.0, 6, dtype=np.float32), dtype=float, device=device, requires_grad=True)
+    weights_np = np.array([0.25, 0.5, 1.0], dtype=np.float32)
+    weights = wp.array(weights_np, dtype=float, device=device)
+    indices = wp.array([1, 3, 5], dtype=int, device=device)
+    samples = wp.indexedarray1d(base, [indices])
+    total = wp.zeros(1, dtype=float, device=device, requires_grad=True)
+
+    tape = wp.Tape()
+    with tape:
+        wp.launch(
+            kernel_indexedarray_grad_1d, dim=samples.size, inputs=[samples, weights], outputs=[total], device=device
+        )
+
+    # forward: sum of base[i] * weight over the gathered indices
+    assert_np_equal(total.numpy(), np.array([8.5], dtype=np.float32), tol=1e-6)
+
+    tape.backward(loss=total)
+
+    # d(total)/d(base[j]) is the matching weight at each gathered index, zero elsewhere
+    expected = np.zeros(6, dtype=np.float32)
+    expected[[1, 3, 5]] = weights_np
+    assert_np_equal(base.grad.numpy(), expected, tol=1e-6)


Missing tape.zero() and negative-index coverage

Two behaviors introduced by this PR are not exercised: (1) tape.zero() now calls samples.grad.zero_(), which is an indexed fill_(0) over base.grad — a regression here would leave stale gradients across backward passes; (2) the C++ code handles negative index wrap-around (if (i < 0) i += buf.shape[0]), but there is no test that passes a negative index through the kernel and verifies gradient accumulation at the correct base-array slot.
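
The two missing cases can be stated as NumPy expectations that a Warp test could then assert against. This is a sketch of the expected arithmetic only, assuming the same indices as `test_indexedarray_grad_1d`; the stale value `9.0` in slot 0 is made up to show that an indexed fill touches only the gathered slots.

```python
import numpy as np

indices = np.array([1, 3, 5])

# (1) Zeroing through the indexed view: only gathered slots are cleared.
base_grad = np.array([9.0, 0.25, 0.0, 0.5, 0.0, 1.0], dtype=np.float32)
base_grad[indices] = 0.0
grad_after_zero = base_grad.tolist()

# (2) Negative view index: -1 wraps to the last view element, which maps
#     to base slot indices[2] == 5.
view_len = len(indices)
i = -1
if i < 0:
    i += view_len
slot = int(indices[i])
base_grad[slot] += 1.0
grad_after_negative = base_grad.tolist()

print(grad_after_zero)      # [9.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(slot)                 # 5
print(grad_after_negative)  # [9.0, 0.0, 0.0, 0.0, 0.0, 1.0]
```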

aneangel added 2 commits June 1, 2026 18:10

update: a 1-D example of possible fix

6951a98

1-D unit test as example

6807f3a

aneangel mentioned this pull request Jun 2, 2026

Fix indexedarray adjoint gradients #1479

Open

coderabbitai Bot reviewed Jun 2, 2026

greptile-apps Bot reviewed Jun 2, 2026

aneangel wants to merge 2 commits into NVIDIA:main from aneangel:aangeles/fix-indexedarray-adjoint
