Improve take performance on List arrays #9643

Open
AdamGS wants to merge 1 commit into apache:main from AdamGS:adamg/list-take-perf-improvement

AdamGS commented Apr 1, 2026

Which issue does this PR close?

  • Closes #NNN.

Rationale for this change

This PR builds on top of #9626, improving the results on those benchmarks.

What changes are included in this PR?

  1. Similar to #9625 ("Improve take_bytes perf in the null cases between 10-25%"), branch the function into separate null and non-null paths.
  2. Copy the list elements in a single pass while building the offsets, allocating less intermediate state.
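The two changes above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: `Lists`, `take_non_null`, `take_nullable`, and `take` are hypothetical names, plain `Vec`s stand in for Arrow buffers, and nulls in the source list array are ignored (only null indices are modelled).

```rust
/// Hypothetical stand-in for a list array: `offsets` has one more entry than
/// there are lists, and list `i` is `values[offsets[i]..offsets[i + 1]]`.
struct Lists {
    offsets: Vec<usize>,
    values: Vec<i32>,
}

/// Non-null path: a single pass that copies each selected list's elements and
/// records the running output offset, with no per-index null check.
fn take_non_null(src: &Lists, indices: &[usize]) -> Lists {
    let mut offsets = Vec::with_capacity(indices.len() + 1);
    let mut values = Vec::new();
    offsets.push(0);
    for &i in indices {
        let (start, end) = (src.offsets[i], src.offsets[i + 1]);
        values.extend_from_slice(&src.values[start..end]);
        offsets.push(values.len());
    }
    Lists { offsets, values }
}

/// Null path: a null index produces an empty slot that is marked invalid.
fn take_nullable(src: &Lists, indices: &[Option<usize>]) -> (Lists, Vec<bool>) {
    let mut offsets = Vec::with_capacity(indices.len() + 1);
    let mut values = Vec::new();
    let mut validity = Vec::with_capacity(indices.len());
    offsets.push(0);
    for idx in indices {
        if let Some(i) = *idx {
            let (start, end) = (src.offsets[i], src.offsets[i + 1]);
            values.extend_from_slice(&src.values[start..end]);
        }
        validity.push(idx.is_some());
        offsets.push(values.len());
    }
    (Lists { offsets, values }, validity)
}

/// Branch once up front so the all-valid case stays on the cheaper path.
fn take(src: &Lists, indices: &[Option<usize>]) -> (Lists, Vec<bool>) {
    if indices.iter().all(Option::is_some) {
        let dense: Vec<usize> = indices.iter().flatten().copied().collect();
        (take_non_null(src, &dense), vec![true; indices.len()])
    } else {
        take_nullable(src, indices)
    }
}
```

In this sketch the output offsets and values are built in the same loop, so no intermediate list of ranges or child indices is materialised before the copy.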

Are these changes tested?

Added a few tests for sliced list arrays.
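For context on why sliced arrays are worth testing separately: a sliced list array shares the parent's values, so its first offset is generally not zero. The snippet below is a stdlib-only illustration of that property, not one of the tests added in this PR; `list_at` is a hypothetical helper.

```rust
/// Returns list `i` of a (possibly sliced) list view.
/// `offsets` index into the full, shared `values` slice, so the first offset
/// of a sliced view is not necessarily zero.
fn list_at<'a>(offsets: &[usize], values: &'a [i32], i: usize) -> &'a [i32] {
    &values[offsets[i]..offsets[i + 1]]
}
```

A take implementation that assumes offsets start at zero would read the wrong elements for such a view.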

Are there any user-facing changes?

No

Signed-off-by: Adam Gutglick <adam@spiraldb.com>

AdamGS commented Apr 1, 2026

Results on the benchmarks in #9626:

take list i32 512       time:   [4.4872 µs 4.5048 µs 4.5246 µs]
                        change: [−12.029% −11.670% −11.245%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

take list i32 1024      time:   [8.1540 µs 8.1715 µs 8.1891 µs]
                        change: [−24.814% −22.002% −19.215%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

take list i32 null values 1024
                        time:   [5.5799 µs 5.6028 µs 5.6273 µs]
                        change: [−11.178% −4.1193% +8.6975%] (p = 0.67 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

take list i32 null indices 1024
                        time:   [7.9070 µs 7.9327 µs 7.9632 µs]
                        change: [−80.594% −80.504% −80.409%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

take list i32 null values null indices 1024
                        time:   [5.3172 µs 5.3387 µs 5.3660 µs]
                        change: [−14.330% −13.956% −13.587%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

github-actions bot added the arrow (Changes to the arrow crate) label Apr 1, 2026