cuda::std::simd core functionalities#8251

Open
fbusato wants to merge 18 commits into NVIDIA:main from fbusato:simd-core

Contributor

@fbusato fbusato commented Mar 31, 2026

Address #30

Description

Implement the core functionality of C++26 std::simd. Reference: https://eel.is/c++draft/simd

This PR introduces a minimal, self-consistent set of features.
The main goal is to provide a C++26-compliant implementation; no specific optimizations have been introduced.

List of features:

  • cuda::std::simd namespace and main header.
  • ABI tags: simd_abi::fixed_size_simple, simd_abi::native.
  • Type traits: alignment, rebind, resize.
  • Exposition-only concepts and utilities.
  • Flag types: flags, flag_default, flag_convert, flag_aligned, flag_overaligned.
  • basic_vec and basic_mask classes.
  • fixed_size_simple (custom ABI) specialization.
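
To make the element-wise model behind a fixed-size ABI concrete, here is a minimal sketch, assuming a storage that keeps one array element per lane and applies operations in a plain loop. The names `simple_vec` and `add` are hypothetical and are not taken from this PR or from cuda::std::simd; the sketch only illustrates the idea of an unoptimized, conforming baseline.

```cpp
#include <cstddef>

// Hypothetical fixed-size storage: one array element per lane.
template <class T, std::size_t N>
struct simple_vec
{
  T data[N]{};
};

// Element-wise addition written as a plain loop over the lanes.
// No target-specific optimization is attempted.
template <class T, std::size_t N>
constexpr simple_vec<T, N> add(const simple_vec<T, N>& lhs, const simple_vec<T, N>& rhs) noexcept
{
  simple_vec<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
  {
    result.data[i] = lhs.data[i] + rhs.data[i];
  }
  return result;
}
```

Because everything is constexpr, such a baseline can be checked at compile time before any backend-specific specialization is added.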

To do:

  • Unit tests.
  • Documentation.

@fbusato fbusato self-assigned this Mar 31, 2026
@fbusato fbusato requested a review from a team as a code owner March 31, 2026 23:04
@fbusato fbusato added the libcu++ For all items related to libcu++ label Mar 31, 2026
@fbusato fbusato requested a review from ericniebler March 31, 2026 23:04
@fbusato fbusato added this to CCCL Mar 31, 2026
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 31, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Mar 31, 2026
@fbusato fbusato moved this from In Review to In Progress in CCCL Mar 31, 2026
@fbusato fbusato requested a review from miscco April 8, 2026 18:20
Contributor

github-actions bot commented Apr 8, 2026

😬 CI Workflow Results

🟥 Finished in 1h 18m: Pass: 46%/99 | Total: 20h 15m | Max: 59m 11s | Hits: 94%/35139

See results here.

_CCCL_PRAGMA_UNROLL_FULL()
for (__simd_size_type __i = 0; __i < _Np; ++__i)
{
__result.__data[__i] = __lhs.__data[__i] && __rhs.__data[__i];

Critical: This should be bitwise and

Suggested change
- __result.__data[__i] = __lhs.__data[__i] && __rhs.__data[__i];
+ __result.__data[__i] = __lhs.__data[__i] & __rhs.__data[__i];

_CCCL_PRAGMA_UNROLL_FULL()
for (__simd_size_type __i = 0; __i < _Np; ++__i)
{
__result.__data[__i] = __lhs.__data[__i] || __rhs.__data[__i];

Critical: This should be bitwise or

Suggested change
- __result.__data[__i] = __lhs.__data[__i] || __rhs.__data[__i];
+ __result.__data[__i] = __lhs.__data[__i] | __rhs.__data[__i];

_CCCL_PRAGMA_UNROLL_FULL()
for (__simd_size_type __i = 0; __i < _Np; ++__i)
{
__result.__data[__i] = __lhs.__data[__i] != __rhs.__data[__i];

Critical: This should be bitwise xor

Suggested change
- __result.__data[__i] = __lhs.__data[__i] != __rhs.__data[__i];
+ __result.__data[__i] = __lhs.__data[__i] ^ __rhs.__data[__i];
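
For context on the three comments above: when both operands are `bool`, as in the mask storage under review, the logical and bitwise forms produce the same values, and the bitwise forms differ only in that they never short-circuit. The following sketch checks this at compile time; `forms_agree` and `forms_agree_everywhere` are hypothetical helper names, not part of the PR.

```cpp
// Compare the logical and bitwise forms for one pair of bool inputs.
// The bitwise operators promote bool to int, so the result is converted back.
constexpr bool forms_agree(bool a, bool b) noexcept
{
  return ((a && b) == static_cast<bool>(a & b))
      && ((a || b) == static_cast<bool>(a | b))
      && ((a != b) == static_cast<bool>(a ^ b));
}

// Exhaustively cover all four input combinations.
constexpr bool forms_agree_everywhere() noexcept
{
  return forms_agree(false, false) && forms_agree(false, true)
      && forms_agree(true, false) && forms_agree(true, true);
}

static_assert(forms_agree_everywhere(), "logical and bitwise forms agree for bool operands");
```

The choice between the two spellings is therefore about intent and branch-free code generation rather than about the computed result.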


[[nodiscard]] _CCCL_API static constexpr __simd_size_type __min_index(const _MaskStorage& __s) noexcept
{
_CCCL_PRAGMA_UNROLL_FULL()

Question: Should we add an assert?
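
One way the question could be answered is sketched below, assuming the function expects at least one element of the mask to be true and should report a violation in debug builds. The names `mask_storage` and `min_index` are hypothetical stand-ins for the PR's internal types, and the standard `assert` macro is used here only for illustration; the project may prefer its own assertion facility.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the mask storage: one bool per lane.
template <std::size_t N>
struct mask_storage
{
  bool data[N]{};
};

// Returns the index of the first true lane.
// Assumed precondition: at least one lane is true.
template <std::size_t N>
constexpr std::size_t min_index(const mask_storage<N>& s) noexcept
{
  for (std::size_t i = 0; i < N; ++i)
  {
    if (s.data[i])
    {
      return i;
    }
  }
  // Reached only when the precondition is violated.
  assert(false && "min_index requires at least one true element");
  return N;
}
```

During constant evaluation, reaching the failing assertion makes the call ill-formed, so a violated precondition is also caught at compile time.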

Comment on lines +89 to +99
[[nodiscard]] _CCCL_API static constexpr _MaskStorage
__logic_and(const _MaskStorage& __lhs, const _MaskStorage& __rhs) noexcept
{
_MaskStorage __result{};
_CCCL_PRAGMA_UNROLL_FULL()
for (__simd_size_type __i = 0; __i < _Np; ++__i)
{
__result.__data[__i] = __lhs.__data[__i] && __rhs.__data[__i];
}
return __result;
}

Again, I believe all of those could just be

Suggested change
- [[nodiscard]] _CCCL_API static constexpr _MaskStorage
- __logic_and(const _MaskStorage& __lhs, const _MaskStorage& __rhs) noexcept
- {
- _MaskStorage __result{};
- _CCCL_PRAGMA_UNROLL_FULL()
- for (__simd_size_type __i = 0; __i < _Np; ++__i)
- {
- __result.__data[__i] = __lhs.__data[__i] && __rhs.__data[__i];
- }
- return __result;
- }
+ template <__simd_size_type... _Is>
+ [[nodiscard]] _CCCL_API static constexpr _MaskStorage
+ __logic_and(const _MaskStorage& __lhs, const _MaskStorage& __rhs, integer_sequence<__simd_size_type, _Is...> = {}) noexcept
+ {
+ return _MaskStorage{{(__lhs.__data[_Is] && __rhs.__data[_Is])...}};
+ }
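
The pack-expansion idea in this suggestion can be sketched in a self-contained form as follows. This is only an illustration under assumptions: `mask_storage`, `logic_and_impl`, and `logic_and` are hypothetical names, and the index pack is supplied by a forwarding overload through `std::make_index_sequence`, because a defaulted sequence parameter alone gives the compiler nothing to deduce the indices from.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the mask storage: one bool per lane.
template <std::size_t N>
struct mask_storage
{
  bool data[N]{};
};

// Builds the result directly from a pack expansion over the lane indices,
// so no loop and no separately default-initialized result object are needed.
template <std::size_t N, std::size_t... Is>
constexpr mask_storage<N>
logic_and_impl(const mask_storage<N>& lhs, const mask_storage<N>& rhs, std::index_sequence<Is...>) noexcept
{
  return mask_storage<N>{{(lhs.data[Is] && rhs.data[Is])...}};
}

// Public entry point: generates the indices 0..N-1 and forwards them.
template <std::size_t N>
constexpr mask_storage<N> logic_and(const mask_storage<N>& lhs, const mask_storage<N>& rhs) noexcept
{
  return logic_and_impl(lhs, rhs, std::make_index_sequence<N>{});
}
```

The trade-off is that large lane counts instantiate a correspondingly large index pack, which may affect compile time compared with the unrolled loop.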

{
static constexpr size_t __element_bytes = _Bytes;

bool __data[_Np]{}; // initialization required for constexpr constructor

It would be really awesome if we could find a way around this.

Labels

libcu++ For all items related to libcu++

Projects

Status: In Progress

2 participants