Cortex-M: Add depthwise conv2d operator #16233

rascani · 2025-12-12T23:03:52Z

Summary

Add quantized depthwise convolution operator for the Cortex-M backend using CMSIS-NN's optimized arm_depthwise_conv_wrapper_s8 function.

Fixes #16105

Test plan

./backends/cortex_m/test/build_test_runner.sh
pytest --config-file=backends/arm/test/pytest.ini backends/cortex_m/test/ops/test_conv.py

Add quantized depthwise convolution operator for the Cortex-M backend using CMSIS-NN's optimized arm_depthwise_conv_wrapper_s8 function.

Key changes:
- New op_quantized_depthwise_conv2d.cpp with CMSIS-NN implementation
- Python operator registration in operators.py with reference implementation
- Operator schema definition in operators.yaml
- Updated ConvertToCortexMPass to automatically detect and route depthwise convolutions (where groups == input_channels) to the specialized operator
- Comprehensive test coverage with 5 test cases covering different depthwise convolution scenarios (stride, padding, bias, depth multiplier)

The implementation validates the depthwise constraint (groups must equal input channels) and supports NHWC layout, int8 quantization, per-channel requantization, and configurable stride/padding/dilation parameters.
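The routing rule described in this commit (groups == input_channels) can be illustrated with a minimal sketch. This is not the code from ConvertToCortexMPass; the function names and the returned operator names are hypothetical and exist only to show the decision.

```python
def is_depthwise_conv(groups: int, in_channels: int) -> bool:
    """Return True when each input channel forms its own group."""
    return in_channels > 0 and groups == in_channels


def select_operator(groups: int, in_channels: int) -> str:
    # Hypothetical operator names, used only to illustrate the routing decision.
    if is_depthwise_conv(groups, in_channels):
        return "quantized_depthwise_conv2d"
    return "quantized_conv2d"
```

For example, a convolution with 8 input channels and groups == 8 would be routed to the depthwise operator, while the same convolution with groups == 1 would stay on the regular path.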

…lidations

Key changes:
- Move depth_multiplier calculation from runtime to AOT pass (eliminates runtime division by computing depth_multiplier = output_channels / input_channels in the graph transformation pass)
- Add critical defensive validations in validate_depthwise_conv2d_arguments():
  * Validate IHWO weight layout (dimension 0 must be 1)
  * Validate dilation == 1 (CMSIS-NN constraint)
  * Validate depth_multiplier consistency with channel counts
- Fix CMSIS-NN API usage:
  * Use arm_depthwise_conv_wrapper_s8_get_buffer_size() with correct parameters
  * Improve buffer allocation error handling with detailed error messages
- Add _compute_depthwise_conv2d_output_shape() to read channels from correct dimension (dim 3 for IHWO layout vs dim 0 for OHWI)
- Update operator schema to use depth_multiplier parameter instead of groups

This ensures proper validation of CMSIS-NN constraints and moves computation to compile-time where possible.
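A rough sketch of the checks and shape computation listed in this commit, assuming NHWC inputs and IHWO weights of shape (1, KH, KW, O). The helper names below are hypothetical stand-ins, not the PR's validate_depthwise_conv2d_arguments() or _compute_depthwise_conv2d_output_shape(), and the output size uses the standard convolution formula.

```python
from typing import Sequence, Tuple


def compute_depth_multiplier(in_channels: int, out_channels: int) -> int:
    """Compute depth_multiplier ahead of time instead of dividing at runtime."""
    if in_channels <= 0 or out_channels % in_channels != 0:
        raise ValueError("output_channels must be a positive multiple of input_channels")
    return out_channels // in_channels


def validate_depthwise_args(
    input_shape: Sequence[int],   # NHWC
    weight_shape: Sequence[int],  # IHWO, assumed (1, KH, KW, O)
    dilation: Sequence[int],
    depth_multiplier: int,
) -> None:
    in_channels = input_shape[3]
    if weight_shape[0] != 1:
        raise ValueError("IHWO weights must have dimension 0 equal to 1")
    if tuple(dilation) != (1, 1):
        raise ValueError("dilation must be 1")
    if weight_shape[3] != in_channels * depth_multiplier:
        raise ValueError("depth_multiplier is inconsistent with channel counts")


def depthwise_output_shape(
    input_shape: Sequence[int],
    weight_shape: Sequence[int],
    stride: Sequence[int],
    padding: Sequence[int],
    dilation: Sequence[int] = (1, 1),
) -> Tuple[int, int, int, int]:
    n, h, w, _ = input_shape
    # Output channels come from dim 3 of the IHWO weights (dim 0 for OHWI).
    _, kh, kw, out_channels = weight_shape
    out_h = (h + 2 * padding[0] - dilation[0] * (kh - 1) - 1) // stride[0] + 1
    out_w = (w + 2 * padding[1] - dilation[1] * (kw - 1) - 1) // stride[1] + 1
    return (n, out_h, out_w, out_channels)
```

With an input of shape (1, 8, 8, 4), weights of shape (1, 3, 3, 8), stride 2 and padding 1, the depth multiplier is 2 and the computed output shape is (1, 4, 4, 8).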

CMSIS-NN arm_depthwise_conv_wrapper_s8 only supports batch size 1. Add validation in both AOT pass (fail during compilation) and runtime (defensive check).

Add 6 test cases covering edge cases:
- Combined stride/padding/bias
- 1x1 kernels (common in mobile networks)
- Higher depth_multiplier (4)
- Asymmetric kernels (1x3)
- Asymmetric stride/padding
- Larger kernels (5x5)

Fix depthwise_conv2d_stride test to use batch size 1.
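The compile-time half of the batch-size check could look roughly like the sketch below. The function name and error message are hypothetical, and an NHWC input shape is assumed; the runtime check in the C++ operator is not shown.

```python
from typing import Sequence


def check_batch_size(input_shape: Sequence[int]) -> None:
    """Reject inputs whose batch dimension (dim 0 in NHWC) is not 1."""
    batch = input_shape[0]
    if batch != 1:
        raise ValueError(
            f"depthwise conv2d supports batch size 1 only, got {batch}"
        )
```

Raising during the AOT pass means an unsupported model fails at compilation rather than on the device.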

pytorch-bot · 2025-12-12T23:03:56Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16233

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under '/home/ec2-user/actions-runner/_work/pytorch/pytorch/.github/actions/check-tpu'

✅ You can merge normally! (1 Unrelated Failure)

As of commit d68b40a with merge base c3a53f3:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / android / run-emulator (gh) (#16137)
Timeout waiting for emulator to boot.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

RJ Ascani added 4 commits December 10, 2025 15:52

Fix formatting

e3fcc86

rascani added the "release notes: none" label (Do not include this in the release notes) on Dec 12, 2025

meta-cla bot added the "CLA Signed" label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) on Dec 12, 2025


2 participants
