[runtime/mppa] Add Support for Mppa PMU counters #53
Open
ElectrikSpace wants to merge 6 commits into xtc-tools:main from
The goal of this work on the runtime is to facilitate the future
integration of non-host targets. New features are also added, especially
for accelerators.
Create common interfaces that runtimes will need to implement.
This patch introduces a runtime base class named CommonRuntimeInterface,
which is shared by three kinds of runtimes:
- Host (no derived class for now)
- AcceleratorDevice for accelerators.
- EmbeddedDevice for external embedded processors.
Instances of AcceleratorDevice and EmbeddedDevice are called devices.
Unlike the current lazy runtime resolution, the concept of devices makes
it possible to handle multiple accelerators of the same class.
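The hierarchy described above could look roughly like the following sketch. Only the class names CommonRuntimeInterface, AcceleratorDevice and EmbeddedDevice come from the commit message; the `target_name` method, the `device_id` parameter and the `FakeAccelerator` class are assumptions made up for illustration and are not the actual XTC API.

```python
from abc import ABC, abstractmethod


class CommonRuntimeInterface(ABC):
    """Base interface shared by every runtime (name from the commit message)."""

    @abstractmethod
    def target_name(self) -> str:
        """Hypothetical method, used here only to have something to implement."""


class AcceleratorDevice(CommonRuntimeInterface):
    """Runtime for an accelerator; each instance represents one device."""

    def __init__(self, device_id: int = 0) -> None:
        # device_id is an assumption: it shows how several devices of the
        # same class could coexist.
        self.device_id = device_id


class EmbeddedDevice(CommonRuntimeInterface):
    """Runtime for an external embedded processor."""


class FakeAccelerator(AcceleratorDevice):
    """Hypothetical concrete device, not part of XTC."""

    def target_name(self) -> str:
        return f"fake-accel:{self.device_id}"


# Two devices of the same class, which lazy runtime resolution could not express.
devices = [FakeAccelerator(0), FakeAccelerator(1)]
names = [d.target_name() for d in devices]
```

Making devices plain instances rather than module-level state is what lets two accelerators of the same class be addressed independently.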
Apply the new runtime interfaces to the two existing runtimes:
- Create a HostRuntime singleton class that derives from CommonRuntimeInterface
- Create a GPUDevice singleton (for now) class that derives from AcceleratorDevice to implement the GPU runtime.
Some method implementations are shared across the two, but adding a specific implementation for a particular runtime is now easier. The GPU target has been completely split from the Host target. To avoid the code duplication that a clean split between the Host and GPU runtimes would otherwise cause, common code has been factored out into utils. With this rework of the runtimes, the call path has been simplified: confusing classes like Executor/Evaluator and functions like load_and_evaluate have been removed.
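One common way to write such a singleton in Python is to override `__new__`. The sketch below is an assumption about the pattern, not the code from this patch: `CommonRuntimeInterface` is reduced to an empty stub and HostRuntime carries no real methods.

```python
class CommonRuntimeInterface:
    """Empty stand-in for the real base interface (stub for this sketch)."""


class HostRuntime(CommonRuntimeInterface):
    """Singleton host runtime: every construction returns the same object."""

    _instance = None

    def __new__(cls):
        # Create the instance on first use, then keep returning it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


first = HostRuntime()
second = HostRuntime()
same_object = first is second
```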
When computation is offloaded to an accelerator device, the user can specify where input/output tensors live when the evaluation begins. This makes it possible to simulate weight tensors being transferred ahead of time. This feature is only supported for the MLIR backend, by setting a "memref.on_device" attribute.
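As a toy model of the idea (not the XTC or MLIR implementation), the sketch below marks each tensor with an `on_device` flag and lists the tensors that would still need a host-to-device copy when evaluation starts. `TensorArg` and `transfers_before_run` are hypothetical names; only the "memref.on_device" attribute name comes from the commit message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TensorArg:
    name: str
    # Hypothetical flag mirroring the "memref.on_device" attribute:
    # True means the tensor already lives on the device.
    on_device: bool


def transfers_before_run(args: List[TensorArg]) -> List[str]:
    """Names of tensors that are not yet on the device at evaluation start."""
    return [a.name for a in args if not a.on_device]


args = [
    TensorArg("weights", on_device=True),  # simulated ahead-of-time transfer
    TensorArg("input", on_device=False),
    TensorArg("output", on_device=False),
]
pending = transfers_before_run(args)
```

Here the weights are treated as already resident, so only the input and output tensors remain in the list.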
Create a new Mppa compilation target for MLIR. This target is the first one to implement the ahead-of-time offloading of tensors to the device.
Create a new Mppa runtime derived from AcceleratorDevice. Various aspects of the runtime can be configured (see the MppaConfig class). Execution is supported on ISS, Qemu, and hardware.
Note: In order to use the Mppa target, the Kalray Core Toolchain must be installed. mlir_sdist and mlir_mppa must also be installed.
Rely on the kvxuks-catch pass to catch micro-kernels, replacing the vectorization based on the transform dialect.
Add support for Mppa-specific counters in the existing interface for PMU counters. These counters are extracted from the list of all counters and processed separately. All counters are available in each KVX PE, so some post-processing is required to reduce the raw measured values. Several reduction functions are available, on both the PEs and the clusters:
- avg: average
- min: minimum
- max: maximum
- sum: sum
- <id>: select one specific cluster/PE id
Host counters still work. Only one csrcs perf_event function has been overridden, to enable Mppa counters by resetting previous values.
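The reductions listed above can be sketched as follows. This is a minimal illustration and not the code from this patch: the nested-list layout (one inner list of per-PE values per cluster), the function names, and the order of reduction (PEs first, then clusters) are all assumptions.

```python
from statistics import mean

# Named reductions from the commit message; any other key is read as an id.
REDUCERS = {"avg": mean, "min": min, "max": max, "sum": sum}


def reduce_values(values, how):
    """Apply a named reduction, or select one element when `how` is an id."""
    if how in REDUCERS:
        return REDUCERS[how](values)
    return values[int(how)]


def reduce_counter(raw, pe_reduce, cluster_reduce):
    """Reduce per-PE values inside each cluster, then reduce across clusters."""
    per_cluster = [reduce_values(pes, pe_reduce) for pes in raw]
    return reduce_values(per_cluster, cluster_reduce)


# Made-up raw values for one counter: 2 clusters with 4 PEs each.
raw = [[10, 20, 30, 40], [50, 60, 70, 80]]

busiest_cluster = reduce_counter(raw, "sum", "max")
overall_average = reduce_counter(raw, "avg", "avg")
pe3_of_cluster0 = reduce_counter(raw, "3", "0")
```

With these inputs, summing over PEs and taking the maximum over clusters gives 260, averaging at both levels gives 45, and selecting PE 3 of cluster 0 gives 40.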
guillon
approved these changes
Mar 3, 2026
Member
guillon
left a comment
Fine for me once previous MR is merged
Support Mppa PMU counters with the standard interface for counters in XTC.
DO NOT MERGE FIRST
-> Only the last commit contains the feature, the others are dependencies from: