Add reference examples and guidance (especially for multi-device scenarios) for ValidateCompiledModelCompatibilityInfo #29168
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2356,6 +2356,10 @@ struct OrtEp { | |
| * | ||
| * The returned string should be a null-terminated, UTF-8 encoded string. ORT will copy it. | ||
| * | ||
| * A single string is stored per EP (not per device). If the EP may later be validated against multiple devices | ||
| * (e.g., multi-adapter or multi-GPU), serialize enough information here to evaluate compatibility against each such | ||
| * device individually. See ValidateCompiledModelCompatibilityInfo for how the per-device verdicts are combined. | ||
| * | ||
| * \param[in] this_ptr The OrtEp instance. | ||
| * \param[in] graph The OrtGraph instance for which to generate compatibility information. | ||
| * | ||
|
|
@@ -2795,16 +2799,49 @@ struct OrtEpFactory { | |
|
|
||
| /** \brief Validate the compatibility of a compiled model with the execution provider factory for one or more devices. | ||
| * | ||
| * Given a compatibility info string produced during model compilation, the EP factory should determine whether the | ||
| * compiled model is compatible with the EP factory when targeting the provided hardware devices. All devices provided | ||
| * must belong to the same execution provider instance that this factory creates. | ||
| * | ||
| * The EP factory implementation should consider the set of devices (e.g., multi-adapter or multi-GPU scenarios) when | ||
| * evaluating compatibility and set `model_compatibility` accordingly. | ||
| * Given a compatibility info string produced during model compilation (see OrtEp::GetCompiledModelCompatibilityInfo), | ||
| * the EP factory determines whether the compiled model is compatible with the EP factory when targeting the provided | ||
| * hardware devices, and reports a single OrtCompiledModelCompatibility verdict for the whole set. | ||
| * | ||
| * All devices provided belong to the same execution provider instance that this factory creates. The set represents | ||
| * the devices the EP would run the model on *together* (e.g., multi-adapter or multi-GPU scenarios), NOT a menu of | ||
| * candidate placements to choose the best one from. Because the function returns a single verdict for the entire set, | ||
| * that verdict must describe running on the set as a whole (a conjunction): if the model cannot run on one of the | ||
| * devices the EP would use, the model cannot run on the set. | ||
| * | ||
| * A single-device EP (the common case) may ignore `devices`/`num_devices` and validate `compatibility_info` against | ||
| * its own configuration. The per-device algorithm below is required only when the EP may run a model across more | ||
| * than one device at once. | ||
| * | ||
| * Required implementation when num_devices > 1 (a "best of any device" result is NOT permitted -- a single verdict | ||
| * cannot convey which device it applies to, so ORT would otherwise be told a model is runnable on a set that | ||
| * contains a device it cannot run on): | ||
|
Comment on lines +2816 to +2818
Contributor
Wouldn't the user provide the same set of OrtEpDevice instances when running the model, with that being a hard requirement? For example, say the devices are CPU and NPU, and the model is optimal for the NPU. It's up to the EP which devices it actually uses at runtime, so as long as the optimal device is in that set, why should I downgrade the rating because of extra devices that I am not forced to use?
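To make the disagreement in this thread concrete, the sketch below reduces the same per-device verdicts under two policies: the "worst of the set" rule the doc comment requires, and the "best of any device" rule the comment above argues for. It is only an illustration. The enum is a local stand-in for the one in onnxruntime_c_api.h (its numeric values are an assumption), `Severity` and `Reduce` are hypothetical helpers that are not part of the PR, and the CPU verdict of EP_UNSUPPORTED is an assumed input, since the comment only states that the model is optimal for the NPU.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the enum in onnxruntime_c_api.h so the sketch compiles on its own.
// Enumerator names follow the diff; the numeric values are an assumption.
enum OrtCompiledModelCompatibility {
  OrtCompiledModelCompatibility_EP_NOT_APPLICABLE = 0,
  OrtCompiledModelCompatibility_EP_SUPPORTED_OPTIMAL,
  OrtCompiledModelCompatibility_EP_SUPPORTED_PREFER_RECOMPILATION,
  OrtCompiledModelCompatibility_EP_UNSUPPORTED,
};

namespace policy_sketch {

// Hypothetical helper: higher value means a worse verdict.
// EP_NOT_APPLICABLE never reaches this function because Reduce skips it.
inline int Severity(OrtCompiledModelCompatibility c) {
  switch (c) {
    case OrtCompiledModelCompatibility_EP_SUPPORTED_OPTIMAL:
      return 0;
    case OrtCompiledModelCompatibility_EP_SUPPORTED_PREFER_RECOMPILATION:
      return 1;
    default:
      // EP_UNSUPPORTED and any unknown value are treated as the worst case.
      return 2;
  }
}

// Hypothetical helper: reduces per-device verdicts to one verdict.
// take_worst == true  -> the conjunction rule from the doc comment.
// take_worst == false -> the "best of any device" alternative.
// EP_NOT_APPLICABLE is skipped in both modes; an all-neutral set stays neutral.
inline OrtCompiledModelCompatibility Reduce(const std::vector<OrtCompiledModelCompatibility>& verdicts,
                                            bool take_worst) {
  bool have_opinion = false;
  OrtCompiledModelCompatibility result = OrtCompiledModelCompatibility_EP_NOT_APPLICABLE;
  for (OrtCompiledModelCompatibility v : verdicts) {
    if (v == OrtCompiledModelCompatibility_EP_NOT_APPLICABLE) {
      continue;
    }
    if (!have_opinion) {
      result = v;
      have_opinion = true;
      continue;
    }
    const bool replace = take_worst ? Severity(v) > Severity(result) : Severity(v) < Severity(result);
    if (replace) {
      result = v;
    }
  }
  return result;
}

}  // namespace policy_sketch
```

With an assumed input of {CPU: EP_UNSUPPORTED, NPU: EP_SUPPORTED_OPTIMAL}, the first policy reports EP_UNSUPPORTED and the second reports EP_SUPPORTED_OPTIMAL. Which of the two is right depends on whether the device set means "devices used together" or "devices the EP may choose from", which is the question this thread raises.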
||
| * 1. Compute a per-device verdict (e.g., by validating `compatibility_info` against each device individually). | ||
| * 2. Combine the per-device verdicts into one. Treat EP_NOT_APPLICABLE as a neutral value (skip it) and take the | ||
| * worst of the remaining verdicts: | ||
| * - if any device is EP_UNSUPPORTED -> EP_UNSUPPORTED | ||
| * - else if any device is EP_SUPPORTED_PREFER_RECOMPILATION -> EP_SUPPORTED_PREFER_RECOMPILATION | ||
| * - else if at least one device is EP_SUPPORTED_OPTIMAL -> EP_SUPPORTED_OPTIMAL | ||
| * - else (every device was EP_NOT_APPLICABLE) -> EP_NOT_APPLICABLE | ||
| * Equivalently: report EP_SUPPORTED_OPTIMAL only if every device the EP has an opinion on is optimal. | ||
| * | ||
| * Choosing the verdict for the "no opinion" and "bad artifact" cases (the compatibility string is opaque to ORT and | ||
| * is interpreted only by the EP that produced it): | ||
| * - EP_NOT_APPLICABLE: the EP has no opinion -- `compatibility_info` is empty or was clearly produced by a | ||
| * different EP. ORT treats this as "no compiled artifact for this EP": session creation proceeds, and in model | ||
| * package variant selection the variant remains eligible but at the lowest priority. | ||
| * - EP_UNSUPPORTED: the string appears to be this EP's but the compiled model cannot run on the target | ||
| * hardware/configuration. ORT rejects it (fails session creation / excludes the variant). Returning | ||
| * EP_NOT_APPLICABLE for a corrupt or stale artifact that is actually this EP's would let it pass silently. | ||
| * | ||
| * Note that a single string is stored per EP (not per device), so GetCompiledModelCompatibilityInfo should serialize | ||
| * enough information to evaluate the string against every device the EP may later be asked to validate against. | ||
| * | ||
| * \param[in] this_ptr The OrtEpFactory instance. | ||
| * \param[in] devices Array of OrtHardwareDevice pointers that the EP would run on. All must map to this EP. | ||
| * \param[in] num_devices Number of entries in `devices`. | ||
| * \param[in] num_devices Number of entries in `devices`. May be 0 when no device-specific context is available; | ||
| * in that case evaluate `compatibility_info` against the EP's own configuration and do NOT | ||
| * dereference `devices`. | ||
|
Contributor
Do we ever call this with zero devices? OrtApis::GetModelCompatibilityForEpDevices requires at least one.
||
| * \param[in] compatibility_info The compatibility information string produced when the model was compiled. | ||
| * \param[out] model_compatibility OrtCompiledModelCompatibility value describing the compatibility of the model with the EP. | ||
| * | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| // Licensed under the MIT License. | ||
|
|
||
| #pragma once | ||
|
|
||
| #include "core/session/onnxruntime_c_api.h" | ||
|
|
||
| namespace example_ep { | ||
|
|
||
| // Maps an OrtCompiledModelCompatibility value to an ordinal where a lower rank is a "worse" verdict | ||
| // (EP_UNSUPPORTED < EP_SUPPORTED_PREFER_RECOMPILATION < EP_SUPPORTED_OPTIMAL). EP_NOT_APPLICABLE is the identity | ||
| // element of the fold and is handled by CombineCompatibility before this is called, so it falls into the | ||
| // conservative default below. | ||
| inline int RankCompatibility(OrtCompiledModelCompatibility c) { | ||
| switch (c) { | ||
| case OrtCompiledModelCompatibility_EP_UNSUPPORTED: | ||
| return 0; | ||
| case OrtCompiledModelCompatibility_EP_SUPPORTED_PREFER_RECOMPILATION: | ||
| return 1; | ||
| case OrtCompiledModelCompatibility_EP_SUPPORTED_OPTIMAL: | ||
| return 2; | ||
| default: | ||
| // Conservative: treat any unknown/unhandled value as the worst rank so an unexpected value can never be | ||
| // reported as compatible. Update this switch if new OrtCompiledModelCompatibility values are added. | ||
| return 0; | ||
| } | ||
| } | ||
|
|
||
| // Combines two per-device verdicts following the rule documented for | ||
| // OrtEpFactory::ValidateCompiledModelCompatibilityInfo in onnxruntime_ep_c_api.h: EP_NOT_APPLICABLE is a neutral | ||
| // identity (skipped), and otherwise the worst verdict wins. This is the reduction an EP folds over its per-device | ||
| // verdicts to produce the single verdict the API must return. | ||
| inline OrtCompiledModelCompatibility CombineCompatibility(OrtCompiledModelCompatibility acc, | ||
| OrtCompiledModelCompatibility next) { | ||
| if (next == OrtCompiledModelCompatibility_EP_NOT_APPLICABLE) { | ||
| return acc; | ||
| } | ||
| if (acc == OrtCompiledModelCompatibility_EP_NOT_APPLICABLE) { | ||
| return next; | ||
| } | ||
|
|
||
| // Take the verdict with the lower rank, i.e. the worse of the two. | ||
| return RankCompatibility(next) < RankCompatibility(acc) ? next : acc; | ||
| } | ||
|
|
||
| } // namespace example_ep |
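The header above only defines the pairwise reduction, so a minimal sketch of how an EP might fold it over its per-device verdicts may help. The two helpers are copied from the diff. The enum is a local stand-in for the one in onnxruntime_c_api.h so the snippet builds without ONNX Runtime (its numeric values are an assumption), and `FoldVerdicts` is a hypothetical name that does not appear in the PR.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the enum in onnxruntime_c_api.h so the sketch compiles on its own.
// Enumerator names follow the diff; the numeric values are an assumption.
enum OrtCompiledModelCompatibility {
  OrtCompiledModelCompatibility_EP_NOT_APPLICABLE = 0,
  OrtCompiledModelCompatibility_EP_SUPPORTED_OPTIMAL,
  OrtCompiledModelCompatibility_EP_SUPPORTED_PREFER_RECOMPILATION,
  OrtCompiledModelCompatibility_EP_UNSUPPORTED,
};

namespace example_ep {

// Copied from the header in this diff: lower rank means a worse verdict.
inline int RankCompatibility(OrtCompiledModelCompatibility c) {
  switch (c) {
    case OrtCompiledModelCompatibility_EP_UNSUPPORTED:
      return 0;
    case OrtCompiledModelCompatibility_EP_SUPPORTED_PREFER_RECOMPILATION:
      return 1;
    case OrtCompiledModelCompatibility_EP_SUPPORTED_OPTIMAL:
      return 2;
    default:
      return 0;  // conservative: unknown values rank as the worst verdict
  }
}

// Copied from the header in this diff: EP_NOT_APPLICABLE is neutral, otherwise the worse verdict wins.
inline OrtCompiledModelCompatibility CombineCompatibility(OrtCompiledModelCompatibility acc,
                                                          OrtCompiledModelCompatibility next) {
  if (next == OrtCompiledModelCompatibility_EP_NOT_APPLICABLE) {
    return acc;
  }
  if (acc == OrtCompiledModelCompatibility_EP_NOT_APPLICABLE) {
    return next;
  }
  return RankCompatibility(next) < RankCompatibility(acc) ? next : acc;
}

// Hypothetical helper (not in the PR): folds CombineCompatibility over all per-device verdicts.
// Starting from EP_NOT_APPLICABLE, the identity of the fold, means an empty or all-neutral
// input yields EP_NOT_APPLICABLE.
inline OrtCompiledModelCompatibility FoldVerdicts(const std::vector<OrtCompiledModelCompatibility>& per_device) {
  OrtCompiledModelCompatibility acc = OrtCompiledModelCompatibility_EP_NOT_APPLICABLE;
  for (OrtCompiledModelCompatibility verdict : per_device) {
    acc = CombineCompatibility(acc, verdict);
  }
  return acc;
}

}  // namespace example_ep
```

In a real EP the vector would presumably be filled by validating `compatibility_info` against each entry of `devices`. That step is EP-specific and is left out here.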
Maybe a concrete example would help. For example, the OV EP has OrtEpDevice instances for CPU, GPU, and NPU. How many calls to this API are we expecting, and with what combinations?