Hnimrama/ix atom by hnimra-amd · Pull Request #238 · ROCm/cvs

hnimra-amd · 2026-06-23T23:27:31Z

PR #238 — InferenceX ATOM W1 (MI300X perf gates)

PR: #238
Head: hnimrama/IX-atom → Base: dev/dtni

Motivation

CVS needs a first-class InferenceX ATOM automation path aligned with the DTNI Validation Tracker (IX ATOM), rather than the legacy inferencemax_single uplift. This PR:

- Introduces the inferencex_atom_single suite with an ATOM-native driver (atom.entrypoints.openai_server + atom.benchmarks.benchmark_serving).
- Ships W1 DeepSeek R1 FP8 variant configs for MI300X and MI355X (perf, smoke, MTP3) with calibrated / CI-seeded thresholds.
- Closes M1 / Phase A on MI300X: W1 perf with enforce_thresholds: true after lab confirmation.
- Documents the IX-atom roadmap (W1–W18, accuracy, metric tiers, parity frameworks) in plans/inferencex-atom-cvs-automation-plan.md.

MI355X variant configs ship with enforce_thresholds: false until hardware is available — they do not block merge or MI300X milestone work.

Base branch: dev/dtni.

Technical Details

Suite and orchestration

- Rename inferencemax_single → inferencex_atom_single; legacy inferencemax/ configs removed.
- New InferenceXAtomJob (inferencex_atom_orch.py):
  - params.driver=atom → ATOM server + benchmark_serving; JSON artifacts parsed via to_client_metrics.
  - params.driver=vllm retained for interim GPT-OSS uplift variants only.
- Canonical config layout: flat sibling pairs under cvs/input/config_file/inference/inferencex_atom_single/:
  - {gpu}_inferencex-atom-single_{model}_{precision}[_{mode}]_config.json
  - {gpu}_inferencex-atom-single_{model}_{precision}[_{mode}]_threshold.json
- schema_version: 1, typed loader, and ix_recipes.json recipe pins (dsr1-fp8-mi300x-atom, etc.).
- Cluster examples: mi300x_atom_single.json, mi355x_atom_single.json; container names pinned (inferencex_atom_mi300x / inferencex_atom_mi355x).
- Shared suite helpers: inference_suite_lifecycle.py, inference_suite_results_table.py.
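
The flat sibling-pair naming above can be illustrated with a small helper. This is only a sketch of the documented filename pattern; `variant_stem` and `sibling_pair` are hypothetical names and are not taken from the CVS code base.

```python
from typing import Optional, Tuple

# Literal middle segment of the filename pattern described above.
SUITE_TAG = "inferencex-atom-single"


def variant_stem(gpu: str, model: str, precision: str, mode: Optional[str] = None) -> str:
    """Build '{gpu}_inferencex-atom-single_{model}_{precision}[_{mode}]'."""
    parts = [gpu, SUITE_TAG, model, precision]
    if mode:  # the mode segment is optional in the documented pattern
        parts.append(mode)
    return "_".join(parts)


def sibling_pair(gpu: str, model: str, precision: str, mode: Optional[str] = None) -> Tuple[str, str]:
    """Return the (config, threshold) filenames that share one variant stem."""
    stem = variant_stem(gpu, model, precision, mode)
    return f"{stem}_config.json", f"{stem}_threshold.json"


config_name, threshold_name = sibling_pair("mi300x", "deepseek-r1", "fp8", "perf")
print(config_name)
print(threshold_name)
```

Keeping both files on one stem means a config and its threshold file can be paired by name alone.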

W1 variants shipped

| Variant stem | Arch | `enforce_thresholds` | Notes |
| --- | --- | --- | --- |
| `mi300x_inferencex-atom-single_deepseek-r1_fp8_perf` | MI300X | `true` | M1 gate: ISL=OSL=1024, CONC=128/256, 1000 prompts |
| `mi300x_inferencex-atom-single_deepseek-r1_fp8_smoke` | MI300X | `false` | 128-prompt pre-gate |
| `mi300x_inferencex-atom-single_deepseek-r1_fp8_mtp3` | MI300X | `false` | MTP3 recipe |
| `mi355x_inferencex-atom-single_deepseek-r1_fp8_perf` | MI355X | `false` | CI-seeded thresholds (plan Section 4.3) |
| `mi355x_inferencex-atom-single_deepseek-r1_fp8_mtp3` | MI355X | `false` | CI-seeded |

Threshold / metrics plumbing

- Tiered gates via test_cell_metrics (METRIC_TIERS: throughput, ttft, tpot, health, record): one pytest row per tier per sweep cell.
- to_client_metrics: derives client.failed when ATOM omits it (num_prompts - completed), then computes client.success_rate; adds client.output_tput_per_gpu.
- The tpot tier gates on p99_tpot_ms: ATOM benchmark_serving emits p99 tails, and p95_tpot_ms may be absent even with metric_percentiles: "95,99". Tier enforcement skips metrics missing from the artifact.
- MI300X W1 perf thresholds were recalibrated from the 2026-06-25 lab run (throughput mins = measured × 0.9; latency maxes = measured × 1.1).
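
The client.failed derivation described above can be sketched as follows. The function name, the artifact keys (`completed`, `failed`, `output_throughput`), the success-rate formula, and the per-GPU division are assumptions made for illustration; the PR text only states that client.failed falls back to num_prompts - completed and that the other two metrics are computed.

```python
from typing import Any, Dict


def to_client_metrics_sketch(artifact: Dict[str, Any], num_prompts: int, num_gpus: int) -> Dict[str, float]:
    """Map a benchmark JSON artifact to client.* metrics (illustrative only)."""
    completed = int(artifact["completed"])
    # Fall back to num_prompts - completed when the artifact has no failure count.
    failed = int(artifact["failed"]) if "failed" in artifact else num_prompts - completed
    metrics = {
        "client.completed": float(completed),
        "client.failed": float(failed),
        # Assumed definition: share of requested prompts that did not fail.
        "client.success_rate": (num_prompts - failed) / num_prompts if num_prompts else 0.0,
    }
    if "output_throughput" in artifact:
        tput = float(artifact["output_throughput"])
        metrics["client.output_throughput"] = tput
        # Assumed definition: aggregate output throughput divided by GPU count.
        metrics["client.output_tput_per_gpu"] = tput / num_gpus
    return metrics


# Artifact without a "failed" key, using the CONC=128 throughput from the lab run.
example = to_client_metrics_sketch(
    {"completed": 1000, "output_throughput": 2867.0}, num_prompts=1000, num_gpus=8
)
print(example)
```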

Platform / shared infra (supporting changes)

- cvs/lib/utils/: shared config_loader, verdict, sweep selector.
- vllm_single suite refactored to the same lifecycle / metric pattern.
- Runbook: cvs/input/config_file/inference/inferencex_atom_single/README.md.

Out of scope (follow-up on `dev/dtni`)

- M2 gsm8k accuracy
- M3 P1 workloads (W2, W3, W13, W17)
- M4 inferencex_atom_vllm_single / inferencex_atom_sglang_single parity frameworks
- M5 multi-node (prioritized immediately after M4 parity)

Test Plan

CI / unit (no GPU)

pytest cvs/tests/inference/inferencex_atom/ -q
pytest cvs/lib/inference/unittests/test_inferencex_atom_parsing.py -q
pytest cvs/lib/inference/unittests/test_inferencex_atom_config_loader.py -q
Config loader / sweep selector / orch parse tests pass

Lab — MI300X smoke

Launcher: CTR-SVDT-L005 (10.7.54.167) · GPU node: 10.245.135.75

cd ~/cvs && make install && source .cvs_venv/bin/activate

SMOKE_DIR=~/input/config_file/inference/inferencex_atom_single/smoke
mkdir -p "$SMOKE_DIR"
cvs copy-config inference/inferencex_atom_single/mi300x_inferencex-atom-single_deepseek-r1_fp8_smoke_config.json \
  --output "$SMOKE_DIR/mi300x_inferencex-atom-single_deepseek-r1_fp8_smoke_config.json"
cvs copy-config inference/inferencex_atom_single/mi300x_inferencex-atom-single_deepseek-r1_fp8_smoke_threshold.json \
  --output "$SMOKE_DIR/mi300x_inferencex-atom-single_deepseek-r1_fp8_smoke_threshold.json"

TS=$(date +%Y%m%d_%H%M%S)
HTML="$HOME/cvs_results/${TS}_ix-atom-smoke_mi300x.html"
LOG="$HOME/cvs_results/${TS}_ix-atom-smoke_mi300x.log"

cvs run inferencex_atom_single \
  --cluster_file ~/input/cluster_file/mi300x_atom_single.json \
  --config_file "$SMOKE_DIR/mi300x_inferencex-atom-single_deepseek-r1_fp8_smoke_config.json" \
  --html="$HTML" --self-contained-html --log-file="$LOG" -vvv -s

11 passed, 0 failed (~13 min)

Lab — MI300X W1 perf (M1 gate)

cd ~/cvs && make install && source .cvs_venv/bin/activate

PERF_DIR=~/input/config_file/inference/inferencex_atom_single/perf
mkdir -p "$PERF_DIR"
cvs copy-config inference/inferencex_atom_single/mi300x_inferencex-atom-single_deepseek-r1_fp8_perf_config.json \
  --output "$PERF_DIR/mi300x_inferencex-atom-single_deepseek-r1_fp8_perf_config.json"
cvs copy-config inference/inferencex_atom_single/mi300x_inferencex-atom-single_deepseek-r1_fp8_perf_threshold.json \
  --output "$PERF_DIR/mi300x_inferencex-atom-single_deepseek-r1_fp8_perf_threshold.json"

TS=$(date +%Y%m%d_%H%M%S)
HTML="$HOME/cvs_results/${TS}_ix-atom-w1-perf_mi300x.html"
LOG="$HOME/cvs_results/${TS}_ix-atom-w1-perf_mi300x.log"

cvs run inferencex_atom_single \
  --cluster_file ~/input/cluster_file/mi300x_atom_single.json \
  --config_file "$PERF_DIR/mi300x_inferencex-atom-single_deepseek-r1_fp8_perf_config.json" \
  --html="$HTML" --self-contained-html --log-file="$LOG" -vvv -s

Lifecycle stages pass (container, model fetch, server start, benchmark client, teardown).
Both sweep cells: CONC=128, CONC=256 (ISL=OSL=1024, TP=8); server reused across cells.
All metric tiers pass under enforce_thresholds: true.
HTML report attached to PR.
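
A tier check consistent with the expectations above might look like the sketch below. The tier-to-metric mapping, the threshold shape (`min` / `max`), the ttft metric names, and the reading that enforce_thresholds: false reports violations without failing are all assumptions; only the tier names, client.output_throughput, p99_tpot_ms, and the skip-if-missing behavior come from the PR text.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping; the health and record tiers are omitted for brevity.
TIER_METRICS: Dict[str, List[str]] = {
    "throughput": ["client.output_throughput"],
    "ttft": ["client.mean_ttft_ms", "client.p99_ttft_ms"],
    "tpot": ["client.p99_tpot_ms"],
}


def evaluate_tier(
    tier: str,
    actuals: Dict[str, float],
    thresholds: Dict[str, Dict[str, float]],
    enforce: bool,
) -> Tuple[bool, List[str]]:
    """Return (passed, violations) for one tier of one sweep cell."""
    violations: List[str] = []
    for metric in TIER_METRICS[tier]:
        spec = thresholds.get(metric)
        if spec is None or metric not in actuals:
            continue  # metrics missing from the artifact are skipped, not failed
        value = actuals[metric]
        if "min" in spec and value < spec["min"]:
            violations.append(f"{metric}={value} below min {spec['min']}")
        if "max" in spec and value > spec["max"]:
            violations.append(f"{metric}={value} above max {spec['max']}")
    # Assumed semantics: without enforcement, violations are recorded but do not fail.
    return (not violations) or (not enforce), violations


# CONC=128 values reported in the Test Result section of this PR.
thresholds = {
    "client.output_throughput": {"min": 2580.0},
    "client.mean_ttft_ms": {"max": 892.0},
    "client.p99_ttft_ms": {"max": 7162.0},
    "client.p99_tpot_ms": {"max": 51.4},
}
actuals = {
    "client.output_throughput": 2867.0,
    "client.mean_ttft_ms": 811.0,
    "client.p99_ttft_ms": 6511.0,
    "client.p99_tpot_ms": 46.7,
}
for tier in TIER_METRICS:
    print(tier, evaluate_tier(tier, actuals, thresholds, enforce=True))
```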

MI355X

Not required for merge — configs ship; flip enforce_thresholds when hardware is available.

Test Result

MI300X lab — `mi300x_inferencex-atom-single_deepseek-r1_fp8_perf`

Branch: hnimrama/IX-atom @ 07c90a7
Launcher: CTR-SVDT-L005 (10.7.54.167) · GPU node: 10.245.135.75
Install: make install from repo root (.cvs_venv)
Image: rocm/atom-dev:latest
Model: deepseek-ai/DeepSeek-R1-0528 (FP8, TP=8)
Outcome: 17 passed, 0 failed (~22 min)

| Cell | `client.output_throughput` (measured) | Threshold (min) | Result |
| --- | --- | --- | --- |
| CONC=128 | 2867 tok/s | 2580 tok/s | PASS |
| CONC=256 | 4697 tok/s | 4227 tok/s | PASS |

| Cell | TTFT / TPOT gates | Result |
| --- | --- | --- |
| CONC=128 | mean TTFT 811 ms (max 892); p99 TTFT 6511 ms (max 7162); mean TPOT 42.5 ms; p99 TPOT 46.7 ms (max 51.4) | PASS |
| CONC=256 | mean TTFT 728 ms; p99 TPOT 59.7 ms (max 65.6) | PASS |
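
As a worked check of the calibration rule stated in the Technical Details (throughput mins = measured × 0.9; latency maxes = measured × 1.1), the sketch below recomputes several of the gates from the rounded measurements in the tables. The helper names and the rounding precision are assumptions.

```python
def throughput_floor(measured: float, ndigits: int = 0) -> float:
    """Minimum gate: 90% of the measured throughput."""
    return round(measured * 0.9, ndigits)


def latency_cap(measured: float, ndigits: int = 0) -> float:
    """Maximum gate: 110% of the measured latency."""
    return round(measured * 1.1, ndigits)


print(throughput_floor(2867))   # CONC=128 output throughput, tok/s
print(throughput_floor(4697))   # CONC=256 output throughput, tok/s
print(latency_cap(811))         # CONC=128 mean TTFT, ms
print(latency_cap(6511))        # CONC=128 p99 TTFT, ms
print(latency_cap(46.7, 1))     # CONC=128 p99 TPOT, ms
```

Gates recomputed from the rounded table values can differ from the shipped thresholds in the last digit, presumably because the thresholds were derived from unrounded measurements.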

MI300X lab — smoke

Outcome: 11 passed, 0 failed (~13 min)

Artifacts (attach to PR)

inferencex_atom_single_2026-06-25T175704.zip

Unit tests

CI / local unit suite green on PR branch.

Submission Checklist

PR open: #238
Jira linked: AIMVT-236, AIMVT-244
Base branch is dev/dtni
Lab run used make install from this branch (not a stale site-packages install)
MI300X W1 perf + smoke HTML reports attached
enforce_thresholds: true only on MI300X W1 perf — confirmed in lab
MI355X variants left at enforce_thresholds: false (pending hardware)
Plan doc reviewed for milestone alignment
Reviewer aware M2 (gsm8k), M3 (W2/W3/W13/W17), and M5 (multi-node) are follow-ups on dev/dtni

Wheels often omit vllm/benchmarks; resolve the driver via eval exports, run python -m vllm.entrypoints.cli.main bench serve when needed, and fail fast on missing-script log patterns in InferenceMax and base polling.

…st polling Gate benchmark success on Failed requests only after the summary is present; tail more client log lines for InferenceMax. Variant and benchmark_params accept bench_max_failed_requests (default 0 remains strict for CI).

atnair-amd

Automated review from five-pass analysis (structure · duplication · unit tests · code quality · live validation run).

Blockers (must fix before merge): false-green on missing config_file · parse_results error paths untested · _client_log_failures untested
Majors: driver default guard · config/threshold structure diverges from vllm · server reuse helpers untested · build_server_cmd suppression untested · _merged_serve_args untested · tier-explosion untested · reuse_server_across_sweep default · no ATOM early-failure detection · wheel not in shared venv
Minors/NITs: see inline comments

No changes are requested without reviewer approval — comments only.

amd-droy

looks good to me. thanks @hnimra-amd

atnair-amd

lgtm @hnimra-amd

hnimra-amd marked this pull request as draft June 23, 2026 23:28

hnimra-amd mentioned this pull request Jun 24, 2026

Hnimrama/inferencemax uplift restore #229

Open

1 task

hnimra-amd marked this pull request as ready for review June 24, 2026 16:28

hnimra-amd requested review from amd-droy, anujmittal-amd and atnair-amd June 24, 2026 16:28

hnimra-amd added a commit that referenced this pull request Jun 24, 2026

docs: add PR #238 description draft for IX-atom M1 merge.

773ed14

Stacked PR body covering motivation, technical details, test plan, lab results, and checklist targeting dev/dtni.

hnimra-amd force-pushed the hnimrama/IX-atom branch from d62122b to e17ab7f Compare June 25, 2026 17:30

hnimra-amd added 22 commits June 25, 2026 13:09

Restore InferenceMax uplift reverted by #228

34487f9

Reverts commit 4a8425f, restoring the changes from PR #225 on dev/dtni.

fix(inference): run vLLM bench client with vLLM interpreter

dfe6ac4

Probe python3.13..python3 for import vllm; export BENCH_PY and BENCH_SCRIPT. Use shlex.quote for docker exec bash -c. Align InferenceMax client completion with Serving Benchmark Result or End-to-end Latency.
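
The quoting concern in this commit can be sketched with the standard-library `shlex.quote`. The intermediate interpreter names, the exact probe script, and both function names are assumptions for illustration; the commit message only names the python3.13..python3 range, the BENCH_PY export, and the use of shlex.quote for `docker exec bash -c`.

```python
import shlex
from typing import List

# Assumed expansion of "python3.13..python3"; the real candidate list may differ.
CANDIDATES: List[str] = ["python3.13", "python3.12", "python3.11", "python3.10", "python3"]


def probe_script(candidates: List[str]) -> str:
    """Shell snippet that exports BENCH_PY as the first interpreter able to import vllm."""
    names = " ".join(shlex.quote(name) for name in candidates)
    return (
        f"for py in {names}; do "
        'if "$py" -c "import vllm" >/dev/null 2>&1; then '
        'export BENCH_PY="$py"; break; fi; done; '
        'echo "BENCH_PY=${BENCH_PY:-}"'
    )


def docker_exec_cmd(container: str, script: str) -> str:
    """Wrap a script for `docker exec ... bash -c`, quoting it as one shell argument."""
    return f"docker exec {shlex.quote(container)} bash -c {shlex.quote(script)}"


script = probe_script(CANDIDATES)
cmd = docker_exec_cmd("inferencex_atom_mi300x", script)
print(cmd)
```

Without the quoting, the `$py` and `${BENCH_PY:-}` expansions would be evaluated by the launcher's shell instead of inside the container.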

fix(dtni): broaden vLLM benchmark script discovery

870eac2

Search site-packages and ancestor paths, verify the file is readable, and document vllm[bench] when wheels omit benchmarks/.

fix(inference): harden InferenceMax server startup and GPU mem env

e933d5e

Use CVS_GPU_MEMORY_UTIL in sample config and serve script to avoid vLLM unknown-env warnings. Extend default readiness poll budget to 60 and grep full server logs so Uvicorn ready is not missed after long model loads.

fix(dtni): clamp bench random-range to max_model_length

8e9ba40

vLLM random workloads scale (ISL+OSL)*(1+r); clamp ratio when it would exceed MML, pass --temperature 0 for greedy parity, and forward --metric-percentiles in InferenceMax and vllm_single clients.
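
The clamp follows from the scaling rule in the commit message: if the longest request is (ISL+OSL)*(1+r), then r must not exceed MML/(ISL+OSL) - 1. A minimal sketch, with a hypothetical function name and example values:

```python
def clamp_range_ratio(isl: int, osl: int, ratio: float, max_model_len: int) -> float:
    """Largest ratio <= `ratio` such that (isl + osl) * (1 + ratio) <= max_model_len."""
    base = isl + osl
    if base <= 0:
        raise ValueError("isl + osl must be positive")
    if base > max_model_len:
        # Even ratio 0 would exceed the context window; clamping cannot help.
        raise ValueError("isl + osl already exceeds max_model_len")
    headroom = max_model_len / base - 1.0
    return min(ratio, headroom)


# ISL=OSL=1024 as in the W1 sweep; the ratio and max_model_len values are made up.
print(clamp_range_ratio(1024, 1024, 0.5, 2560))
print(clamp_range_ratio(1024, 1024, 0.1, 4096))
```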

fix(inference): extend InferenceMax bench client poll budget

e2f6aeb

Read client_poll_count and client_poll_wait_time from benchmark_params (defaults 50/60), document them and fix the inferencemax.rst table, and surface the keys in sample MI300X/MI355X configs.
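
A poll budget driven by these two keys could be read as in the sketch below. It assumes that "defaults 50/60" means 50 polls with a 60-second wait between them, and `poll_until_done` is a hypothetical helper rather than the CVS implementation.

```python
import time
from typing import Callable, List, Mapping


def poll_until_done(
    is_done: Callable[[], bool],
    benchmark_params: Mapping[str, float],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_done` up to client_poll_count times, waiting between attempts."""
    count = int(benchmark_params.get("client_poll_count", 50))
    wait = float(benchmark_params.get("client_poll_wait_time", 60))
    for _ in range(count):
        if is_done():
            return True
        sleep(wait)
    return False  # budget exhausted without seeing completion


# Demo with a fake sleep so the example returns immediately.
states = iter([False, False, True])
sleeps: List[float] = []
finished = poll_until_done(
    lambda: next(states),
    {"client_poll_count": 5, "client_poll_wait_time": 0},
    sleep=sleeps.append,
)
print(finished, len(sleeps))
```

Under that reading, the default budget is 50 × 60 s = 3000 s, or 50 minutes.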

feat(inference): add typed InferenceMax config loader (Phase 1)

f32ba0a

Move InferenceMax loading onto substitute_config and a typed InferenceMaxVariantConfig with legacy adapters for InferenceMaxJob until the driver is ported.

feat(inference): migrate InferenceMax configs to schema_version 1 (Phase 2)

c1c479b

Flatten MI300X and MI355X variant configs to paths/model/container/roles/params/sweep and client.* threshold specs with enforce_thresholds false until recalibrated.

test(inference): wire inferencemax_single to typed config and sweep (Phase 2)

903c071

Use variant_config and legacy adapter fixtures, parametrization from sweep.runs, and unit tests for load_variant and threshold adapters.

docs(inference): update InferenceMax config reference for schema_version 1 (Phase 2)

8975eb8

Point loader and threshold docs at inferencemax_config_loader.load_variant and the client.* sweep cell format.

docs: fix stale dtni.config_loader references (Phase 1 tail)

0269579

Point run-cvs-tests and dtni-dev-guide at cvs.lib.utils and inference/utils loaders.

feat(inference): rewrite InferenceMaxJob like VllmJob (Phase 3)

35542b9

Standalone driver uses Python-built vllm serve, vllm bench serve, and artifact parsing. Drop legacy InferenceBaseJob path and factory construction.

feat(inference): move InferenceMax server flags to roles.server.serve_args (Phase 3)

73171d6

MI300X and MI355X variants drop host-script and bench_serving params in favor of Python serve args.

test(inference): align inferencemax_single suite with VllmJob pattern (Phase 3)

c310154

Add model_fetch, test_metric, and new InferenceMaxJob lifecycle. Update conftest and unit tests for typed config.

docs(inference): update InferenceMax reference for Phase 3 driver (Phase 3)

d036290

Document Python serve, client.* metrics, and expanded lifecycle test stages.

chore(inference): remove unused inferencemax_host_scripts (Phase 5)

5ed8583

Host script staging was dropped when InferenceMaxJob moved to Python-built vllm serve.

docs: clarify vllm_benchmark_scripts are legacy-only (Phase 5)

db72ca8

InferenceMax and vllm_single build vllm serve in Python; this package remains for InferenceBaseJob paths.

docs(inference): rewrite InferenceMax reference for schema_version 1 (Phase 5)

13b8f10

Replace legacy config/benchmark_params table with typed blocks and client.* thresholds. Document inferencemax_config_loader in AGENTS.md.

test(inference): add InferenceMaxJob parse_results unit test (Phase 5)

414044c

Verify stock results artifact maps to client.* metrics via FakeOrch.

refactor(inference): rename inferencemax_single to inferencex_atom_single

9b00085

Adopt InferenceX ATOM as the framework identity while the suite is still internal. Renames the driver, config loader, pytest suite, variant configs, and documentation to inferencex_atom_single.

hnimra-amd added 13 commits June 25, 2026 13:09

feat(inferencex): add W1 metric tiers for tiered threshold gates

5b22b78

feat(inferencex): reuse server across sweep cells and config-driven waits

d37b81c

feat(inferencex): replace per-metric tests with tiered test_cell_metrics

721df2f

chore(inferencex): enable server reuse on perf configs and shorter smoke waits

c1c4f80

fix(inferencex): align cluster container names with variant configs

af8435b

fix(inferencex): tighten W1 perf health gates when enforcing thresholds

e3d11d9

test(inferencex): add GATED_METRICS parity and health gate coverage tests

c5ecfe3

test(inferencex): add GATED_METRICS parity and health gate coverage tests

e310d6e

docs(inferencex): update plan and README for flat layout and tiered gates

47f8eb0

docs(inferencex): update plan and README for flat layout and tiered gates

c334aae

chore(inferencex): trim expand_sweep docstring

d742557

docs(inferencex): clarify lab layout, launcher host, and results paths

9d4f33a

Document per-variant ~/input subdirs to avoid ambiguous threshold discovery, remote launcher vs GPU node prerequisites, and ~/cvs_results output paths.

docs(plan): prioritize multi-node as M5 after framework parity

73dfb65

Elevate scaling to P1 milestone M5 immediately after M4 parity when hardware and suite recipes support nnodes>1; defer MTP+P2 widen to M6.

hnimra-amd force-pushed the hnimrama/IX-atom branch from 544f8ad to 73dfb65 Compare June 25, 2026 20:15

hnimra-amd added 2 commits June 25, 2026 13:29

fix(inferencex): align W1 tpot tier with ATOM bench output

07c90a7

Gate p99_tpot_ms instead of absent p95_tpot_ms, skip missing tier metrics in actuals, and recalibrate MI300X perf thresholds from the 2026-06-25 lab run.

chore(inferencex): use portable W1 perf thresholds on MI300X

0663460

Replace per-node calibrated gates with conservative throughput floors and loose latency caps so healthy runs pass across lab nodes without recalibration.

atnair-amd reviewed Jun 25, 2026

amd-droy reviewed Jun 26, 2026

Comment thread cvs/lib/inference/utils/vllm_benchmark_scripts/__init__.py

Comment thread cvs/lib/inference/utils/inferencex_atom_parsing.py

Comment thread cvs/lib/inference/inference_suite_lifecycle.py Outdated

Comment thread cvs/lib/inference/inference_suite_lifecycle.py Outdated

hnimra-amd added 7 commits June 27, 2026 18:26

test(inferencex): cover parse_results errors and client log failure paths

e6f2ca9

fix(inferencex): detect ATOM server early failures during wait_ready

7e42f9f

refactor(inferencex): extract sweep reuse helpers and safer collection defaults

c79b776

feat(inferencex): add explicit threshold_json paths to variant configs

4326590

chore(inferencex): polish conftest docs and simplify CLIENT_METRICS build

0018e02

refactor(inferencex): inline atom_args and remove ix_recipe indirection

eea293e

docs(plan): sync IX atom plan with inline atom_args config layout

7bddb93

atnair-amd reviewed Jun 29, 2026

Comment thread cvs/lib/inference/utils/vllm_benchmark_scripts/vllm_serve_mi300x.sh

amd-droy approved these changes Jun 29, 2026

refactor(inference): move vllm_benchmark_scripts to inference/utils

25691bb

atnair-amd approved these changes Jun 29, 2026

hnimra-amd merged commit 85b6c72 into dev/dtni Jun 29, 2026
1 check passed

hnimra-amd merged 68 commits into dev/dtni from hnimrama/IX-atom

3 participants
