fix(metrix): fix MCP list_available_metrics and shell quoting #115
Conversation
1. `list_available_metrics` returned hardcoded metric names that don't exist in the catalog (`memory.l2_cache_hit_rate`, `compute.cu_utilization`, `compute.wave_occupancy`). Now queries `METRIC_CATALOG` dynamically and groups by category.
2. `rocprof_wrapper` used `str.split()` to tokenize the profiled command, which breaks quoted arguments like `python -c "import torch; ..."`. Now uses `shlex.split()` to preserve shell quoting semantics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
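The tokenization difference is easy to reproduce in isolation (a minimal sketch, not code from this PR):

```python
import shlex

cmd = 'python -c "import torch; print(torch.__version__)"'

# str.split() splits on whitespace even inside quotes,
# fragmenting the -c payload into broken pieces:
naive = cmd.split()

# shlex.split() applies shell quoting rules, so the quoted
# payload survives as a single argv entry:
correct = shlex.split(cmd)

print(naive)    # ['python', '-c', '"import', 'torch;', ...]
print(correct)  # ['python', '-c', 'import torch; print(torch.__version__)']
```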
Pull request overview
This PR fixes two MCP-facing issues in the Metrix profiler tooling: incorrect metric discovery from the MCP server and incorrect command tokenization when profiling commands include quoted arguments.
Changes:
- Switch `rocprofv3` command tokenization from `str.split()` to `shlex.split()` to preserve quoted arguments.
- Update MCP `list_available_metrics` to build its output dynamically and group metrics by category.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `metrix/src/metrix/profiler/rocprof_wrapper.py` | Improves robustness of profiling command argument parsing by respecting shell-style quoting. |
| `metrix/src/metrix/mcp/server.py` | Reworks metric listing to be dynamic and categorized for MCP clients. |
```diff
  # e.g. python -c "import torch; ..." won't be split inside quotes)
  prof_cmd.append("--")
- prof_cmd.extend(command.split())
+ prof_cmd.extend(shlex.split(command))
```
`shlex.split(command)` can raise `ValueError` (e.g., unmatched quotes). Right now that exception will bubble up and likely crash callers; consider catching `ValueError` here and raising a `RuntimeError` with an actionable message that includes the original command string.
```diff
- prof_cmd.extend(shlex.split(command))
+ try:
+     prof_cmd.extend(shlex.split(command))
+ except ValueError as exc:
+     raise RuntimeError(
+         f"Failed to parse command for rocprofv3: {command!r}. "
+         "Please check for unmatched quotes or other shell syntax issues."
+     ) from exc
```
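In isolation, the suggested guard behaves like this self-contained sketch (the helper name `safe_tokenize` is illustrative, not part of the PR):

```python
import shlex


def safe_tokenize(command: str) -> list[str]:
    """Tokenize a shell-style command string, converting shlex parse
    errors into a RuntimeError that carries the offending command."""
    try:
        return shlex.split(command)
    except ValueError as exc:  # e.g. "No closing quotation"
        raise RuntimeError(
            f"Failed to parse command: {command!r}. "
            "Check for unmatched quotes or other shell syntax issues."
        ) from exc
```

A well-formed command tokenizes normally, while an unmatched quote such as `safe_tokenize('echo "oops')` raises the `RuntimeError` instead of an opaque `ValueError`.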
```diff
  # Add target command (use shlex.split to preserve quoted arguments,
  # e.g. python -c "import torch; ..." won't be split inside quotes)
  prof_cmd.append("--")
- prof_cmd.extend(command.split())
+ prof_cmd.extend(shlex.split(command))
```
This change modifies how the target command is tokenized (from `str.split()` to `shlex.split()`), but there doesn't appear to be a unit test asserting correct behavior for quoted/escaped arguments. Adding a test that captures the constructed `subprocess.run` argv for a command like `python -c "print(1)"` would prevent regressions.
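A self-contained sketch of such a regression test, using a stand-in for the wrapper's launch path (`build_and_run` is a hypothetical name; the real code lives in `metrix/src/metrix/profiler/rocprof_wrapper.py`):

```python
import shlex
import subprocess
from unittest import mock


def build_and_run(command: str) -> None:
    """Stand-in for the wrapper: build the rocprofv3 argv
    and hand it to subprocess.run."""
    prof_cmd = ["rocprofv3", "--"]
    prof_cmd.extend(shlex.split(command))
    subprocess.run(prof_cmd, check=True)


def test_quoted_argument_survives_tokenization():
    # Patch subprocess.run so nothing is executed; capture the argv.
    with mock.patch("subprocess.run") as run:
        build_and_run('python -c "print(1)"')
    argv = run.call_args.args[0]
    # The quoted payload must arrive as one argv entry, not fragments.
    assert argv == ["rocprofv3", "--", "python", "-c", "print(1)"]
```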
`metrix/src/metrix/mcp/server.py` (Outdated)
```python
metrics = sorted(METRIC_CATALOG.keys())

# Group by category for better discoverability
by_category = {}
for name in metrics:
    cat = METRIC_CATALOG[name]["category"].value
```
`list_available_metrics()` is still not listing the actually supported metrics for the current backend/architecture. It returns `METRIC_CATALOG.keys()`, but some metrics exist in backend YAML (e.g. `compute.gpu_utilization` in `backends/counter_defs.yaml`) and are therefore profileable while not present in `METRIC_CATALOG`, so they will be missing from this list. Consider sourcing metrics from `Metrix().backend.get_available_metrics()` (or `Metrix().list_metrics()`) and then grouping by category using `METRIC_CATALOG` when metadata exists, with a fallback category (e.g., prefix before `.`) for metrics not in the catalog.
```diff
- metrics = sorted(METRIC_CATALOG.keys())
- # Group by category for better discoverability
- by_category = {}
- for name in metrics:
-     cat = METRIC_CATALOG[name]["category"].value
+ profiler = Metrix()
+ if hasattr(profiler, "list_metrics"):
+     metrics = sorted(profiler.list_metrics())
+ elif hasattr(profiler, "backend") and hasattr(profiler.backend, "get_available_metrics"):
+     metrics = sorted(profiler.backend.get_available_metrics())
+ else:
+     metrics = sorted(METRIC_CATALOG.keys())
+ # Group by category for better discoverability
+ by_category = {}
+ for name in metrics:
+     metric_meta = METRIC_CATALOG.get(name, {})
+     category = metric_meta.get("category")
+     if category is not None:
+         cat = getattr(category, "value", category)
+     else:
+         cat = name.split(".", 1)[0] if "." in name else "other"
```
- `test_mcp_server.py`: 6 tests verifying `list_available_metrics` returns only metrics from `METRIC_CATALOG`, includes known good names, and rejects the old hardcoded wrong names (`l2_cache_hit_rate`, `cu_utilization`, `wave_occupancy`).
- `test_rocprof_wrapper.py`: 3 tests verifying `shlex.split` preserves quoted arguments in profiled commands (double quotes, single quotes, and simple unquoted commands).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wrap `shlex.split()` in `try/except ValueError -> RuntimeError` with an actionable message including the original command string
- `list_available_metrics` now queries `Metrix().list_metrics()` for backend-aware metrics (includes YAML-defined metrics), with fallback to `METRIC_CATALOG` when no GPU is available
- Add test for unmatched quotes raising `RuntimeError`
- Skip GPU-dependent test when hipcc/ROCm is not available
- Use `>=` instead of `==` for catalog size assertion (backend may have YAML-only metrics)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The SKILL.md examples showed `metrix ./my_app` but the actual CLI requires `metrix profile ./my_app`. This caused LLM agents to hit a parse error on first attempt and waste 2 round-trips recovering via `--help`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add gfx950 (MI350, CDNA 4) to the supported devices list - Remove "Use --device <arch>" hint that pointed to a non-existent flag, which caused LLM agents to hallucinate the parameter Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add gfx1201 (RX 9070) and gfx1151 (RX 9060) which already have backends implemented. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both share the same gfx950 backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- `profile_metrics`: detailed docstring describing args, return format, and shell quoting behavior. Default to collecting all available metrics when none are specified instead of just one.
- `list_available_metrics`: list categories in docstring, describe return structure.
- SKILL.md: mention both MCP tools in the execution path section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
1. **`list_available_metrics` returned hardcoded wrong metric names**: The MCP tool returned `memory.l2_cache_hit_rate`, `compute.cu_utilization`, and `compute.wave_occupancy` — none of which exist in the actual `METRIC_CATALOG`. The real metric is `memory.l2_hit_rate`, and there are no `cu_utilization` or `wave_occupancy` metrics. Now dynamically queries `METRIC_CATALOG` and groups results by category.
2. **Shell quoting broke profiling commands with quoted arguments**: `rocprof_wrapper.py` used `str.split()` to tokenize the command string, which splits inside quotes. A command like `python -c "import torch; print(torch.cuda.is_available())"` would be split into `["python", "-c", "\"import", "torch;", ...]`, causing `SyntaxError: unterminated string literal` in the spawned Python process. Now uses `shlex.split()`, which respects shell quoting semantics.

Context
Found during a benchmark comparing 4 methods of GPU profiling tool integration (Vanilla, MCP, Skill, Native IntelliKit). The MCP config hit both bugs: `list_available_metrics` told the LLM agent to use non-existent metrics (causing 2 failed `profile_metrics` calls), and `profile_metrics` broke when the agent passed `python -c "..."` commands (causing 2 more failures). The agent had to fall back to writing a wrapper script before profiling succeeded.

Test plan
- `list_available_metrics` returns only metric names from `METRIC_CATALOG` (e.g. `memory.l2_hit_rate`, not `memory.l2_cache_hit_rate`)
- `profile_metrics` with `command="python -c 'import torch; print(1+1)'"` correctly passes the quoted argument to rocprofv3
- Simple commands (`./app`, `python script.py`) still work

🤖 Generated with Claude Code