feat(cli): add local usage metrics #2493
Conversation
📝 Walkthrough

Adds a local JSONL metrics subsystem at `src/lib/metrics.ts`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant User as "User (CLI)"
  participant CLI as "nemoclaw CLI"
  participant Metrics as "metrics module"
  participant FS as "Filesystem (~/.nemoclaw/metrics.jsonl)"
  User->>CLI: run command (e.g., onboard / <name> stats / stats --reset)
  alt lifecycle command (onboard/connect/destroy/policy_apply)
    CLI->>Metrics: recordMetricEvent(name, {status?, sandbox?, data?})
    Metrics->>FS: ensure dir, open file, append JSONL line, set permissions (0o600)
    FS-->>Metrics: write result (success/failure)
    Metrics-->>CLI: return boolean (best-effort)
  else stats command
    CLI->>Metrics: readMetricEvents(filePath)
    Metrics->>FS: read file lines
    FS-->>Metrics: file contents
    Metrics->>Metrics: parse lines, count malformed, summarize (optional sandbox filter)
    Metrics-->>CLI: formatMetricSummary(summary, filePath)
    CLI-->>User: print summary (or error on bad usage/reset failure)
  end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Pull request overview
Adds a local-only metrics system to the NemoClaw CLI to support lightweight usage observability (issue #30), including event recording to a JSONL file and new stats commands for viewing/resetting aggregates.
Changes:
- Introduce `src/lib/metrics.ts` to record, read, summarize, and format local JSONL metric events in `~/.nemoclaw/metrics.jsonl`.
- Add `nemoclaw stats`, `nemoclaw stats --reset`, and `nemoclaw <name> stats` commands plus instrumentation for onboarding, connect, destroy, and policy apply.
- Update command registry, docs, and tests to cover metrics behavior and CLI output.
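The summarize step mentioned above (aggregate counts with an optional per-sandbox filter) might look roughly like this sketch. The real `summarizeMetricEvents` signature isn't shown in this thread, so `summarize` here is a hypothetical stand-in.

```typescript
// Hypothetical event shape mirroring the fields named in this PR's summaries.
interface MetricEvent {
  time: string;
  event: string;
  status?: "success" | "failed";
  sandbox?: string;
}

// Count events per event name, optionally restricted to one sandbox.
function summarize(events: MetricEvent[], sandbox?: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    if (sandbox !== undefined && e.sandbox !== sandbox) continue;
    counts[e.event] = (counts[e.event] ?? 0) + 1;
  }
  return counts;
}
```

`nemoclaw stats` would use the unfiltered form, while `nemoclaw <name> stats` would pass the sandbox name as the filter.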
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/cli.test.ts | Adds CLI-level tests for stats, stats --reset, per-sandbox stats filtering, and onboarding lifecycle metric recording behavior. |
| src/nemoclaw.ts | Wires in metrics module, adds stats commands, and instruments onboard/connect/destroy/policy events. |
| src/lib/metrics.ts | New metrics implementation for JSONL append/read/summarize/format and reset. |
| src/lib/metrics.test.ts | Adds unit tests for metrics recording, summarization, malformed line handling, and reset. |
| src/lib/command-registry.ts | Registers global and sandbox stats commands and updates command count. |
| src/lib/command-registry.test.ts | Updates command counts/tokens expectations to include stats. |
| docs/reference/commands.md | Documents new stats commands and the local-only metrics file behavior. |
| .agents/skills/nemoclaw-user-reference/references/commands.md | Updates generated user reference docs with the new stats commands. |
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/nemoclaw.ts`:
- Around line 1656-1661: The current metrics.recordMetricEvent call for the
"sandbox_connect" event uses result.status === 0 to determine success, but that
status reflects the interactive shell's last command exit and not the connection
handshake; change to record raw exit details (e.g., include result.status and
result.signal fields) and determine success from a dedicated pre-shell handshake
or connection check instead of result.status. Update the call site where
metrics.recordMetricEvent("sandbox_connect", ...) is invoked to set status from
the handshake/connection boolean (or "unknown" if no handshake), and include
result.status and result.signal under a separate field like rawExit or exitInfo
so the connect metric isn't skewed by in-shell command failures.
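The suggestion above could be sketched like this. The shapes are hypothetical, since the real connect flow isn't shown in this thread: `result` mirrors Node's `spawnSync` return, and `handshakeSucceeded` stands in for the dedicated pre-shell connection check the comment asks for.

```typescript
// Hypothetical shapes; the real connect code in src/nemoclaw.ts is not shown here.
interface ShellResult {
  status: number | null;
  signal: string | null;
}

type MetricStatus = "success" | "failed" | "unknown";

function connectMetricPayload(handshakeSucceeded: boolean | undefined, result: ShellResult) {
  // Status comes from the handshake, not from the shell's last exit code.
  const status: MetricStatus =
    handshakeSucceeded === undefined ? "unknown" : handshakeSucceeded ? "success" : "failed";
  return {
    status,
    // Raw exit details live under a separate field so in-shell command
    // failures don't skew the connect metric.
    data: { rawExit: { status: result.status, signal: result.signal } },
  };
}
```

With this split, a user whose last in-shell command exited non-zero still produces a `sandbox_connect` success event, while the raw exit code remains available for debugging.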
- Around line 1953-1958: policies.applyPreset() can return false on failure, but
the code always logs metrics.recordMetricEvent("policy_apply", ...) with status
"success"; update the logic around applyPreset() so you capture its boolean
return and only record a "success" metric when applyPreset(...) returns true,
otherwise record a failure (or skip) metric similar to how removePreset() is
handled; use the same local variables (sandboxName, answer) and ensure the
metric's status reflects the applyPreset result before calling
metrics.recordMetricEvent.
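The control flow the comment asks for, capturing the boolean before recording, can be sketched as below. `applyPreset` and the recorder are stand-ins passed as parameters, since the real functions in `src/nemoclaw.ts` are not shown in this thread.

```typescript
// Hypothetical recorder signature modeled on the metric payloads named in this PR.
type Recorder = (event: string, payload: { sandbox: string; status: "success" | "failed" }) => void;

function applyPolicyWithMetrics(
  applyPreset: (sandbox: string, preset: string) => boolean,
  record: Recorder,
  sandboxName: string,
  preset: string,
): boolean {
  // Capture the result first, then record a status that reflects it,
  // rather than unconditionally logging "success".
  const applied = applyPreset(sandboxName, preset);
  record("policy_apply", { sandbox: sandboxName, status: applied ? "success" : "failed" });
  return applied;
}
```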
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 923c4ac4-bc62-4528-b7f1-ee48222fa556
📒 Files selected for processing (8)
- .agents/skills/nemoclaw-user-reference/references/commands.md
- docs/reference/commands.md
- src/lib/command-registry.test.ts
- src/lib/command-registry.ts
- src/lib/metrics.test.ts
- src/lib/metrics.ts
- src/nemoclaw.ts
- test/cli.test.ts
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/metrics.ts`:
- Around line 49-53: The runtime guard is too permissive: update isMetricEvent
to ensure candidate.event is one of the known event names and candidate.status
(if present) matches the allowed statuses instead of accepting any string;
specifically introduce a VALID_EVENTS set (or enum) and a VALID_STATUSES set and
check that candidate.event is in VALID_EVENTS and (when candidate.status !==
undefined) candidate.status is in VALID_STATUSES, and also validate
candidate.time is a parsable ISO timestamp (e.g., Date.parse(candidate.time) is
not NaN) before returning true from isMetricEvent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 714d7088-8a0e-4c8f-a98b-7578efa054eb
📒 Files selected for processing (2)
- src/lib/metrics.test.ts
- src/lib/metrics.ts
Force-pushed a956a2c to 4f0cba6
♻️ Duplicate comments (1)
src/lib/metrics.ts (1)
**55-59: ⚠️ Potential issue | 🟡 Minor: Tighten `isMetricEvent` so the type predicate is sound.**

On line 58, any string passes for `event`, and `status` is not validated. That allows non-`MetricEvent` payloads through as valid events.

Proposed fix:

```diff
+const METRIC_EVENT_NAMES = new Set<MetricEventName>([
+  "onboard_start",
+  "onboard_complete",
+  "sandbox_connect",
+  "sandbox_destroy",
+  "policy_apply",
+]);
+
+const METRIC_STATUSES = new Set<MetricStatus>(["success", "failed"]);
+
 function isMetricEvent(value: unknown): value is MetricEvent {
   if (typeof value !== "object" || value === null) return false;
   const candidate = value as Partial<MetricEvent>;
-  return typeof candidate.time === "string" && typeof candidate.event === "string";
+  if (typeof candidate.time !== "string") return false;
+  if (
+    typeof candidate.event !== "string" ||
+    !METRIC_EVENT_NAMES.has(candidate.event as MetricEventName)
+  ) {
+    return false;
+  }
+  if (
+    candidate.status !== undefined &&
+    (typeof candidate.status !== "string" || !METRIC_STATUSES.has(candidate.status as MetricStatus))
+  ) {
+    return false;
+  }
+  return true;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/metrics.ts` around lines 55 - 59, The type guard isMetricEvent currently only checks that time and event are strings, letting invalid payloads through and not validating status; update isMetricEvent to more strictly validate the shape: ensure candidate.time is an ISO-8601/parsable date string (e.g. Date.parse(candidate.time) is not NaN), ensure candidate.event is a non-empty string or — better — one of the canonical event names (compare against a known array/enum of allowed event names used elsewhere), and ensure candidate.status exists and is the expected type/value (e.g. typeof candidate.status === "string" and matches allowed status values or is non-empty). Use the MetricEvent symbol and any existing canonical event/status constants to perform these checks so the predicate is sound.
🧹 Nitpick comments (2)
src/nemoclaw.ts (2)
**2509-2514: Consider recording metrics on destroy failure for completeness.**

The `sandbox_destroy` metric is only recorded on success. When destroy fails (line 2479), the process exits without recording. For comprehensive metrics, you could also record failure events.

🔧 Optional improvement at line 2479:

```diff
 if (deleteResult.status !== 0 && !alreadyGone) {
   if (deleteOutput) {
     console.error(`  ${deleteOutput}`);
   }
   console.error(`  Failed to destroy sandbox '${sandboxName}'.`);
+  metrics.recordMetricEvent("sandbox_destroy", {
+    sandbox: sandboxName,
+    command: "destroy",
+    status: "failed",
+  });
   process.exit(deleteResult.status || 1);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 2509 - 2514, The sandbox_destroy metric is only emitted on success via metrics.recordMetricEvent("sandbox_destroy", { sandbox: sandboxName, command: "destroy", status: "success", data: { alreadyGone } }); — modify the error path that currently exits on destroy failure to also call metrics.recordMetricEvent("sandbox_destroy", ...) with status: "failure" (include sandboxName, command: "destroy", and data containing the error/exception and alreadyGone if available) before exiting or rethrowing so failures are recorded for completeness.
**1246-1248: Consider checking `readResult.success` for better user feedback.**

The code doesn't check `readResult.success` before summarizing. If the metrics file can't be read (permissions, corruption), users see "0 events" instead of an error message. This is acceptable for best-effort design, but you could differentiate between "no metrics yet" and "read failure" with a simple check.

🔧 Optional improvement:

```diff
 const readResult = metrics.readMetricEvents(metricsFile);
+if (!readResult.success && fs.existsSync(metricsFile)) {
+  console.error(`  Warning: Could not read metrics file: ${metricsFile}`);
+}
 const summary = metrics.summarizeMetricEvents(readResult, sandboxName);
 console.log(metrics.formatMetricSummary(summary, metricsFile));
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 1246 - 1248, The current flow calls metrics.summarizeMetricEvents and metrics.formatMetricSummary unconditionally on readResult; instead check readResult.success after calling metrics.readMetricEvents(metricsFile) and if false log or console.error a clear message including readResult.error (or reason) and avoid calling summarizeMetricEvents/formatMetricSummary; if success is true, pass readResult.data (or whatever payload) into metrics.summarizeMetricEvents(metricsFile, sandboxName) and then print the formatted summary. Ensure you reference readResult, metrics.readMetricEvents, metrics.summarizeMetricEvents, metrics.formatMetricSummary, metricsFile, and sandboxName when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 13d38137-0d17-4284-b5e2-73bdfbb09848
📒 Files selected for processing (9)
- .agents/skills/nemoclaw-user-reference/references/commands.md
- docs/reference/commands.md
- src/lib/command-registry.test.ts
- src/lib/command-registry.ts
- src/lib/metrics.test.ts
- src/lib/metrics.ts
- src/nemoclaw.ts
- test/cli.test.ts
- test/policies.test.ts
✅ Files skipped from review due to trivial changes (1)
- .agents/skills/nemoclaw-user-reference/references/commands.md
🚧 Files skipped from review as they are similar to previous changes (2)
- src/lib/metrics.test.ts
- docs/reference/commands.md
Force-pushed 4f0cba6 to b4ed855
♻️ Duplicate comments (1)
src/lib/metrics.ts (1)
**65-80: ⚠️ Potential issue | 🟡 Minor: Validate timestamp format in `isMetricEvent`.**

`time` currently accepts any string, so invalid date strings are treated as valid events and can skew `First`/`Last` summary output.

Proposed fix:

```diff
 function isMetricEvent(value: unknown): value is MetricEvent {
   if (typeof value !== "object" || value === null) return false;
   const candidate = value as Partial<MetricEvent>;
-  if (typeof candidate.time !== "string" || typeof candidate.event !== "string") {
+  if (
+    typeof candidate.time !== "string" ||
+    Number.isNaN(Date.parse(candidate.time)) ||
+    typeof candidate.event !== "string"
+  ) {
     return false;
   }
   if (!METRIC_EVENT_NAMES.has(candidate.event)) {
     return false;
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/metrics.ts` around lines 65 - 80, The type guard isMetricEvent currently accepts any string for candidate.time; update isMetricEvent (and the MetricEvent check) to validate that candidate.time is a valid timestamp (e.g., parse with Date or Date.parse and ensure resulting Date is valid / not NaN, or validate against an ISO‑8601 regex) in addition to checking it's a string, while keeping the existing checks for candidate.event against METRIC_EVENT_NAMES and candidate.status against METRIC_STATUSES; this ensures only events with valid timestamps are treated as MetricEvent.
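The core of the proposed check is that `Date.parse` returns `NaN` for strings that are not parsable dates. A minimal standalone sketch of that guard (hypothetical helper name, not the PR's code):

```typescript
// Returns true only when `time` is a string that parses to a real date.
function hasValidTime(candidate: { time?: unknown }): boolean {
  return typeof candidate.time === "string" && !Number.isNaN(Date.parse(candidate.time));
}
```

Note that `Date.parse` is lenient about non-ISO formats, so a strict ISO-8601 regex would be tighter if the file format guarantees ISO timestamps.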
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e9c9be37-f066-4104-b366-15667c0734b2
📒 Files selected for processing (9)
- .agents/skills/nemoclaw-user-reference/references/commands.md
- docs/reference/commands.md
- src/lib/command-registry.test.ts
- src/lib/command-registry.ts
- src/lib/metrics.test.ts
- src/lib/metrics.ts
- src/nemoclaw.ts
- test/cli.test.ts
- test/policies.test.ts
✅ Files skipped from review due to trivial changes (2)
- test/policies.test.ts
- .agents/skills/nemoclaw-user-reference/references/commands.md
✨ Thanks for submitting this pull request, which adds lightweight local-only usage metrics for the NemoClaw CLI. Related open issues:
Summary
Adds lightweight local-only usage metrics for the NemoClaw CLI.
The CLI now records key lifecycle events to `~/.nemoclaw/metrics.jsonl` and exposes aggregate and per-sandbox summaries through `nemoclaw stats`.

Related Issue
Closes #30.
Changes
- Adds `src/lib/metrics.ts` for best-effort JSONL event recording, parsing, resetting, and summary formatting.
- Records `onboard_start`, `onboard_complete`, `sandbox_connect`, `sandbox_destroy`, and `policy_apply` events from the CLI.
- Adds `nemoclaw stats`, `nemoclaw stats --reset`, and `nemoclaw <name> stats` dispatch paths.
- Registers the new commands in `src/lib/command-registry.ts`.

Type of Change
Verification
- `npx prek run --all-files` passes
- `npm test` passes
- `make docs` builds without warnings (doc changes only)

Additional validation run locally:

- `npm run build:cli` passes
- `npm test -- test/policies.test.ts src/lib/metrics.test.ts src/lib/command-registry.test.ts test/cli.test.ts` passes
- `npm test -- test/credentials.test.ts` passes
- `npm test -- test/runtime-shell.test.ts` passes
- `npm run typecheck:cli` passes
- `git diff --check` passes

Notes:

- `make check` passes formatting, license, generated docs/skills, shell, lint, schema, secret scan, and markdown hooks locally.
- `make check` does not complete in this local Codex session because the full CLI test hook hits unrelated local timeout/IPC issues in tests outside this PR's diff. The focused metrics/CLI tests and the affected timeout tests pass when run directly.

AI Disclosure
Signed-off-by: Ho Lim subhoya@gmail.com
Summary by CodeRabbit
New Features
- `nemoclaw <name> stats` and global `nemoclaw stats` (with `--reset`) to view local CLI metrics; the CLI now emits metrics for onboarding, sandbox connect/destroy, and policy apply.
Documentation
Tests