ATOF v0.1: Agentic Trajectory Observability Format (aligned spec)#1890
ATOF v0.1: Agentic Trajectory Observability Format (aligned spec)#1890bbednarski9 wants to merge 70 commits intoNVIDIA:developfrom
Conversation
- Delete packages/nvidia_nat_atif/src/nat/atof/attributes.py per D-15 (consolidated into flags.py in the previous commit) - No deprecation shim, alias, or warnings.warn wrapper per D-01 and D-15 (pre-release 0.x versioning authorizes the clean break) - Three stale imports in __init__.py (`from nat.atof.attributes import ScopeAttributes / LLMAttributes / ToolAttributes`) now fail at import time — that is expected and will be resolved by Plan 05 (__init__.py rewrite, Wave 3 of Phase 7) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
- Create src/nat/atof/profiles.py with three BaseModel classes - ToolProfile(tool_call_id: str | None) with ConfigDict(extra="forbid") per D-05 - LLMProfile(model_name: str | None) with ConfigDict(extra="forbid") per D-05 - CustomProfile(empty body) with ConfigDict(extra="allow") per D-06 for vendor-namespaced round-trip per spec §5.7 These classes will back the typed discriminated union on ScopeStartEvent.profile / ScopeEndEvent.profile in Plan 07-03. Closes D-05, D-06, D-08. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Collapse seven event classes (ScopeStart/End, LLMStart/End, ToolStart/End, Mark) into three classes (ScopeStart, ScopeEnd, Mark) per spec v0.1. Move subject- specific fields (model_name, tool_call_id) into a typed ``profile`` sub-schema discriminated by ``scope_type``. This is a breaking rewrite authorized by pre-release 0.x versioning (D-01 -- no shims, no aliases, no warnings.warn). - Add required ``schema_version: str = "0.1"`` on _EventBase with a ``^0\.\d+$`` forward-compat validator (spec §5.7) - Change ``timestamp: str`` -> ``timestamp: str | int`` polymorphic per spec §5.1 (no eager normalization; round-trip fidelity preserved) - Add ``.ts_micros`` computed property normalizing either form to int microseconds UTC (the universal sort key) - Rename ``attributes`` -> ``flags`` on ScopeStart/ScopeEnd; no alias. A ``@model_validator(mode="before")`` on _EventBase rejects any wire payload containing ``attributes`` with a clear migration message. - Rename helper ``_canonicalize_attributes`` -> ``_canonicalize_flags`` - Add required ``status: Literal["ok","error","cancelled"]`` on ScopeEnd with cross-field validator: status='error' requires non-null error; status='ok' forbids non-null error; status='cancelled' allows either (spec §5.6) - Add ``ErrorInfo(type, message, stack)`` model colocated with ScopeEnd - Add typed ``profile: ToolProfile | LLMProfile | CustomProfile | None`` on ScopeStart/ScopeEnd with cross-field validator rejecting profile type mismatches against scope_type (spec §4, D-05..D-08) - Drop ``AnnotatedLLMRequest``/``AnnotatedLLMResponse`` imports from codec (D-12/D-13); ``input``/``output`` are ``Any | None`` - Collapse ``Event`` discriminated union from seven arms to three Requirements satisfied: REQ-P7-01, REQ-P7-02, REQ-P7-03, REQ-P7-04, REQ-P7-06. Spec references: atof-event-format.md §2, §3.1-3.3, §4, §5.1, §5.6, §5.7. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Rewrite src/nat/atof/__init__.py to match the three-class event model: - Drop re-exports of LLMStartEvent, LLMEndEvent, ToolStartEvent, ToolEndEvent (classes removed in 07-03) and ScopeAttributes, LLMAttributes, ToolAttributes (module removed in 07-01). - Add re-exports of Flags (07-01), ToolProfile/LLMProfile/CustomProfile (07-02), and ErrorInfo (07-03). - Preserve all 9 codec re-exports per D-14 (AnnotatedLLMRequest, AnnotatedLLMResponse, CodecContentPart, GenerationParams, Message, RequestToolCall, ResponseToolCall, ToolDefinition, Usage). - Update module docstring from "7 event types" to "3 event types (ScopeStart, ScopeEnd, Mark)" with an explicit pointer to atof-event-format.md v0.1. - __all__ now exports exactly 21 symbols (was 23). Per D-01/D-15, no deprecation shim or warnings.warn wrapper is added for the seven removed symbols; callers who import them receive a plain ImportError, which is the intended behavior for pre-release 0.x. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Replace the return-in-insertion-order behavior of read_jsonl with a sort on the computed .ts_micros property (int microseconds UTC, normalized from polymorphic timestamp str | int). This prevents downstream TypeError when events from the same stream carry mixed RFC 3339 strings and epoch integers, and gives consumers a stable, spec-aligned ordering. Closes D-11 for the read path. - read_jsonl: append events then sort by .ts_micros before return - docstring updated to reference the spec-§5.1 normalization - write_jsonl unchanged Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Aligns the converter with spec v0.1 (plan 07-03's events.py collapse). Dispatch now keys on (kind, scope_type, profile.field) tuples instead of isinstance(event, LLMStartEvent|LLMEndEvent|ToolStartEvent|ToolEndEvent) — those classes no longer exist. Polymorphic timestamp ordering is handled via .ts_micros (D-11) on every comparison path. - Imports: drop LLMStart/End and ToolStart/End; add ScopeStart/End, ToolProfile, LLMProfile, ScopeType. - Main loop: four isinstance branches collapse to three scope_type-aware branches plus the unchanged MarkEvent branch. - Tool dispatch: tool_call_id now read from event.profile.tool_call_id via isinstance(event.profile, ToolProfile) guard; falls back to the name→id map on events emitted without a ToolProfile. - LLM dispatch: tool_calls extracted from event.output only; the old event.annotated_response field (D-13) is gone. - Agent/model extraction loops use the new isinstance + scope_type discriminator pattern; model_name comes from LLMProfile.model_name. - Sort key uses .ts_micros throughout (belt-and-suspenders vs io.py). - _build_invocation_info reworked to accept int microseconds directly and divide to millisecond-precision seconds, dropping _iso_to_epoch and the datetime import. - _extract_tool_calls parameter renamed llm_output (dict | None); docstring clarified — no longer references the removed annotated_response field. Smoke: a 6-event EXMP-01-shaped stream (agent/llm/tool wrapped scopes) produces a Trajectory with the right agent/model names and user/agent/ system step sources. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
- Drop imports of LLMStart/End, ToolStart/End, AnnotatedLLMResponse, ResponseToolCall; add LLMProfile + ToolProfile imports - Rewrite EXMP-01/02/03 scenarios using ScopeStartEvent/ScopeEndEvent with scope_type discriminator, flags=[], typed profile= ToolProfile(...) or LLMProfile(...), and required status="ok" on every ScopeEnd (spec §5.6) - Every emitted event carries schema_version="0.1" transparently via _EventBase default (spec §5.7) - EXMP-01 preserves spec §7 parity: same UUIDs, timestamps, tool_call_id=call_calc_001, and model_name=nvidia/nemotron-3-super-v3 - agent/function scopes use null profile (D-08); tool/llm scopes carry matching ToolProfile/LLMProfile (D-05); validators in events.py enforce this at construction time Generated streams: exmp01 (8 events), exmp02 (12 events), exmp03 (14 events) — all validate as nat.atof.Event discriminated-union instances via read_jsonl round-trip. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
- DELETE old §4 "Scope Profiles" (v0.1 typed LLM/Tool/Custom closed catalog)
- ADD §4 "Profile Contract Protocol" — seven subsections:
§4.1 Profile object structure ($schema, $version, $mode, vendor fields)
§4.2 Schema ID format (vendor/<id>.v<N>; default/* reserved)
§4.3 Inline schema mode ($schema as JSON Schema dict)
§4.4 Stream-level mode default vs per-event override
§4.5 Producer validation contract (MUST validate, MUST raise)
§4.6 Consumer validation contract (MAY validate, MUST preserve unknown,
MUST NOT drop events with unknown $schema IDs)
§4.7 Cross-event invariance (Start/End $schema and $version equality)
- ADD §5 "Stream Header Event" — four subsections:
§5.1 Wire shape (inherits _EventBase + profile_mode_default + schemas)
§5.2 Placement rules (may appear anywhere before first non-opaque profile;
multiple headers allowed; last-wins merge; no-header default opaque)
§5.3 Mode semantics (header/inline/opaque)
§5.4 Schema registration rules (complete JSON Schema docs; $id must equal
dict key; Draft 2020-12 canonical dialect)
- ADD §6 "Reference Profile Implementations" — two subsections:
§6.1 default/llm.v1 (inline JSON Schema body; Python ref DefaultLlmV1)
§6.2 default/tool.v1 (inline JSON Schema body; Python ref DefaultToolV1)
Non-default reference profiles out of scope; additionalProperties: true
on both to permit vendor extension without subclassing
Note: §7–§12 renumbering and example rewrite land in the next commit (Task 3).
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
… modes
- Renumber §5 Event Stream Semantics → §7 (+ §7.1..§7.7 internal refs);
bump schema_version references to 0.2; add StreamHeaderEvent context to
§7.2 (parent_uuid) and §7.3 (pairing); clarify §7.7 ATOF protocol vs
profile.$version distinction
- Renumber §6 What ATOF Is Not → §8; add v0.2 contract-protocol framing;
drop stale schema_version-not-present claim (ATOF has always had one)
- Replace old §7 EXMP-01 (Simple Tool Call) + §7.1 EXMP-02 (Tool Error)
with new §9 Examples containing three v0.2 examples that demonstrate
the three profile modes:
§9.1 EXMP-01 header-mode (9 events: StreamHeaderEvent + 8 calculator)
§9.2 EXMP-02 inline-mode (13 events: header + nested weather chain)
§9.3 EXMP-03 mixed-mode (15 events: header + branching search with
per-event $mode="inline" override on the summary LLM)
- Renumber §8 Design Rationale → §10; prepend v0.2 motivation ("why
contract over enumeration" + "why a 4th event kind"); adjust internal
refs from §4 Scope Profiles → §4 Profile Contract Protocol
- Renumber §9 Roadmapped Issues → §11; DELETE old item NVIDIA#3
"Common Flags placement" entirely (W9 simpler path — resolution already
implicit in §2.1 promotion; no resolved-list subsection); add new
deferred items for vendor profile registry, URL-based schemas, and a
stream-end event
- Renumber §10 Reference Implementation → §12; add three new module rows
pointing at nat.atof.profile_contract.ProfileContract,
nat.atof.profiles.DefaultLlmV1/DefaultToolV1, and
nat.atof.validation.validate_profile
- Remove all remaining `v0.1` references (§6.1, §6.2 purpose prose,
§8 bulletin, §10 rationale) per D-23 (no migration text in v0.2 spec)
- Update stale "Section N" cross-references to "§N" format used throughout
- Note the StreamHeaderEvent tag naming convention (retains "Event"
suffix to match the kind literal) in §3.4 and §5.1
Task 3 of 3 for Plan 08-01 — Spec rewrite is now complete.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
… alias (D-11) - Add jsonschema>=4.0 to packages/nvidia_nat_atif/pyproject.toml [tool.setuptools_dynamic_dependencies] (alphabetically before pydantic) - Replace ScopeType StrEnum (11 members) with documentation-only type alias 'ScopeType = str' per spec §3.1 and D-11 (open vocabulary) - Sync uv.lock to reflect new jsonschema dependency Wave 0 infrastructure root: Tasks 2-3 and downstream plans depend on both the jsonschema library being installed and ScopeType being a plain str alias (no StrEnum shim). Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
… with caching - Add nat.atof.validation module (49 lines) - validate_profile(payload, schema) raises jsonschema.ValidationError on validation failure (spec §4, D-15) - Draft202012Validator instances cached per schema $id; unbounded cache growth acceptable per RESEARCH.md Open Question 3 (author-controlled, small vocabulary) - Module-level function callable for D-15 pluggable swapping - No __all__ — package __init__.py will re-export in Plan 08-04 Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…anchor)
- Add nat.atof.profile_contract.ProfileContract Pydantic base class
- REQUIRED $schema (schema_id: str | dict) and $version (version: str)
aliases with NO defaults — anchor for D-22 natural rejection of v0.1
profile payloads (which lack $schema)
- OPTIONAL $mode alias (Literal['header','inline','opaque'] | None)
- ConfigDict(populate_by_name=True, extra='allow') enables both Pythonic
construction and D-16 vendor-field preservation
- JSON_SCHEMA: ClassVar[dict] = {} — subclasses override in Plan 08-03
- _run_producer_validation model_validator(mode='after') calls
nat.atof.validation.validate_profile; catches Exception (not bare
jsonschema.ValidationError) for D-15 library swapping; re-raises as
ValueError so Pydantic wraps it into ValidationError
- _resolve_schema: inline dict → use directly; string ID + non-empty
subclass JSON_SCHEMA → class-level; otherwise None (opaque passthrough)
- exclude={'schema_id','version','mode'} uses Python field names (NOT
aliases) per Pydantic v2 — commented above the call (B4 fix)
See ATOF spec §4 (Profile Contract Protocol), §6 (Reference Profiles).
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…tract, scope_type str)
- Add StreamHeaderEvent as 4th event kind (spec §5) with schemas dict validator
- Retype ScopeStart/ScopeEnd `profile` as `ProfileContract | None` base (D-22 anchor — NOT subclass union; v0.1 payloads reject naturally because ProfileContract requires $schema alias)
- Retype ScopeStart/ScopeEnd `scope_type` as `str` with min_length=1 (D-10, D-11)
- Extend Event discriminated union to 4 arms with Tag("StreamHeaderEvent")
- Delete v0.1 closed-enum artifacts: _NULL_PROFILE_SCOPE_TYPES, _check_profile_scope_type, two _validate_profile_matches_scope_type wrappers, ScopeType enum import, LLMProfile/ToolProfile/CustomProfile imports
- Preserve _validate_status_error_coherence with v0.1 3-branch semantics (B1: cancelled allows either form; error requires error; ok forbids error)
- Bump _EventBase.schema_version default to "0.2" (D-20); existing 0.Y regex unchanged
- Type schemas field as dict[str, dict[str, Any]] for pyright/ty (W5)
- Keep _reject_attributes_alias, _validate_schema_version, _canonicalize_flags, ts_micros, ErrorInfo, MarkEvent intact
- Note: nat.atof.__init__ still imports old profile classes — intentional transient state; fixed in plan 08-04
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…rence impls
- Replace v0.1 ToolProfile/LLMProfile/CustomProfile with two reference profile implementations per spec §6
- DefaultLlmV1: default/llm.v1 schema, optional model_name field (preserves v0.1 LLM model_name semantics)
- DefaultToolV1: default/tool.v1 schema, optional tool_call_id field (preserves v0.1 Tool tool_call_id semantics)
- Each class declares JSON_SCHEMA: ClassVar[dict] (Draft 2020-12) inlined per spec §6.1 / §6.2
- schema_id typed as `str | dict` on both subclasses (B2 fix: union preserved for inline-mode construction via model_validate({"$schema": <dict>, ...})) — default value is a string
- additionalProperties: True as bare Python literal (B3: no conditional expression)
- extra="allow" inherited from ProfileContract base (W8 explicit discretion override of RESEARCH.md Open Question 1 — forward-compat for vendors that subclass to add provider/usage/etc. without losing those fields on round-trip)
- D-21 pure break: v0.1 ToolProfile/LLMProfile/CustomProfile are no longer importable
- Rule 3 auto-fix: update nat/atof/__init__.py to stop importing deleted v0.1 symbols (LLMProfile, ToolProfile, CustomProfile, ScopeType enum) — package imports otherwise fail; plan 08-04 will expand __init__.py with full v0.2 public API including StreamHeaderEvent, ProfileContract, DefaultLlmV1, DefaultToolV1
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…_as_any Two fixes to io.py.write_jsonl for ATOF v0.2 wire-format compliance: 1. by_alias=True — emits `$schema`/`$version`/`$mode` keys (not Pythonic `schema_id`/`version`/`mode`). Required for v0.2 spec §4 conformance; without it, wire output is silently incompatible with v0.2 consumers. 2. serialize_as_any=True — preserves subclass vendor fields on dump. `ScopeStartEvent.profile` / `ScopeEndEvent.profile` are typed as `ProfileContract | None` (base class, per D-22 natural-rejection anchor). Pydantic v2's default "by attribute typing" serialization drops subclass-only fields (DefaultLlmV1.model_name, DefaultToolV1.tool_call_id) from the output. Duck-typing mode dumps the runtime-instance fields. read_jsonl needs no changes — `TypeAdapter(Event)` transparently picks up the 4-arm union (StreamHeaderEvent included) from events.py plan 03. Verified: 4-event round-trip preserves `$schema` wire keys and vendor fields surface in profile.model_extra post-read (base-class parse). Plan: 08-04 Task 1 Phase: 08-atof-protocol-refactor-v02 Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Replaces the v0.1 isinstance(ToolProfile/LLMProfile)-based dispatch with
the v0.2 ($schema string + scope_type string) dispatch:
- Import StreamHeaderEvent, ProfileContract, DefaultLlmV1, DefaultToolV1.
Drop LLMProfile / ToolProfile / ScopeType imports (v0.1 symbols gone).
- Add `_schema_id_string(schema_id)` helper — normalizes `$schema` from
either string-ID form (`"default/tool.v1"`) or inline-schema dict form
(`{"$id": "default/tool.v1", ...}`) to a canonical string (spec §4.1).
- Add StreamHeaderEvent schema-registry pre-pass in `_events_to_step_dicts`:
builds `schema_registry` (D-08 later-wins) and `effective_mode` (D-09
"opaque" default when no header present). Reserved for future consumer-
side validation; not consumed by this commit.
- Main dispatch loop skips StreamHeaderEvent (`continue`); otherwise
dispatches on `event.scope_type == "llm"` / `"tool"` string literals
(spec §3.1, D-10) — no enum access.
- Vendor-field extraction uses `event.profile.model_dump(by_alias=True)
.get("tool_call_id" | "model_name")` at both call sites. Wire-
deserialized profiles are base `ProfileContract` instances (D-22
natural-rejection anchor); vendor fields live in `model_extra`, so
attribute access returns `None` silently (Pitfall 7). `model_dump`
surfaces them uniformly for both base-class and typed-subclass
instances.
- Reference-profile dispatch: only extract typed vendor fields when
`$schema` matches `default/llm.v1` / `default/tool.v1`; any other
schema falls through to the name-based fallback (preserves v0.1
`last_tool_call_map` lookup and `event.name` default for model_name).
- `convert()` agent-name scan updated: `scope_type == "agent"` string
literal replaces the removed `ScopeType.AGENT` enum.
Smoke verified: (1) constructed v0.2 event sequence (4 default-profile
events through `convert()`) produces a valid ATIF Trajectory with
`agent.name == "my_agent"` and `agent.model_name == "nvidia/nemotron-3-
super"`; (2) the same events written via `write_jsonl` then read via
`read_jsonl` → `convert()` produces the same trajectory, confirming the
wire-deserialized base `ProfileContract` vendor-field extraction path
works (Pitfall 7 fix).
Preserves all Phase 7 accumulator logic verbatim: `_build_ancestry`,
`_extract_tool_calls`, `_unwrap_llm_messages`, `_tool_call_order_key`,
`flush_observations`, `finalize_agent_extra`, `_build_invocation_info`,
`convert_file`, timestamp sort-by-ts_micros, status/error handling.
Plan: 08-04 Task 2
Phase: 08-atof-protocol-refactor-v02
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Supersedes the transient minimal __init__.py from plan 08-03 (which dropped v0.1 symbols to unblock package imports during rewrite). The full v0.2 surface is now re-exported from `nat.atof`: Added exports: - `StreamHeaderEvent` (4th event kind, spec §5) - `ProfileContract` (base-class typing anchor, D-22) - `DefaultLlmV1`, `DefaultToolV1` (reference profiles, spec §6) - `validate_profile` (JSON Schema Draft 2020-12 helper, D-15) Preserved exports: - `ScopeStartEvent`, `ScopeEndEvent`, `MarkEvent`, `Event`, `ErrorInfo` - `ScopeType` (now `str` alias, D-11; documentation-only name) - `Flags` - codec re-exports (AnnotatedLLMRequest/Response, CodecContentPart, GenerationParams, Message, RequestToolCall, ResponseToolCall, ToolDefinition, Usage — preserved per Phase 7 D-13) - `read_jsonl`, `write_jsonl` Removed (D-21 pure break, no shims): - `ToolProfile`, `LLMProfile`, `CustomProfile` — `from nat.atof import ToolProfile` (etc.) now raises `ImportError`. `__all__` has exactly 23 symbols alphabetically sorted. Module docstring rewritten to reflect the v0.2 profile-contract model (four event types, ProfileContract base + reference impls) and point readers at atof-event-format.md §§4–6. Verified: full 23-symbol import works; all three v0.1 profile names raise ImportError on `from nat.atof import`; `ScopeType is str`; `DefaultLlmV1(model_name="gpt-4o")` constructs; `StreamHeaderEvent` constructible with `profile_mode_default="header"`. Plan: 08-04 Task 3 Phase: 08-atof-protocol-refactor-v02 Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Regenerates EXMP-01/02/03 as v0.2 streams demonstrating the three profile declaration modes introduced by StreamHeaderEvent (spec §5): - EXMP-01 (9 events, header mode): StreamHeaderEvent declares both default/llm.v1 and default/tool.v1 schemas; profiles reference by string $schema ID. - EXMP-02 (13 events, inline mode): StreamHeaderEvent advertises profile_mode_default="inline" with empty schemas registry; each profile carries full JSON Schema body inline via DefaultLlmV1/DefaultToolV1.model_validate with $schema=<dict>. - EXMP-03 (15 events, mixed mode): StreamHeaderEvent declares header default with both schemas; final LLM event overrides via $mode="inline" + inline $schema dict to demonstrate per-event override (e.g., staging new schema version). Imports migrated from v0.1 (LLMProfile/ToolProfile) to v0.2 public API (DefaultLlmV1/DefaultToolV1/StreamHeaderEvent). Workflows preserved structurally (calculator / weather chain / branching search-and-summarize) per D-24; ATIF step counts unchanged (5 each). Filenames aligned to spec §9 convention: exmpNN_atof.jsonl. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Aligns convert_to_atif.py with the v0.2 example filenames produced by generate_examples.py (exmpNN_atof.jsonl) and the symmetric ATIF output convention (exmpNN_atif.json). No structural changes to the runner — the v0.2 converter (plan 08-04) already handles the full 4-kind event model including StreamHeaderEvent transparently. Audit confirms no v0.1 symbols (LLMProfile/ToolProfile/CustomProfile/ ScopeType.*) in the runner. End-to-end smoke passes: three ATIFTrajectory JSONs produced, each with 5 steps and correct agent name + model_name (nvidia/nemotron-3-super-v3). Observation counts match Phase 7 baseline (EXMP-01=1, EXMP-02=2, EXMP-03=3 observations aggregated) — ATIF step/observation shape unchanged from v0.1 per D-25. Also checks in the three regenerated v0.2 JSONL streams and their corresponding ATIF JSON outputs as reference data: - output/exmp01_atof.jsonl + exmp01_atif.json (header mode) - output/exmp02_atof.jsonl + exmp02_atif.json (inline mode) - output/exmp03_atof.jsonl + exmp03_atif.json (mixed mode) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Documents the three profile declaration modes (header, inline, mixed) introduced by StreamHeaderEvent in spec v0.2 §5, mapping each to its canonical example scenario (EXMP-01/02/03). Includes: - Short scripts overview (generate_examples.py, convert_to_atif.py) - One paragraph per mode with "When to use" guidance - Running commands in a fenced bash block - Event/step count table - Cross-references to atof-event-format.md §4-6 spec sections and to the reference profile implementations in profiles.py Per RESEARCH.md Claude's Discretion recommendation (README useful for newcomers) and W7 (no emoji-detection; rely on reviewer discipline). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…onError Replace bare ``except Exception`` in ``ProfileContract._run_producer_validation`` with a specific ``jsonschema.exceptions.ValidationError`` catch. This aligns with NAT's error-handling convention (specific exception types, not bare ``Exception``) and matches the documented contract in ``validate_profile``'s docstring. Programming errors (TypeError from malformed inline schema, AttributeError from a misconfigured swapped validator, etc.) now propagate unchanged instead of being masked as "Profile failed \$schema validation". Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Drop the unused ``from typing import TYPE_CHECKING`` import and the empty ``if TYPE_CHECKING: pass`` block. Leftover scaffolding with no forward references gated behind it; removing makes the module's intent clearer. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Add a "Known limitation" section to the converter module docstring and an inline comment at the map-build site describing when the name-keyed ``last_tool_call_map`` fallback can collide (two tool calls with the same ``function_name`` within a single LLM turn, on opaque/vendor-profile streams that lack explicit ``tool_call_id``). The primary v0.2 reference path (``default/tool.v1`` profile with explicit ``tool_call_id``) reads the id directly before consulting the fallback map, so reference streams are unaffected. Documents the constraint for consumers of opaque-profile streams and leaves a pointer to a future FIFO-queue refactor. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Mirror ``ProfileContract``'s ``ConfigDict(populate_by_name=True, extra="allow")`` on ``_EventBase`` and ``ErrorInfo``. None of today's event-class fields declare aliases, so this is inert at current call sites — but any future ``$``-prefixed wire meta-field added at the event level will otherwise silently reject Python-name construction. Pre-emptively widens accepted input; documents the intent with an inline comment pointing to ``ProfileContract`` parity. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…t column Both flags described the common case (most scopes are stateless/in-process), violating the "absence is the default" rule in §2.1. Flag names now describe the exceptional runtime property; absence means the documented default applies. Spec §2.1 table gains a 'Default (absent)' column to remove ambiguity for implementers. Pre-release breaking change — no external consumers yet. Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…der' The wire kind literal was the only event-kind discriminator that retained the 'Event' suffix, breaking convention with ScopeStart/ScopeEnd/Mark. Phase 8-RESEARCH Pitfall 3 documented the inconsistency but kept it for verbatim alignment with the Python class name; that justification was circular — change the literal, change the Tag value, problem gone. Renames the wire literal only. Python class name 'StreamHeaderEvent' is unchanged. Pre-release breaking change — no external consumers yet. Changes: - events.py: kind literal + Tag value updated; drop apologetic comment - atof-event-format.md: §1.1 list, §3.4 table, §5 prose, EXMP-01/02/03 JSONL examples updated; drop apologetic note - output/exmp0[123]_atof.jsonl: regenerated with 'StreamHeader' kind Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…h case) Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
… ISO 8601) Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Closes the dangling references in atof-event-format.md §2, §7.1, §8, §10 to a doc that didn't exist. Documents the canonical scope_type → ATIF source mapping, the accumulator state machine, ID/timing mappings, ancestry reconstruction, StreamHeaderEvent handling, known limitations (WR-03, CR-01, HI-01), and the public API. Resolves the open question 'how am I supposed to identify user/agent/ system source fields when converting from ATOF to ATIF.' Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
…ontract - Bump front-matter Version to 0.2, Date to 2026-04-14, Status: Active retained - Reframe §1 Overview as a profile-contract protocol (not a scope-type catalog) - Add 4th event kind StreamHeaderEvent to §1 kind list and §3 event inventory - §2 retains envelope fields; promote `flags` to new §2.1 (cross-cutting, not profile-nested) (resolves Phase 7 §9 backlog item on Common Flags placement) - §2 documents schema_version as ATOF protocol version; clarifies separation from profile.$version - §3.1/§3.2 change scope_type from closed enum to open string (non-empty, vendor-extensible) with conventional values listed as non-normative - §3.1/§3.2 retype `profile` field as `Profile (§4) or null` (profile-contract protocol) - §3.2 adds cross-event invariance note pointing to §4.7 - §3.3 MarkEvent gains `kind` row for explicit discriminator documentation - §3.4 StreamHeaderEvent (NEW) — full field table with profile_mode_default and schemas Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
… (Option C)
Complete rewrite of atof-event-format.md + new atof-codec-profiles.md
companion doc per Phase 9 design decisions (Q11 = Option C).
Core design shifts from v0.2:
- 7 event kinds (LlmStart/End + ToolStart/End as first-class, alongside
ScopeStart/End + Mark) — matches NeMo-Flow's actual emitted shape
- Drop profile-contract design entirely (no $schema/$version/$mode,
no StreamHeaderEvent, no §4/§5/§6 of v0.2)
- Typed sidecar fields per event kind (model_name, tool_call_id,
scope_type, subtype) replace the generic profile field
- Closed scope_type enum with 'custom' + 'subtype' escape hatch and
'unknown' for tier-1 pass-through
- Required 'status' on all *EndEvents (ok | error | cancelled) closes
the v0.2 replay-completeness gap
- 'schema_version' required on every event for forward-compat dispatch
- Optional 'codec' field + 'annotated_request'/'annotated_response' as
the tier-3 validation layer (companion doc)
- Three enrichment tiers documented: raw pass-through, semantic-tagged,
codec-annotated
- attributes as lowercase string arrays (Q14 JSONL round-trip robust)
- Attribute vocabulary reverts to runtime words (stateless, local)
- Mark events stay point-in-time only (Q15); status lives on EndEvents
Companion doc atof-codec-profiles.md defines:
- Codec identifier wire format ({name, version})
- Canonical codec registry (default/passthrough, default/opaque,
openai/chat-completions, anthropic/messages, nvidia/llm, nvidia/tools)
- Canonical structured shapes (AnnotatedLlmRequest, AnnotatedLlmResponse,
AnnotatedToolInvocation)
- Repurposes the Phase-8-drafted nvidia/llm.v1 + nvidia/tools.v1 JSON
Schemas as codec output schemas (no orphaning)
- Three-tier producer guidance with examples
Appendix A documents the v0.2 → v0.3 migration table (pure break,
no transitional shim, regenerate examples per Q16).
No source code changes yet — plan 09-02 will refactor events.py /
io.py / converter / __init__.py to match this spec.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a new nat.atof package implementing the ATOF JSONL event format (models, flags, category), JSONL I/O, an ATOF→ATIF converter with example generator/converter scripts, comprehensive tests, ATIF model extensions (v1.7), package metadata update, and a small Vale vocabulary update. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User/Script
participant Reader as ATOF JSONL Reader\n(read_jsonl)
participant Validator as Event Validator\n(TypeAdapter(Event))
participant Converter as ATOF→ATIF Converter\n(convert)
participant Writer as ATIF Writer\n(convert_file / write JSON)
rect rgba(200,200,255,0.5)
User->>Reader: provide input JSONL path
end
Reader->>Validator: parse lines, validate events
Validator-->>Reader: typed Event objects
Reader->>Converter: sorted list[Event]
Converter->>Converter: build ancestry, buffer tool events,\nflush on LLM scope-end, map to step dicts, handle subagents
Converter-->>Writer: Trajectory
Writer->>User: write output JSON (optional) and return Trajectory
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (6)
packages/nvidia_nat_atif/examples/atof_to_atif/generate_examples.py (1)
3-7: Docstring is slightly misleading about StreamHeader.The module docstring states "Each scenario opens with a
StreamHeaderEvent" but EXMP-01 (tier-1) deliberately omits the header. The detailed EXMP-01 description in the docstring correctly notes this, but the opening statement is inaccurate.📝 Suggested fix
-Each scenario opens with a ``StreamHeaderEvent`` (always at position 0 per -spec §3.4). The three main scenarios walk through tiers 1 → 2 → 3 in order; +Tier-2 and tier-3 scenarios open with a ``StreamHeaderEvent`` (always at +position 0 per spec §3.4 when present). The three main scenarios walk through +tiers 1 → 2 → 3 in order;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/examples/atof_to_atif/generate_examples.py` around lines 3 - 7, Update the module docstring's opening statement to accurately reflect that most scenarios open with a StreamHeaderEvent but that EXMP-01 (tier-1) intentionally omits it; locate the top-of-file docstring in generate_examples.py and modify the sentence "Each scenario opens with a ``StreamHeaderEvent`` (always at position 0 per spec §3.4)." to mention the exception for EXMP-01 (or state "most scenarios") while leaving the detailed EXMP-01 description unchanged.packages/nvidia_nat_atif/examples/atof_to_atif/README.md (3)
31-34: Avoid possessive 's with inanimate objects.Per coding guidelines, documentation should not use possessive 's with inanimate objects. Consider rephrasing "the reference converter's generic
ScopeEndfall-through" to "the genericScopeEndfall-through of the reference converter".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/examples/atof_to_atif/README.md` around lines 31 - 34, Change the possessive phrasing in the README sentence that currently reads "the reference converter's generic `ScopeEnd` fall-through" to use "of" instead (e.g., "the generic `ScopeEnd` fall-through of the reference converter"); update the same sentence that mentions `Trajectory.agent.name` and `scope_type: "agent"` if needed to keep grammar consistent with the new phrasing.
248-250: Spell out "NeMo Agent Toolkit" instead of using "NAT" acronym.Per coding guidelines, documentation should spell out "NeMo Agent Toolkit" rather than using "NAT" as an acronym. The exception is code identifiers like
nat.atifwhich should be in backticks.📝 Suggested fix
-Both return a Pydantic-validated `nat.atif.trajectory.Trajectory` (the -NAT-side ATIF model that mirrors Harbor's `0001-trajectory-format.md` RFC +Both return a Pydantic-validated `nat.atif.trajectory.Trajectory` (the +NeMo Agent Toolkit ATIF model that mirrors Harbor's `0001-trajectory-format.md` RFC🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/examples/atof_to_atif/README.md` around lines 248 - 250, Replace the acronym "NAT" with the full name "NeMo Agent Toolkit" in the README sentence while preserving code identifiers; e.g., change "the NAT-side ATIF model" to "the NeMo Agent Toolkit-side ATIF model" (keep `nat.atif.trajectory.Trajectory` and `nat.atif` in backticks) and ensure references to Harbor's RFC `0001-trajectory-format.md` v1.7 remain unchanged.
1-259: Missing SPDX license header.Markdown files in this repository should include the SPDX Apache-2.0 header at the top. Add the standard license comment block.
📝 Suggested fix
Add at the beginning of the file:
<!-- SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. --> <!-- SPDX-License-Identifier: Apache-2.0 --> # ATOF-to-ATIF Examples ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/examples/atof_to_atif/README.md` around lines 1 - 259, Add the missing SPDX Apache-2.0 header to the top of the README: insert the two HTML comment lines shown in the review (SPDX-FileCopyrightText and SPDX-License-Identifier) immediately before the existing first heading "# ATOF-to-ATIF Examples" so the file begins with the license block followed by the current content.packages/nvidia_nat_atif/atof-schema-profiles.md (1)
1-335: Missing SPDX license header.Per coding guidelines, all source files should include the SPDX Apache-2.0 header.
📝 Suggested fix
Add at the beginning of the file:
<!-- SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. --> <!-- SPDX-License-Identifier: Apache-2.0 --> # ATOF Schema Profiles ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/atof-schema-profiles.md` around lines 1 - 335, Add the SPDX Apache-2.0 license header as the very first lines of the document before the "# ATOF Schema Profiles" heading: insert the two HTML comment lines for copyright and SPDX identifier (SPDX-FileCopyrightText and SPDX-License-Identifier) at the top of packages/nvidia_nat_atif/atof-schema-profiles.md so the file begins with the SPDX header immediately followed by the existing "# ATOF Schema Profiles" content.packages/nvidia_nat_atif/atof-event-format.md (1)
1-316: Missing SPDX license header.Per coding guidelines, all source files should include the SPDX Apache-2.0 header. Add the standard license comment block at the top of this specification document.
📝 Suggested fix
Add at the beginning of the file:
<!-- SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. --> <!-- SPDX-License-Identifier: Apache-2.0 --> # Agentic Trajectory Observability Format (ATOF) Specification — Core ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/atof-event-format.md` around lines 1 - 316, Insert the standard SPDX Apache-2.0 license header as two HTML comment lines at the very top of the document, immediately above the existing top-level heading "# Agentic Trajectory Observability Format (ATOF) Specification — Core"; ensure the comments include the SPDX-FileCopyrightText line with NVIDIA's copyright and the SPDX-License-Identifier: Apache-2.0 identifier so the file starts with the required license block before any content.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/nvidia_nat_atif/examples/atof_to_atif/convert_to_atif.py`:
- Line 39: Update the CLI description string passed to argparse.ArgumentParser
to reference the correct ATOF version (v0.1) instead of "ATOF v0.2"; locate the
parser initialization line `parser = argparse.ArgumentParser(description="...")`
in convert_to_atif.py and change the description to the accurate text (e.g.,
"Convert ATOF v0.1 JSONL to ATIF JSON") so docs and CLI remain consistent.
In `@packages/nvidia_nat_atif/src/nat/atof/__init__.py`:
- Around line 15-16: The module-level docstring in the atof package contains a
stray '+' before "4-priority schema resolution protocol (§7.1)"—remove that
extraneous '+' from the module docstring (in the top-of-file string in
__init__.py of the atof module) so the sentence reads correctly without the edit
artifact.
In `@packages/nvidia_nat_atif/src/nat/atof/io.py`:
- Around line 31-38: The reader currently uses path.read_text().splitlines()
which buffers the whole file; change the read loop to stream lines from the file
(with path.open('r', encoding='utf-8') and for line in f:) and call
_event_adapter.validate_python(raw) per-line to build events without reading the
entire file into memory, then sort and return as before; likewise replace any
path.write_text(...) bulk-write (lines 49-54) with an incremental writer using
path.open('w', encoding='utf-8') and write each serialized event plus a newline
(e.g., json.dumps(...) + "\n") to avoid buffering the full output in memory.
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/__init__.py`:
- Around line 1-14: Add a Google-style module docstring at the top of this
package's __init__.py (module nat.atof.scripts): insert a triple-quoted
docstring immediately after the file header that gives a one-line summary, a
short description of the module’s purpose, and any public exports/attributes
(e.g., listed under "Attributes" or "Exports" if applicable). Ensure the
docstring follows Google style (short summary line, blank line, optional longer
description and section headers) and is the first statement in the module so
tooling and linters recognize it.
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 94-102: The code currently accesses tc["id"] directly in both
branches which can raise KeyError for malformed tool calls; update the
extraction to use inner.get("id") and tc.get("id") with a sensible fallback
(e.g., generate a synthetic id using uuid4() or an incremental placeholder) and
assign that to tool_id so downstream code always has a stable identifier; modify
the block that sets tool_id (and any other places using tc["id"]) to use
.get(...) and fallback generation to avoid crashes.
---
Nitpick comments:
In `@packages/nvidia_nat_atif/atof-event-format.md`:
- Around line 1-316: Insert the standard SPDX Apache-2.0 license header as two
HTML comment lines at the very top of the document, immediately above the
existing top-level heading "# Agentic Trajectory Observability Format (ATOF)
Specification — Core"; ensure the comments include the SPDX-FileCopyrightText
line with NVIDIA's copyright and the SPDX-License-Identifier: Apache-2.0
identifier so the file starts with the required license block before any
content.
In `@packages/nvidia_nat_atif/atof-schema-profiles.md`:
- Around line 1-335: Add the SPDX Apache-2.0 license header as the very first
lines of the document before the "# ATOF Schema Profiles" heading: insert the
two HTML comment lines for copyright and SPDX identifier (SPDX-FileCopyrightText
and SPDX-License-Identifier) at the top of
packages/nvidia_nat_atif/atof-schema-profiles.md so the file begins with the
SPDX header immediately followed by the existing "# ATOF Schema Profiles"
content.
In `@packages/nvidia_nat_atif/examples/atof_to_atif/generate_examples.py`:
- Around line 3-7: Update the module docstring's opening statement to accurately
reflect that most scenarios open with a StreamHeaderEvent but that EXMP-01
(tier-1) intentionally omits it; locate the top-of-file docstring in
generate_examples.py and modify the sentence "Each scenario opens with a
``StreamHeaderEvent`` (always at position 0 per spec §3.4)." to mention the
exception for EXMP-01 (or state "most scenarios") while leaving the detailed
EXMP-01 description unchanged.
In `@packages/nvidia_nat_atif/examples/atof_to_atif/README.md`:
- Around line 31-34: Change the possessive phrasing in the README sentence that
currently reads "the reference converter's generic `ScopeEnd` fall-through" to
use "of" instead (e.g., "the generic `ScopeEnd` fall-through of the reference
converter"); update the same sentence that mentions `Trajectory.agent.name` and
`scope_type: "agent"` if needed to keep grammar consistent with the new
phrasing.
- Around line 248-250: Replace the acronym "NAT" with the full name "NeMo Agent
Toolkit" in the README sentence while preserving code identifiers; e.g., change
"the NAT-side ATIF model" to "the NeMo Agent Toolkit-side ATIF model" (keep
`nat.atif.trajectory.Trajectory` and `nat.atif` in backticks) and ensure
references to Harbor's RFC `0001-trajectory-format.md` v1.7 remain unchanged.
- Around line 1-259: Add the missing SPDX Apache-2.0 header to the top of the
README: insert the two HTML comment lines shown in the review
(SPDX-FileCopyrightText and SPDX-License-Identifier) immediately before the
existing first heading "# ATOF-to-ATIF Examples" so the file begins with the
license block followed by the current content.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 121abfc4-7263-4a53-85c3-5d366e90b537
⛔ Files ignored due to path filters (2)
packages/nvidia_nat_atif/uv.lockis excluded by!**/*.lockuv.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
ci/vale/styles/config/vocabularies/nat/accept.txtpackages/nvidia_nat_atif/atof-event-format.mdpackages/nvidia_nat_atif/atof-schema-profiles.mdpackages/nvidia_nat_atif/examples/atof_to_atif/README.mdpackages/nvidia_nat_atif/examples/atof_to_atif/convert_to_atif.pypackages/nvidia_nat_atif/examples/atof_to_atif/generate_examples.pypackages/nvidia_nat_atif/pyproject.tomlpackages/nvidia_nat_atif/src/nat/atof/__init__.pypackages/nvidia_nat_atif/src/nat/atof/annotations.pypackages/nvidia_nat_atif/src/nat/atof/events.pypackages/nvidia_nat_atif/src/nat/atof/flags.pypackages/nvidia_nat_atif/src/nat/atof/io.pypackages/nvidia_nat_atif/src/nat/atof/scope_type.pypackages/nvidia_nat_atif/src/nat/atof/scripts/__init__.pypackages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py
|
|
||
|
|
||
| def main() -> None: | ||
| parser = argparse.ArgumentParser(description="Convert ATOF v0.2 JSONL to ATIF JSON") |
There was a problem hiding this comment.
CLI description still references the wrong version.
Line 39 says ATOF v0.2, but this script and module docs are for v0.1.
Proposed fix
- parser = argparse.ArgumentParser(description="Convert ATOF v0.2 JSONL to ATIF JSON")
+ parser = argparse.ArgumentParser(description="Convert ATOF v0.1 JSONL to ATIF JSON")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| parser = argparse.ArgumentParser(description="Convert ATOF v0.2 JSONL to ATIF JSON") | |
| parser = argparse.ArgumentParser(description="Convert ATOF v0.1 JSONL to ATIF JSON") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/examples/atof_to_atif/convert_to_atif.py` at line
39, Update the CLI description string passed to argparse.ArgumentParser to
reference the correct ATOF version (v0.1) instead of "ATOF v0.2"; locate the
parser initialization line `parser = argparse.ArgumentParser(description="...")`
in convert_to_atif.py and change the description to the accurate text (e.g.,
"Convert ATOF v0.1 JSONL to ATIF JSON") so docs and CLI remain consistent.
| for line in path.read_text().splitlines(): | ||
| line = line.strip() | ||
| if not line: | ||
| continue | ||
| raw = json.loads(line) | ||
| events.append(_event_adapter.validate_python(raw)) | ||
| events.sort(key=lambda e: e.ts_micros) | ||
| return events |
There was a problem hiding this comment.
Avoid full-file buffering in JSONL I/O paths.
Line 31 and Line 49 currently load/build the entire stream in memory. For large traces, this creates avoidable memory pressure and slower processing.
Proposed fix
def read_jsonl(path: str | Path) -> list[Event]:
@@
- for line in path.read_text().splitlines():
- line = line.strip()
+ with path.open("r", encoding="utf-8") as f:
+ for line in f:
+ line = line.strip()
+ if not line:
+ continue
+ raw = json.loads(line)
+ events.append(_event_adapter.validate_python(raw))
- if not line:
- continue
- raw = json.loads(line)
- events.append(_event_adapter.validate_python(raw))
events.sort(key=lambda e: e.ts_micros)
return events
@@
def write_jsonl(events: list[Event], path: str | Path) -> None:
@@
- lines = []
- for event in events:
- # Exclude the computed ``ts_micros`` field from wire output — it's an
- # in-memory sorting convenience, not part of the wire envelope (spec §2).
- lines.append(json.dumps(event.model_dump(exclude_none=True, exclude={"ts_micros"}, mode="json", by_alias=True)))
- path.write_text("\n".join(lines) + "\n")
+ with path.open("w", encoding="utf-8") as f:
+ for event in events:
+ # Exclude the computed ``ts_micros`` field from wire output — it's an
+ # in-memory sorting convenience, not part of the wire envelope (spec §2).
+ f.write(json.dumps(event.model_dump(exclude_none=True, exclude={"ts_micros"}, mode="json", by_alias=True)))
+ f.write("\n")Also applies to: 49-54
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/src/nat/atof/io.py` around lines 31 - 38, The reader
currently uses path.read_text().splitlines() which buffers the whole file;
change the read loop to stream lines from the file (with path.open('r',
encoding='utf-8') and for line in f:) and call
_event_adapter.validate_python(raw) per-line to build events without reading the
entire file into memory, then sort and return as before; likewise replace any
path.write_text(...) bulk-write (lines 49-54) with an incremental writer using
path.open('w', encoding='utf-8') and write each serialized event plus a newline
(e.g., json.dumps(...) + "\n") to avoid buffering the full output in memory.
| # SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. |
There was a problem hiding this comment.
Add a module docstring to satisfy Python documentation requirements.
This __init__.py is currently header-only; please add a concise Google-style module docstring.
Suggested patch
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@
# See the License for the specific language governing permissions and
# limitations under the License.
+
+"""Initialize ATOF script utilities package."""As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |
| # SPDX-License-Identifier: Apache-2.0 | |
| # | |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |
| # you may not use this file except in compliance with the License. | |
| # You may obtain a copy of the License at | |
| # | |
| # http://www.apache.org/licenses/LICENSE-2.0 | |
| # | |
| # Unless required by applicable law or agreed to in writing, software | |
| # distributed under the License is distributed on an "AS IS" BASIS, | |
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |
| # See the License for the specific language governing permissions and | |
| # limitations under the License. | |
| # SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |
| # SPDX-License-Identifier: Apache-2.0 | |
| # | |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |
| # you may not use this file except in compliance with the License. | |
| # You may obtain a copy of the License at | |
| # | |
| # http://www.apache.org/licenses/LICENSE-2.0 | |
| # | |
| # Unless required by applicable law or agreed to in writing, software | |
| # distributed under the License is distributed on an "AS IS" BASIS, | |
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |
| # See the License for the specific language governing permissions and | |
| # limitations under the License. | |
| """Initialize ATOF script utilities package.""" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/__init__.py` around lines 1 -
14, Add a Google-style module docstring at the top of this package's __init__.py
(module nat.atof.scripts): insert a triple-quoted docstring immediately after
the file header that gives a one-line summary, a short description of the
module’s purpose, and any public exports/attributes (e.g., listed under
"Attributes" or "Exports" if applicable). Ensure the docstring follows Google
style (short summary line, blank line, optional longer description and section
headers) and is the first statement in the module so tooling and linters
recognize it.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py (1)
92-100:⚠️ Potential issue | 🟡 MinorGuard missing tool-call IDs when normalizing payloads.
tc["id"]still raisesKeyErroron malformed tool-call payloads and aborts the whole conversion. Use.get()plus a fallback identifier instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 92 - 100, The code directly indexes tc["id"] (and similarly assumes inner contains an "id") which raises KeyError for malformed tool-call payloads; change both occurrences to use .get("id", fallback) — e.g., tool_id = inner.get("id", f"<missing-tool-id-{tc.get('name','unknown')}>") when inner is used, and tool_id = tc.get("id", f"<missing-tool-id-{tc.get('name','unknown')}>") in the else branch — so normalization uses a stable fallback identifier instead of aborting when id is absent; keep name = ... and args = ... as-is but prefer .get there too if not already.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/nvidia_nat_atif/src/nat/atof/events.py`:
- Around line 222-237: MarkEvent currently allows category == "" which should be
treated as uncategorized like None; update the validator on MarkEvent (the
`@model_validator` method _validate_category_subtype_coherence) to reject
empty-string categories the same way ScopeEvent does: after calling
_require_subtype_when_custom(self.category, self.category_profile) also validate
that if self.category is not None it is a non-empty string (e.g. raise
ValueError or TypeError when self.category == "" or self.category.strip() ==
""), ensuring the category field / category_profile contract is enforced for
MarkEvent.
- Around line 115-157: Add a validator for the timestamp field so invalid
timestamp strings are rejected at model construction: implement a
`@field_validator`("timestamp") classmethod (e.g., _validate_timestamp) that
accepts the declared union type (str|int), ensures ints are non-negative
epoch-microseconds and that strings parse as RFC3339 (use
datetime.fromisoformat(self.timestamp.replace("Z", "+00:00")) or equivalent) and
raise ValueError on parse/fmt errors; return the original value (or a normalized
int if you prefer) so ts_micros can rely on a validated timestamp.
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 268-276: The code falls back to serializing the whole LLM envelope
into Step.message when raw_data is present but contains a flat "content" string
(e.g., {"content":"..."}); update the assignment logic around agent_msg and the
fallback in the try/except so you first attempt to read raw_data["content"] (and
prefer raw_data["choices"][0]["message"]["content"] when present) before doing
json.dumps(raw_data); modify the code that sets agent_msg (and where
step_dict["message"] is filled) to: 1) try choices[0].message.content, 2) if
missing try raw_data["content"], and only 3) if neither exists fall back to
json.dumps(raw_data) or empty string so Step.message contains the assistant text
when available.
- Around line 392-420: Don't hard-code session_id or Agent.version: remove the
fixed session_id = "atof-session" assignment and stop passing a hard-coded
Agent(version="1.0.0") into Trajectory; instead allow Trajectory() to use its
default UUID (omit session_id argument) and derive the agent version from the
input ATOF metadata if available (e.g., a metadata field parsed earlier) or pass
a sentinel like "unknown" when no version exists; update the Trajectory(...)
call that currently sets session_id and Agent(...) (refer to the session_id
variable and the Agent(name=..., version="1.0.0", model_name=model_name)
construction) so the code reads version from the parsed metadata or uses
"unknown" and does not force a global static session id.
---
Duplicate comments:
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 92-100: The code directly indexes tc["id"] (and similarly assumes
inner contains an "id") which raises KeyError for malformed tool-call payloads;
change both occurrences to use .get("id", fallback) — e.g., tool_id =
inner.get("id", f"<missing-tool-id-{tc.get('name','unknown')}>") when inner is
used, and tool_id = tc.get("id",
f"<missing-tool-id-{tc.get('name','unknown')}>") in the else branch — so
normalization uses a stable fallback identifier instead of aborting when id is
absent; keep name = ... and args = ... as-is but prefer .get there too if not
already.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 6bff3780-27fc-4073-a1ab-48e901aebbb0
📒 Files selected for processing (9)
packages/nvidia_nat_atif/examples/atof_to_atif/README.mdpackages/nvidia_nat_atif/examples/atof_to_atif/convert_atof_examples_to_atif.pypackages/nvidia_nat_atif/examples/atof_to_atif/generate_atof_examples.pypackages/nvidia_nat_atif/src/nat/atof/__init__.pypackages/nvidia_nat_atif/src/nat/atof/category.pypackages/nvidia_nat_atif/src/nat/atof/events.pypackages/nvidia_nat_atif/src/nat/atof/flags.pypackages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.pypackages/nvidia_nat_atif/tests/test_tier1_conversion.py
✅ Files skipped from review due to trivial changes (4)
- packages/nvidia_nat_atif/src/nat/atof/category.py
- packages/nvidia_nat_atif/src/nat/atof/init.py
- packages/nvidia_nat_atif/src/nat/atof/flags.py
- packages/nvidia_nat_atif/examples/atof_to_atif/README.md
| timestamp: str | int = Field(description="Wall-clock time: RFC 3339 string OR int epoch microseconds (spec §5.1)") | ||
| name: str = Field(description="Human-readable label") | ||
| data: Any | None = Field(default=None, description="Application-defined payload; opaque to ATOF") | ||
| data_schema: dict[str, Any] | None = Field( | ||
| default=None, | ||
| description=( | ||
| "Schema identifier {name, version} describing the shape of ``data``. " | ||
| "Opaque to ATOF core; validation against the named schema is the " | ||
| "consumer's responsibility (spec §2, §3)." | ||
| ), | ||
| ) | ||
| metadata: dict[str, Any] | None = Field(default=None, description="Tracing/correlation envelope") | ||
|
|
||
| model_config = ConfigDict(extra="allow") | ||
|
|
||
| @field_validator("atof_version") | ||
| @classmethod | ||
| def _validate_atof_version(cls, v: str) -> str: | ||
| if not _ATOF_VERSION_PATTERN.match(v): | ||
| raise ValueError(f"atof_version must match '0.MINOR' (e.g., '0.1'), got '{v}'") | ||
| return v | ||
|
|
||
| @field_validator("uuid", "parent_uuid") | ||
| @classmethod | ||
| def _validate_uuid_non_empty(cls, v: str | None) -> str | None: | ||
| if v is None: | ||
| return None | ||
| if not isinstance(v, str) or not v: | ||
| raise ValueError("uuid / parent_uuid must be a non-empty string when set") | ||
| return v | ||
|
|
||
| @computed_field # type: ignore[prop-decorator] | ||
| @property | ||
| def ts_micros(self) -> int: | ||
| """Timestamp normalized to int epoch microseconds (spec §5.1). | ||
|
|
||
| Not emitted on the wire (excluded by ``io.write_jsonl``). For | ||
| in-memory sorting and consumer-side comparison only. | ||
| """ | ||
| if isinstance(self.timestamp, int): | ||
| return self.timestamp | ||
| dt = datetime.fromisoformat(self.timestamp.replace("Z", "+00:00")) | ||
| return int(dt.timestamp() * 1_000_000) |
There was a problem hiding this comment.
Validate timestamp before the model escapes Pydantic.
Arbitrary strings currently pass construction and only blow up later when ts_micros is touched during sorting/conversion. That turns a schema error into a runtime failure path.
Suggested fix
class _EventBase(BaseModel):
@@
`@field_validator`("uuid", "parent_uuid")
`@classmethod`
def _validate_uuid_non_empty(cls, v: str | None) -> str | None:
if v is None:
return None
if not isinstance(v, str) or not v:
raise ValueError("uuid / parent_uuid must be a non-empty string when set")
return v
+
+ `@field_validator`("timestamp")
+ `@classmethod`
+ def _validate_timestamp(cls, v: str | int) -> str | int:
+ if isinstance(v, bool):
+ raise ValueError("timestamp must be an RFC 3339 string or epoch microseconds")
+ if isinstance(v, int):
+ return v
+ try:
+ datetime.fromisoformat(v.replace("Z", "+00:00"))
+ except ValueError as exc:
+ raise ValueError("timestamp must be an RFC 3339 string or epoch microseconds") from exc
+ return v🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/src/nat/atof/events.py` around lines 115 - 157, Add
a validator for the timestamp field so invalid timestamp strings are rejected at
model construction: implement a `@field_validator`("timestamp") classmethod (e.g.,
_validate_timestamp) that accepts the declared union type (str|int), ensures
ints are non-negative epoch-microseconds and that strings parse as RFC3339 (use
datetime.fromisoformat(self.timestamp.replace("Z", "+00:00")) or equivalent) and
raise ValueError on parse/fmt errors; return the original value (or a normalized
int if you prefer) so ts_micros can rely on a validated timestamp.
| category: str | None = Field( | ||
| default=None, | ||
| description="Semantic category (spec §4). Null or absent means the mark is a generic checkpoint.", | ||
| ) | ||
| category_profile: dict[str, Any] | None = Field( | ||
| default=None, | ||
| description=( | ||
| "Category-specific typed fields (spec §4.4). REQUIRED when " | ||
| "category == 'custom' (must carry category_profile.subtype)." | ||
| ), | ||
| ) | ||
|
|
||
| @model_validator(mode="after") | ||
| def _validate_category_subtype_coherence(self) -> Self: | ||
| _require_subtype_when_custom(self.category, self.category_profile) | ||
| return self |
There was a problem hiding this comment.
Reject empty-string categories on MarkEvent.
None is the valid “uncategorized” case, but "" currently validates and bypasses the category contract. MarkEvent should mirror the non-empty check already applied on ScopeEvent.
Suggested fix
class MarkEvent(_EventBase):
@@
category_profile: dict[str, Any] | None = Field(
default=None,
description=(
"Category-specific typed fields (spec §4.4). REQUIRED when "
"category == 'custom' (must carry category_profile.subtype)."
),
)
+
+ `@field_validator`("category")
+ `@classmethod`
+ def _validate_category(cls, v: str | None) -> str | None:
+ if v is not None and not v:
+ raise ValueError("category must be null/omitted or a non-empty string")
+ return v
`@model_validator`(mode="after")
def _validate_category_subtype_coherence(self) -> Self:
_require_subtype_when_custom(self.category, self.category_profile)
return self🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/src/nat/atof/events.py` around lines 222 - 237,
MarkEvent currently allows category == "" which should be treated as
uncategorized like None; update the validator on MarkEvent (the `@model_validator`
method _validate_category_subtype_coherence) to reject empty-string categories
the same way ScopeEvent does: after calling
_require_subtype_when_custom(self.category, self.category_profile) also validate
that if self.category is not None it is a non-empty string (e.g. raise
ValueError or TypeError when self.category == "" or self.category.strip() ==
""), ensuring the category field / category_profile contract is enforced for
MarkEvent.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
Open Item: ATOF -> ATIF converter should not have any implementation specific details. "abstract machine model for conversion" |
|
Open item: decide on enrichment of "category_profile" after pilot integration |
|
Open item: atomic use case examples + long-running agent |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
packages/nvidia_nat_atif/src/nat/atof/events.py (2)
146-157:⚠️ Potential issue | 🟠 MajorValidate
timestampduring model validation, not lazily ints_micros.Right now invalid timestamp strings can construct successfully and fail later when
ts_microsis accessed (Line 156), which shifts schema errors into runtime paths.✅ Proposed fix
class _EventBase(BaseModel): @@ `@field_validator`("uuid", "parent_uuid") `@classmethod` def _validate_uuid_non_empty(cls, v: str | None) -> str | None: @@ if not isinstance(v, str) or not v: raise ValueError("uuid / parent_uuid must be a non-empty string when set") return v + + `@field_validator`("timestamp") + `@classmethod` + def _validate_timestamp(cls, v: str | int) -> str | int: + if isinstance(v, bool): + raise ValueError("timestamp must be an RFC 3339 string or epoch microseconds int") + if isinstance(v, int): + return v + try: + datetime.fromisoformat(v.replace("Z", "+00:00")) + except ValueError as exc: + raise ValueError("timestamp must be an RFC 3339 string or epoch microseconds int") from exc + return v🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/events.py` around lines 146 - 157, The ts_micros property currently parses self.timestamp at access time and raises on invalid strings; move this validation into the model's validation phase so invalid timestamps fail model construction instead of at runtime. Add a validator for the timestamp field (or a custom __post_init__ / pydantic `@validator` depending on the model class) that parses the string with datetime.fromisoformat (handling "Z" → "+00:00") and either stores a normalized int or a validated datetime, and update computed_field ts_micros to assume timestamp was validated (keeping the existing conversion logic but removing parsing error paths).
222-237:⚠️ Potential issue | 🟡 MinorReject empty-string
categoryforMarkEvent.
MarkEvent.category=""currently passes, which bypasses the intendedNone/non-empty contract.✅ Proposed fix
class MarkEvent(_EventBase): @@ category_profile: dict[str, Any] | None = Field( @@ ) + + `@field_validator`("category") + `@classmethod` + def _validate_category(cls, v: str | None) -> str | None: + if v is not None and not v: + raise ValueError("category must be null/omitted or a non-empty string") + return v `@model_validator`(mode="after") def _validate_category_subtype_coherence(self) -> Self: _require_subtype_when_custom(self.category, self.category_profile) return self🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/events.py` around lines 222 - 237, The MarkEvent model currently allows category="" which should be rejected; update the model-level validator _validate_category_subtype_coherence (or add a small pre/after validator) to treat an empty-string category as invalid by raising a validation error (e.g., ValueError) when self.category == "" so empty strings no longer pass the category/non-empty contract; keep existing call to _require_subtype_when_custom for the 'custom' case and ensure the validator references category and category_profile when enforcing rules.
🧹 Nitpick comments (1)
packages/nvidia_nat_atif/tests/test_spec_compliance.py (1)
137-147: Add regression tests for invalid timestamp and emptyMarkEvent.category.These two validation edges are currently unpinned; adding explicit tests will prevent regression once the model validators are tightened.
As per coding guidelines: "Maintain >= 80% test coverage; add or update tests when introducing changes."🧪 Suggested tests
def test_envelope_timestamp_accepts_rfc3339_string() -> None: @@ assert e.timestamp == "2026-01-01T00:00:00Z" + + +def test_envelope_timestamp_rejects_invalid_string() -> None: + """§5.1: malformed timestamp strings raise ValidationError.""" + with expect_validation_error("timestamp"): + ScopeEvent(**_scope_kwargs(timestamp="not-a-timestamp")) @@ def test_mark_category_accepts_populated_value() -> None: @@ assert e.category_profile == {"model_name": "gpt-4.1"} + + +def test_mark_category_rejects_empty_string() -> None: + """§4: category, when set, must be non-empty.""" + with expect_validation_error("category"): + MarkEvent(**_mark_kwargs(category=""))Also applies to: 369-379
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/tests/test_spec_compliance.py` around lines 137 - 147, Add two regression tests: one named like test_envelope_timestamp_rejects_invalid_value that constructs ScopeEvent with an invalid timestamp (e.g., "not-a-timestamp" or an out-of-range type) and asserts that validation raises a ValidationError; and one named like test_markevent_rejects_empty_category that constructs MarkEvent with category="" and asserts validation raises a ValidationError. Place the timestamp test adjacent to the existing timestamp tests (near test_envelope_timestamp_accepts_rfc3339_string / test_envelope_timestamp_accepts_integer_microseconds) and place the MarkEvent test near other MarkEvent spec tests; reference the ScopeEvent and MarkEvent constructors and import ValidationError from pydantic (or the project’s validation exception) to perform the assertRaises checks so these edges are covered and prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_atif/examples/atof_to_atif/convert_atof_examples_to_atif.py`:
- Around line 5-7: The module docstring in convert_atof_examples_to_atif.py only
mentions EXMP-01 and EXMP-02 but the script actually converts three example
files (exmp01, exmp02, exmp03); update the top-of-file docstring to list EXMP-03
as well and mention all three example tiers (EXMP-01, EXMP-02, EXMP-03) and that
the JSONL files are produced by generate_atof_examples.py so the docstring
accurately reflects the behavior of the conversion functions in this module.
- Line 35: The public CLI entrypoint function main() lacks a docstring; add a
Google-style docstring directly above the def main() declaration that briefly
describes the command's purpose, expected behavior/side effects (e.g., files
read/written or conversion performed), any arguments (if none, state that it
takes no parameters), return value (None) and examples or usage notes if
applicable so it meets the project's docstring guidelines.
---
Duplicate comments:
In `@packages/nvidia_nat_atif/src/nat/atof/events.py`:
- Around line 146-157: The ts_micros property currently parses self.timestamp at
access time and raises on invalid strings; move this validation into the model's
validation phase so invalid timestamps fail model construction instead of at
runtime. Add a validator for the timestamp field (or a custom __post_init__ /
pydantic `@validator` depending on the model class) that parses the string with
datetime.fromisoformat (handling "Z" → "+00:00") and either stores a normalized
int or a validated datetime, and update computed_field ts_micros to assume
timestamp was validated (keeping the existing conversion logic but removing
parsing error paths).
- Around line 222-237: The MarkEvent model currently allows category="" which
should be rejected; update the model-level validator
_validate_category_subtype_coherence (or add a small pre/after validator) to
treat an empty-string category as invalid by raising a validation error (e.g.,
ValueError) when self.category == "" so empty strings no longer pass the
category/non-empty contract; keep existing call to _require_subtype_when_custom
for the 'custom' case and ensure the validator references category and
category_profile when enforcing rules.
---
Nitpick comments:
In `@packages/nvidia_nat_atif/tests/test_spec_compliance.py`:
- Around line 137-147: Add two regression tests: one named like
test_envelope_timestamp_rejects_invalid_value that constructs ScopeEvent with an
invalid timestamp (e.g., "not-a-timestamp" or an out-of-range type) and asserts
that validation raises a ValidationError; and one named like
test_markevent_rejects_empty_category that constructs MarkEvent with category=""
and asserts validation raises a ValidationError. Place the timestamp test
adjacent to the existing timestamp tests (near
test_envelope_timestamp_accepts_rfc3339_string /
test_envelope_timestamp_accepts_integer_microseconds) and place the MarkEvent
test near other MarkEvent spec tests; reference the ScopeEvent and MarkEvent
constructors and import ValidationError from pydantic (or the project’s
validation exception) to perform the assertRaises checks so these edges are
covered and prevent regressions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: ce4a1277-c37f-4f4f-9fe1-cb74ba7b1d1e
📒 Files selected for processing (7)
packages/nvidia_nat_atif/examples/atof_to_atif/README.mdpackages/nvidia_nat_atif/examples/atof_to_atif/convert_atof_examples_to_atif.pypackages/nvidia_nat_atif/examples/atof_to_atif/generate_atof_examples.pypackages/nvidia_nat_atif/src/nat/atof/events.pypackages/nvidia_nat_atif/src/nat/atof/io.pypackages/nvidia_nat_atif/tests/test_spec_compliance.pypackages/nvidia_nat_atif/tests/test_tier1_conversion.py
✅ Files skipped from review due to trivial changes (3)
- packages/nvidia_nat_atif/src/nat/atof/io.py
- packages/nvidia_nat_atif/examples/atof_to_atif/README.md
- packages/nvidia_nat_atif/tests/test_tier1_conversion.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/nvidia_nat_atif/examples/atof_to_atif/generate_atof_examples.py
| Reads each example JSONL file produced by ``generate_atof_examples.py`` | ||
| (EXMP-01 tier-1, EXMP-02 tier-2) and writes the resulting ATIF trajectory | ||
| as formatted JSON. |
There was a problem hiding this comment.
Update the module docstring to include EXMP-03.
The code converts three examples (exmp01, exmp02, exmp03), but the docstring text only documents two.
📝 Suggested doc fix
-Reads each example JSONL file produced by ``generate_atof_examples.py``
-(EXMP-01 tier-1, EXMP-02 tier-2) and writes the resulting ATIF trajectory
+Reads each example JSONL file produced by ``generate_atof_examples.py``
+(EXMP-01 tier-1, EXMP-02 tier-2, EXMP-03 tier-3) and writes the resulting ATIF trajectory
as formatted JSON.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Reads each example JSONL file produced by ``generate_atof_examples.py`` | |
| (EXMP-01 tier-1, EXMP-02 tier-2) and writes the resulting ATIF trajectory | |
| as formatted JSON. | |
| Reads each example JSONL file produced by ``generate_atof_examples.py`` | |
| (EXMP-01 tier-1, EXMP-02 tier-2, EXMP-03 tier-3) and writes the resulting ATIF trajectory | |
| as formatted JSON. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/nvidia_nat_atif/examples/atof_to_atif/convert_atof_examples_to_atif.py`
around lines 5 - 7, The module docstring in convert_atof_examples_to_atif.py
only mentions EXMP-01 and EXMP-02 but the script actually converts three example
files (exmp01, exmp02, exmp03); update the top-of-file docstring to list EXMP-03
as well and mention all three example tiers (EXMP-01, EXMP-02, EXMP-03) and that
the JSONL files are produced by generate_atof_examples.py so the docstring
accurately reflects the behavior of the conversion functions in this module.
| ] | ||
|
|
||
|
|
||
| def main() -> None: |
There was a problem hiding this comment.
Add a docstring for main() (public CLI entrypoint).
main() is public and currently undocumented.
📝 Suggested update
def main() -> None:
+ """Convert bundled ATOF example streams into ATIF JSON outputs."""
parser = argparse.ArgumentParser(description="Convert ATOF v0.1 JSONL to ATIF JSON")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/nvidia_nat_atif/examples/atof_to_atif/convert_atof_examples_to_atif.py`
at line 35, The public CLI entrypoint function main() lacks a docstring; add a
Google-style docstring directly above the def main() declaration that briefly
describes the command's purpose, expected behavior/side effects (e.g., files
read/written or conversion performed), any arguments (if none, state that it
takes no parameters), return value (None) and examples or usage notes if
applicable so it meets the project's docstring guidelines.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py (2)
81-89:⚠️ Potential issue | 🟠 MajorDon't index
tc["id"]directly.A malformed or producer-minimal tool call will raise
KeyErrorhere and abort the whole conversion. Use.get("id")with a deterministic fallback so each emittedToolCallstill has a stable identifier.Suggested fix
if "function" in tc and isinstance(tc["function"], dict): inner = tc["function"] - tool_id = tc["id"] + tool_id = tc.get("id") or f"tool-call-{len(result) + 1}" name = inner.get("name", "") args = inner.get("arguments", {}) else: - tool_id = tc["id"] + tool_id = tc.get("id") or f"tool-call-{len(result) + 1}" name = tc.get("name", "") args = tc.get("arguments", {})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 81 - 89, The code currently accesses tc["id"] directly which can raise KeyError for malformed tool calls; change both assignments of tool_id to use tc.get("id", "<missing-tool-id>") (or another deterministic fallback string) instead of tc["id"] so every emitted ToolCall has a stable identifier; update the two places in the branch that set tool_id (the branch using inner = tc["function"] and the else branch) to use tc.get("id", "<missing-tool-id>") and ensure any downstream logic expects that fallback value.
711-757:⚠️ Potential issue | 🟠 MajorAvoid inventing fallback agent/session metadata.
When the source stream omits metadata, this still stamps the output with
Agent.version="1.0.0"and sometimessession_id="atof-session". That fabricates provenance and can collide across unrelated conversions. Prefer an explicit sentinel like"unknown"for the agent version and letTrajectorygenerate a UUID when neither metadata norroot_agent_uuidis available.Suggested fix
- agent_version: str = "1.0.0" + agent_version: str = "unknown" @@ - if session_id is None: - session_id = root_agent_uuid or "atof-session" + if session_id is None: + session_id = root_agent_uuid @@ - return Trajectory( - schema_version="ATIF-v1.7", - session_id=session_id, - agent=Agent(name=agent_name, version=agent_version, model_name=model_name), - steps=steps, - subagent_trajectories=subagent_trajectories or None, - ) + trajectory_kwargs = { + "schema_version": "ATIF-v1.7", + "agent": Agent(name=agent_name, version=agent_version, model_name=model_name), + "steps": steps, + "subagent_trajectories": subagent_trajectories or None, + } + if session_id is not None: + trajectory_kwargs["session_id"] = session_id + return Trajectory(**trajectory_kwargs)Also applies to: 774-777
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 711 - 757, The code currently fabricates provenance by defaulting agent_version to "1.0.0" and forcing session_id to "atof-session"; change it so missing metadata is represented explicitly as a sentinel and session IDs are left unset so Trajectory can generate a UUID: set agent_version default to "unknown" instead of "1.0.0", do not assign session_id to root_agent_uuid or "atof-session" when session metadata is absent (leave session_id as None), and ensure the same behavior is applied in the alternative branch that sets these values (references: agent_version, session_id, root_agent_uuid, agent_name, and the blocks that inspect event.metadata).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/nvidia_nat_atif/src/nat/atif/function_ancestry.py`:
- Around line 41-48: The model allows parent_name when parent_id is None; add a
Pydantic validation on the model (use a `@root_validator` or a validator for
"parent_name") in function_ancestry.py to enforce that parent_name must be
None/unset if parent_id is None and raise a ValidationError otherwise; reference
the fields parent_id and parent_name and update the validator error message to
clearly state "parent_name may only be set when parent_id is present."
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 136-149: _extract_agent_message currently drops multimodal
assistant output because it only returns strings; if data["content"] or the
extracted choice content is a list (multimodal ContentPart[]), it returns "" and
the ATIF Step loses that output. Change _extract_agent_message to return
Union[str, list] (or a type that allows ContentPart[]), update its type hint,
and when content is a list simply return it unchanged; likewise, when pulling
from data["choices"][0]["message"]["content"], if that value is a list return it
instead of filtering to strings. Ensure callers that assign to Step.message
(which accepts ContentPart[]) accept the returned list.
- Around line 674-705: The loop over subagent_roots is recursing into roots that
have already been included in a previously extracted descendant set, causing
nested subagents to be emitted twice; before processing each root in the for
root in subagent_roots loop check if id(root) is in excluded_ids and skip
(continue) if so so you don't call _collect_descendants or _convert_impl for
already-covered roots, preserving excluded_ids, subagent_trajectories,
subagent_ref_by_tc_id and subagent_ref_by_context_uuid behavior.
---
Duplicate comments:
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 81-89: The code currently accesses tc["id"] directly which can
raise KeyError for malformed tool calls; change both assignments of tool_id to
use tc.get("id", "<missing-tool-id>") (or another deterministic fallback string)
instead of tc["id"] so every emitted ToolCall has a stable identifier; update
the two places in the branch that set tool_id (the branch using inner =
tc["function"] and the else branch) to use tc.get("id", "<missing-tool-id>") and
ensure any downstream logic expects that fallback value.
- Around line 711-757: The code currently fabricates provenance by defaulting
agent_version to "1.0.0" and forcing session_id to "atof-session"; change it so
missing metadata is represented explicitly as a sentinel and session IDs are
left unset so Trajectory can generate a UUID: set agent_version default to
"unknown" instead of "1.0.0", do not assign session_id to root_agent_uuid or
"atof-session" when session metadata is absent (leave session_id as None), and
ensure the same behavior is applied in the alternative branch that sets these
values (references: agent_version, session_id, root_agent_uuid, agent_name, and
the blocks that inspect event.metadata).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 9a96a9d5-e50a-4de4-b52a-83021ed76ed2
📒 Files selected for processing (7)
packages/nvidia_nat_atif/src/nat/atif/__init__.pypackages/nvidia_nat_atif/src/nat/atif/function_ancestry.pypackages/nvidia_nat_atif/src/nat/atif/step.pypackages/nvidia_nat_atif/src/nat/atif/tool_call.pypackages/nvidia_nat_atif/src/nat/atif/trajectory.pypackages/nvidia_nat_atif/src/nat/atof/io.pypackages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py
✅ Files skipped from review due to trivial changes (2)
- packages/nvidia_nat_atif/src/nat/atof/io.py
- packages/nvidia_nat_atif/src/nat/atif/init.py
| parent_id: str | None = Field( | ||
| default=None, | ||
| description="Unique identifier of the parent callable node; null for root-level", | ||
| ) | ||
| parent_name: str | None = Field( | ||
| default=None, | ||
| description="Human-readable name of the parent callable; null when parent_id is null", | ||
| ) |
There was a problem hiding this comment.
Reject orphan parent_name values.
This model currently accepts parent_name when parent_id is None, which contradicts the field description and leaves the ancestry edge in an impossible state. Please enforce that parent_name is unset unless parent_id is present.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/src/nat/atif/function_ancestry.py` around lines 41 -
48, The model allows parent_name when parent_id is None; add a Pydantic
validation on the model (use a `@root_validator` or a validator for "parent_name")
in function_ancestry.py to enforce that parent_name must be None/unset if
parent_id is None and raise a ValidationError otherwise; reference the fields
parent_id and parent_name and update the validator error message to clearly
state "parent_name may only be set when parent_id is present."
| for root in subagent_roots: | ||
| descendants = _collect_descendants(root.uuid, events, parent_map) | ||
| for e in descendants: | ||
| excluded_ids.add(id(e)) | ||
|
|
||
| child_trajectory = _convert_impl(descendants, explicit_root_uuid=root.uuid) | ||
| subagent_trajectories.append(child_trajectory) | ||
|
|
||
| # Correlate the child trajectory with its wrapping dispatcher scope so | ||
| # the main pass can attach subagent_trajectory_ref to the right | ||
| # observation. ``tool`` wrappers correlate via tool_call_id (R7); | ||
| # ``context`` wrappers correlate via the wrapping scope's UUID (R10). | ||
| wrapping_uuid = root.parent_uuid | ||
| wrapping_category = None | ||
| wrapping_tc_id = None | ||
| if wrapping_uuid is not None: | ||
| for e in events: | ||
| if ( | ||
| _is_scope_start(e) | ||
| and isinstance(e, ScopeEvent) | ||
| and e.uuid == wrapping_uuid | ||
| ): | ||
| wrapping_category = e.category | ||
| if e.category == "tool": | ||
| wrapping_tc_id = (e.category_profile or {}).get("tool_call_id") | ||
| break | ||
|
|
||
| ref = {"session_id": child_trajectory.session_id, "trajectory_path": None} | ||
| if wrapping_category == "tool" and wrapping_tc_id: | ||
| subagent_ref_by_tc_id[wrapping_tc_id] = ref | ||
| elif wrapping_category == "context" and wrapping_uuid: | ||
| subagent_ref_by_context_uuid[wrapping_uuid] = ref |
There was a problem hiding this comment.
Skip nested subagent roots after the outer one is partitioned.
This loop converts every entry in subagent_roots, including roots already contained in a previously extracted descendant set. For nested delegation, the inner subagent gets emitted once inside its parent trajectory and then again as a sibling at the current level. Skip roots already covered by excluded_ids before recursing.
Suggested fix
for root in subagent_roots:
+ if id(root) in excluded_ids:
+ continue
descendants = _collect_descendants(root.uuid, events, parent_map)
for e in descendants:
excluded_ids.add(id(e))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`
around lines 674 - 705, The loop over subagent_roots is recursing into roots
that have already been included in a previously extracted descendant set,
causing nested subagents to be emitted twice; before processing each root in the
for root in subagent_roots loop check if id(root) is in excluded_ids and skip
(continue) if so so you don't call _collect_descendants or _convert_impl for
already-covered roots, preserving excluded_ids, subagent_trajectories,
subagent_ref_by_tc_id and subagent_ref_by_context_uuid behavior.
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
There was a problem hiding this comment.
♻️ Duplicate comments (4)
packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py (4)
799-801:⚠️ Potential issue | 🟡 MinorAvoid hard-coding
agent_version="1.0.0"as the default.The fallback version should be a sentinel like
"unknown"rather than a semver string. As per coding guidelines: "Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs."🔧 Suggested fix
agent_name: str | None = None - agent_version: str = "1.0.0" + agent_version: str = "unknown" model_name: str | None = None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 799 - 801, The default for agent_version currently hard-codes "1.0.0"; change this to a sentinel like "unknown" to avoid embedding a semver string in code and to respect setuptools-scm-driven versioning—update the declaration of agent_version in atof_to_atif_converter.py (the variable named agent_version) to use "unknown" (or None if you prefer explicit absence) as the default and ensure any downstream logic that reads agent_version treats this sentinel appropriately.
190-209:⚠️ Potential issue | 🟠 MajorMultimodal assistant content is dropped.
When
data["content"]or the extracted choice content is a list (multimodalContentPart[]), the function returns"". SinceStep.messageacceptsstr | list[ContentPart], list content should be passed through.🔧 Suggested fix to preserve multimodal content
-def _extract_agent_message(data: dict | None) -> str: +def _extract_agent_message(data: dict | None) -> str | list: """Return the assistant-turn content string from an LLM-end payload (R4). - Returns ``""`` when no content path matches. The caller in the dispatch + Returns ``""`` when no content path matches (or the list for multimodal). + The caller in the dispatch loop combines this with the tool-call extraction result and raises :class:`ShapeMismatchError` only when BOTH are empty on non-empty data (a payload with only tool_calls or only content is legitimate). """ if not isinstance(data, dict): return "" content = data.get("content") if isinstance(content, str): return content + if isinstance(content, list): + return content try: choice_content = data["choices"][0]["message"].get("content", "") if isinstance(choice_content, str): return choice_content + if isinstance(choice_content, list): + return choice_content except (KeyError, IndexError, TypeError): pass return ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 190 - 209, The function _extract_agent_message currently drops multimodal content because it only returns strings; update it to preserve and return list content when data.get("content") or the extracted choice_content is a list (ContentPart[]), change the return type hint from str to allow str | list[ContentPart] (or list[dict] if ContentPart isn't imported), and adjust the docstring accordingly so callers like Step.message (which accepts str | list[ContentPart]) receive the list unchanged; keep existing fallback behavior (return "" for non-dict or unexpected types) and maintain the same exception handling around the choices extraction.
763-769:⚠️ Potential issue | 🟠 MajorSkip nested subagent roots already covered by
excluded_ids.When processing subagent roots, inner roots that are descendants of a previously processed root will already be in
excluded_ids. Processing them again results in duplicate child trajectories.🔧 Suggested fix
for root in subagent_roots: + if id(root) in excluded_ids: + continue descendants = _collect_descendants(root.uuid, events, parent_map) for e in descendants: excluded_ids.add(id(e))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 763 - 769, The loop over subagent_roots should skip any root already covered in excluded_ids to avoid duplicate child trajectories: before calling _collect_descendants/_convert_impl for a given root in the for root in subagent_roots loop, check whether the root is in excluded_ids (use the same identity scheme used when adding entries, e.g., id(root)), and continue to the next root if present; keep using _collect_descendants(root.uuid, events, parent_map) and then add ids of its descendants to excluded_ids and call _convert_impl(descendants, explicit_root_uuid=root.uuid) only for non-excluded roots so nested subagent roots are not processed twice.
129-137:⚠️ Potential issue | 🟡 MinorPotential
KeyErrorif tool call lacksidfield.Both branches access
tc["id"]directly without a fallback. If a malformed tool call lacks anidfield, this raisesKeyError.🛡️ Suggested defensive fix
if "function" in tc and isinstance(tc["function"], dict): inner = tc["function"] - tool_id = tc["id"] + tool_id = tc.get("id", "") name = inner.get("name", "") args = inner.get("arguments", {}) else: - tool_id = tc["id"] + tool_id = tc.get("id", "") name = tc.get("name", "") args = tc.get("arguments", {})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 129 - 137, The code reads tc["id"] in both branches and can KeyError if "id" is missing; update the assignments for tool_id in both the inner-function branch (where inner = tc["function"]) and the else branch to use a safe lookup like tc.get("id", None) (or a default/validated value), and add a simple validation/early error message if tool_id is None/empty so malformed tool calls are handled gracefully rather than raising KeyError.
🧹 Nitpick comments (1)
packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py (1)
90-98: Consider returningdict[str, Any]for clarity.The return type
dictis unparameterized. For internal helpers this is acceptable, but adding the type parameters improves clarity.♻️ Optional type hint refinement
-def _build_ancestry(uuid: str, name: str, parent_uuid: str | None, name_map: dict[str, str]) -> dict: +def _build_ancestry(uuid: str, name: str, parent_uuid: str | None, name_map: dict[str, str]) -> dict[str, str | None]:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py` around lines 90 - 98, Update the return type of the helper _build_ancestry to be explicit (dict[str, Any]) to clarify the heterogeneous values; change the signature from -> dict to -> dict[str, Any] and import Any from typing at the top of the module, keeping the function body unchanged and preserving existing parameter types (uuid, name, parent_uuid, name_map) and returned keys ("function_id", "function_name", "parent_id", "parent_name").
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 799-801: The default for agent_version currently hard-codes
"1.0.0"; change this to a sentinel like "unknown" to avoid embedding a semver
string in code and to respect setuptools-scm-driven versioning—update the
declaration of agent_version in atof_to_atif_converter.py (the variable named
agent_version) to use "unknown" (or None if you prefer explicit absence) as the
default and ensure any downstream logic that reads agent_version treats this
sentinel appropriately.
- Around line 190-209: The function _extract_agent_message currently drops
multimodal content because it only returns strings; update it to preserve and
return list content when data.get("content") or the extracted choice_content is
a list (ContentPart[]), change the return type hint from str to allow str |
list[ContentPart] (or list[dict] if ContentPart isn't imported), and adjust the
docstring accordingly so callers like Step.message (which accepts str |
list[ContentPart]) receive the list unchanged; keep existing fallback behavior
(return "" for non-dict or unexpected types) and maintain the same exception
handling around the choices extraction.
- Around line 763-769: The loop over subagent_roots should skip any root already
covered in excluded_ids to avoid duplicate child trajectories: before calling
_collect_descendants/_convert_impl for a given root in the for root in
subagent_roots loop, check whether the root is in excluded_ids (use the same
identity scheme used when adding entries, e.g., id(root)), and continue to the
next root if present; keep using _collect_descendants(root.uuid, events,
parent_map) and then add ids of its descendants to excluded_ids and call
_convert_impl(descendants, explicit_root_uuid=root.uuid) only for non-excluded
roots so nested subagent roots are not processed twice.
- Around line 129-137: The code reads tc["id"] in both branches and can KeyError
if "id" is missing; update the assignments for tool_id in both the
inner-function branch (where inner = tc["function"]) and the else branch to use
a safe lookup like tc.get("id", None) (or a default/validated value), and add a
simple validation/early error message if tool_id is None/empty so malformed tool
calls are handled gracefully rather than raising KeyError.
---
Nitpick comments:
In `@packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.py`:
- Around line 90-98: Update the return type of the helper _build_ancestry to be
explicit (dict[str, Any]) to clarify the heterogeneous values; change the
signature from -> dict to -> dict[str, Any] and import Any from typing at the
top of the module, keeping the function body unchanged and preserving existing
parameter types (uuid, name, parent_uuid, name_map) and returned keys
("function_id", "function_name", "parent_id", "parent_name").
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e81e5537-4746-4384-a8b5-98ad4508f3f7
📒 Files selected for processing (2)
packages/nvidia_nat_atif/src/nat/atof/scripts/atof_to_atif_converter.pypackages/nvidia_nat_atif/tests/test_shape_mismatch.py
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Description
Closes
By Submitting this PR I confirm:
Summary by CodeRabbit
New Features
Documentation
Chores
Tests
ATIF Compatibility