[core] Improve the toolbox & dsl #29
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. Needs approval from an approver in each of these files; the full list of commands accepted by this bot can be found here.
📝 Walkthrough

This change refactors the DSL task management system from relying on global mutable state to using a centralized `ScriptManager`.
Sequence Diagram

```mermaid
sequenceDiagram
    participant TaskDec as Task Decorator
    participant ScriptMgr as ScriptManager
    participant Runtime as Runtime Executor
    participant Shell as Shell Executor
    participant Logger as DSL Logger
    TaskDec->>ScriptMgr: register_task(task_info, source_file)
    ScriptMgr->>ScriptMgr: Store task in _task_registry
    ScriptMgr->>ScriptMgr: Create TaskResult for task
    TaskDec->>TaskDec: Assign result container to wrapper.status
    Runtime->>ScriptMgr: get_tasks_from_file(rel_filename)
    ScriptMgr-->>Runtime: Return list of tasks for file
    Runtime->>Runtime: Execute normal phase tasks
    Runtime->>Shell: Execute each task
    Shell->>Shell: Run subprocess with capture_output=True
    Shell->>Logger: Log output if enabled
    Shell-->>Runtime: Return execution result
    Runtime->>ScriptMgr: get_task_result(task_name)
    ScriptMgr-->>Runtime: Return TaskResult
    Runtime->>ScriptMgr: _set_result(result)
    Runtime->>Runtime: Execute always phase tasks
    Runtime->>Shell: Execute each always task
    Shell-->>Runtime: Return execution result
    Runtime->>ScriptMgr: _set_result(result)
    Runtime->>ScriptMgr: clear_tasks(file_path)
    ScriptMgr->>ScriptMgr: Clear tasks and results for file
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
⚠️ Warning: Review ran into problems.
🔥 Problems
Git: Failed to clone repository.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
projects/core/dsl/task.py (1)
180-184: ⚠️ Potential issue | 🔴 Critical

Register tasks with the same file key that runtime uses.
`task()` stores tasks under `os.path.relpath(definition_filename)`, but `execute_tasks()` looks them up with `Path(filename).relative_to(env.FORGE_HOME)` in `projects/core/dsl/runtime.py` lines 69-73. When the process CWD is not `FORGE_HOME`, those keys diverge and line 105 in `projects/core/dsl/runtime.py` raises `No tasks found for file ...`.

📍 Suggested normalization
```diff
-        # Make filename relative to current working directory
-        try:
-            rel_definition_filename = os.path.relpath(definition_filename)
-        except ValueError:
-            rel_definition_filename = definition_filename
+        # Make filename relative to FORGE_HOME to match runtime lookup
+        try:
+            rel_definition_filename = str(Path(definition_filename).relative_to(env.FORGE_HOME))
+        except ValueError:
+            rel_definition_filename = definition_filename
```

Also add `from pathlib import Path` near the imports.

Also applies to: 223-224
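The mismatch above is easy to reproduce outside the codebase. The sketch below is illustrative only (the function name `normalize_task_key` and the paths are assumptions, not code from the PR); it shows why keying by a fixed base directory is CWD-independent, whereas `os.path.relpath` is not:

```python
from pathlib import Path

def normalize_task_key(definition_filename: str, forge_home: str) -> str:
    # Key tasks by their path relative to forge_home so registration and
    # lookup agree regardless of the process CWD; fall back to the raw
    # filename when the file lives outside forge_home.
    try:
        return str(Path(definition_filename).relative_to(forge_home))
    except ValueError:
        return definition_filename

# Same input, same key, no matter what os.getcwd() returns:
print(normalize_task_key("/opt/forge/projects/core/build.py", "/opt/forge"))
# -> projects/core/build.py
print(normalize_task_key("/tmp/external.py", "/opt/forge"))
# -> /tmp/external.py (fallback: file lives outside the base directory)
```

By contrast, `os.path.relpath("/opt/forge/projects/core/build.py")` yields a different string for every working directory, which is exactly the divergence the comment describes.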
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/core/dsl/task.py` around lines 180 - 184, task registration uses `os.path.relpath(definition_filename)`, which doesn't match the `execute_tasks()` lookup (`Path(filename).relative_to(env.FORGE_HOME)`), causing missing-task errors; change `task()` to normalize the key using `Path(definition_filename).relative_to(env.FORGE_HOME)` (wrapped in try/except to fall back to `Path(definition_filename).resolve().relative_to(Path.cwd())` or the original string if `relative_to` fails) and convert to str before storing, and make the same change for the second occurrence around lines 223-224; also add `from pathlib import Path` to the imports so `Path` is available.

projects/core/dsl/shell.py (1)
27-34: ⚠️ Potential issue | 🟠 Major

Don't force every command through in-memory capture.
Hardcoding `capture_output=True` means every command now buffers stdout/stderr in RAM, including the `stdout_dest` case where the payload is written to disk afterward. That removes the previous escape hatch for large or streaming commands and can turn artifact captures into a memory spike.

♻️ Suggested fix
```diff
 def run(
     command: str,
     check: bool = True,
+    capture_output: bool = True,
     shell: bool = True,
     stdout_dest: Optional[Union[str, Path]] = None,
     log_stdout: bool = True,
     log_stderr: bool = True,
 ) -> CommandResult:
@@
     result = subprocess.run(
         command,
         shell=shell,
         check=False,  # We handle check ourselves
-        capture_output=True,
+        capture_output=capture_output,
         text=True
     )
+
+    stdout = result.stdout or ""
+    stderr = result.stderr or ""
     cmd_result = CommandResult(
-        stdout=result.stdout or "",
-        stderr=result.stderr or "",
+        stdout=stdout,
+        stderr=stderr,
         returncode=result.returncode,
         command=command
     )
@@
-    if result.stdout and result.stdout.strip():
+    if stdout.strip():
         if stdout_dest:
             logger.info(f"| <stdout saved into {stdout_dest}>")
         elif log_stdout:
-            logger.info(f"| <stdout> {result.stdout.strip()}")
+            logger.info(f"| <stdout> {stdout.strip()}")
         else:
             logger.info("| <stdout logging skipped>")
-    if result.stderr and result.stderr.strip():
+    if stderr.strip():
         if log_stderr:
-            logger.info(f"| <stderr> {result.stderr.strip()}")
+            logger.info(f"| <stderr> {stderr.strip()}")
         else:
             logger.info("| <stderr logging skipped>")
-    if not (result.stdout.strip() or result.stderr.strip()):
+    if not (stdout.strip() or stderr.strip()):
         logger.info("| <no output>")
```

If `stdout_dest` is the main large-output path, an even better follow-up is to stream directly to the file instead of capturing first.

Also applies to: 53-90
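The streaming follow-up can be sketched in a few lines. This is a hedged illustration, not the actual `shell.run()` implementation: the name `run_sketch` and its reduced signature are assumptions, and it only shows how to route stdout to a file handle instead of buffering it.

```python
import subprocess
from pathlib import Path
from typing import Optional, Union

def run_sketch(command: str,
               stdout_dest: Optional[Union[str, Path]] = None
               ) -> subprocess.CompletedProcess:
    # Stream stdout straight to disk when a destination is given, so large
    # outputs never sit in memory; otherwise capture both streams in RAM.
    if stdout_dest is not None:
        with open(stdout_dest, "w") as fh:
            return subprocess.run(command, shell=True, stdout=fh,
                                  stderr=subprocess.PIPE, text=True)
    return subprocess.run(command, shell=True, capture_output=True, text=True)

result = run_sketch("echo hello")
print(result.stdout.strip())  # hello
```

In the file-handle branch, `result.stdout` is `None` because the payload went to disk, which is the behavior the review comment asks for.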
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/core/dsl/shell.py` around lines 27 - 34, The run function currently forces capture_output=True which buffers all stdout/stderr in memory; change it to decide how to wire subprocess stdout/stderr based on the parameters: if stdout_dest is provided, open that file and pass the file handle as stdout to subprocess (and do not set capture_output), otherwise if log_stdout or log_stderr is true use subprocess.PIPE to capture those streams; ensure stderr is similarly routed (PIPE or file/None) rather than always capturing both. Update the logic in run (and the corresponding block around lines 53-90) to set stdout/stderr args dynamically and close any opened file handles; consider recommending a streaming path (directly writing to stdout_dest) as a follow-up for very large outputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@projects/core/dsl/runtime.py`:
- Around line 100-114: The first execution loop should skip tasks marked with
always_execute and only run non-always tasks: in the loop that pops from
file_tasks (using get_script_manager().get_tasks_from_file and variable
current_task_info), if current_task_info.always_execute is True then defer it
(e.g., append to a separate always_tasks list) instead of executing it; call
_execute_single_task(current_task_info, args, shared_context) only for
non-always tasks. On failure, the recovery loop must iterate over that
always_tasks list (not all remaining file_tasks) and call
_execute_single_task(task_info, args, shared_context) for each, ensuring the
loop variable is task_info (not the undefined name `task`) so no NameError
occurs.
In `@projects/core/dsl/script_manager.py`:
- Around line 41-43: The _task_results dict currently keys only by task_name
causing collisions across scripts; change the keying to a file-scoped composite
key (e.g. f"{source_file}:{task_name}" or a tuple (source_file, task_name))
wherever TaskResult entries are created, accessed, or deleted: update the
registration logic that populates self._task_results (refer to _task_registry
and the method that registers tasks), update lookups that read task
results/status (wherever _task_results[task_name] is used), and update
clear_tasks(source_file) to remove only entries whose composite key matches the
given source_file; ensure TaskResult creation, updates, and any
status.return_value references use the new composite key consistently to avoid
cross-file replacement.
---
Outside diff comments:
In `@projects/core/dsl/shell.py`:
- Around line 27-34: The run function currently forces capture_output=True which
buffers all stdout/stderr in memory; change it to decide how to wire subprocess
stdout/stderr based on the parameters: if stdout_dest is provided, open that
file and pass the file handle as stdout to subprocess (and do not set
capture_output), otherwise if log_stdout or log_stderr is true use
subprocess.PIPE to capture those streams; ensure stderr is similarly routed
(PIPE or file/None) rather than always capturing both. Update the logic in run
(and the corresponding block around lines 53-90) to set stdout/stderr args
dynamically and close any opened file handles; consider recommending a streaming
path (directly writing to stdout_dest) as a follow-up for very large outputs.
In `@projects/core/dsl/task.py`:
- Around line 180-184: task registration uses
os.path.relpath(definition_filename) which doesn't match execute_tasks() lookup
(Path(filename).relative_to(env.FORGE_HOME)), causing missing-task errors;
change task() to normalize the key using
Path(definition_filename).relative_to(env.FORGE_HOME) (wrapped in try/except to
fall back to Path(definition_filename).resolve().relative_to(Path.cwd()) or the
original string if relative_to fails) and convert to str before storing, and
make the same change for the second occurrence around lines 223-224; also add
"from pathlib import Path" to the imports so Path is available.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5bf2d60a-9767-484a-b5a3-0cfbca2ebe84
📒 Files selected for processing (8)
- projects/core/dsl/__init__.py
- projects/core/dsl/log.py
- projects/core/dsl/runner.py
- projects/core/dsl/runtime.py
- projects/core/dsl/script_manager.py
- projects/core/dsl/shell.py
- projects/core/dsl/task.py
- projects/core/dsl/toolbox.py
```diff
     # Execute tasks only from the calling file
     script_manager = get_script_manager()
     file_tasks = list(script_manager.get_tasks_from_file(rel_filename))

     if not file_tasks:
         logger.error(f"No tasks found for file: {rel_filename}")
         log_completion_banner(function_args, status="NO_TASKS")
         raise RuntimeError(f"No tasks found for file: {rel_filename}")

     execution_error = None

     # Execute normal tasks
     try:
-        for task_info in normal_tasks:
-            _execute_single_task(task_info, args, shared_context)
+        while file_tasks:
+            current_task_info = file_tasks.pop(0)
+            _execute_single_task(current_task_info, args, shared_context)
```
@always handling is broken in both execution phases.
The first loop never checks `always_execute`, so `@always` tasks run like normal tasks. After a failure, the second loop then runs every remaining task as if it were `@always`, and line 139 calls `_execute_single_task(task_info, ...)` with an undefined name. A normal task failure will either execute non-`@always` tasks unexpectedly or crash with `NameError`.
🛠️ Suggested structure
```diff
-    file_tasks = list(script_manager.get_tasks_from_file(rel_filename))
-
-    if not file_tasks:
+    registered_tasks = list(script_manager.get_tasks_from_file(rel_filename))
+
+    if not registered_tasks:
         logger.error(f"No tasks found for file: {rel_filename}")
         log_completion_banner(function_args, status="NO_TASKS")
         raise RuntimeError(f"No tasks found for file: {rel_filename}")
+
+    file_tasks = [task for task in registered_tasks if not task.get('always_execute')]
+    always_tasks = [task for task in registered_tasks if task.get('always_execute')]
@@
-    if file_tasks:
-        logging.warning("Executing the `@always` tasks ...")
-        while file_tasks:
+    if always_tasks:
+        logger.warning("Executing the `@always` tasks ...")
+        while always_tasks:
             try:
-                current_task_info = file_tasks.pop(0)
-                _execute_single_task(task_info, args, shared_context)
+                current_task_info = always_tasks.pop(0)
+                _execute_single_task(current_task_info, args, shared_context)
```

Also applies to: 133-140
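The suggested control flow can be condensed into a standalone sketch. The helper name `execute_with_always` and the plain-dict task shape are assumptions for illustration; the real code goes through `_execute_single_task` and the script manager.

```python
def execute_with_always(tasks, execute):
    # Partition registered tasks into the normal phase and the @always phase.
    normal = [t for t in tasks if not t.get("always_execute")]
    always = [t for t in tasks if t.get("always_execute")]
    error = None
    for task_info in normal:
        try:
            execute(task_info)
        except Exception as exc:
            error = exc          # stop the normal phase on first failure
            break
    for task_info in always:     # @always tasks run even after a failure
        try:
            execute(task_info)
        except Exception:
            pass                 # an @always failure must not mask the original error
    if error is not None:
        raise error

ran = []
tasks = [
    {"name": "build"},
    {"name": "deploy"},
    {"name": "cleanup", "always_execute": True},
]
execute_with_always(tasks, lambda t: ran.append(t["name"]))
print(ran)  # ['build', 'deploy', 'cleanup']
```

If `build` raised instead, `deploy` would be skipped, `cleanup` would still run, and the original exception would propagate afterwards, which is the behavior the review asks for.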
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/core/dsl/runtime.py` around lines 100 - 114, The first execution
loop should skip tasks marked with always_execute and only run non-always tasks:
in the loop that pops from file_tasks (using
get_script_manager().get_tasks_from_file and variable current_task_info), if
current_task_info.always_execute is True then defer it (e.g., append to a
separate always_tasks list) instead of executing it; call
_execute_single_task(current_task_info, args, shared_context) only for
non-always tasks. On failure, the recovery loop must iterate over that
always_tasks list (not all remaining file_tasks) and call
_execute_single_task(task_info, args, shared_context) for each, ensuring the
loop variable is task_info (not the undefined name `task`) so no NameError
occurs.
```python
self._task_registry: Dict[str, List[dict]] = {}
# Task results organized by task name
self._task_results: Dict[str, TaskResult] = {}
```
Task results need a file-scoped key, not just task_name.
`_task_results[task_name]` collides as soon as two scripts define the same task function name. The later registration replaces the earlier `TaskResult`, so `some_task.status.return_value` can point at the wrong execution, and `clear_tasks(source_file)` can delete another file's status object.
Also applies to: 45-60, 81-107
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/core/dsl/script_manager.py` around lines 41 - 43, The _task_results
dict currently keys only by task_name causing collisions across scripts; change
the keying to a file-scoped composite key (e.g. f"{source_file}:{task_name}" or
a tuple (source_file, task_name)) wherever TaskResult entries are created,
accessed, or deleted: update the registration logic that populates
self._task_results (refer to _task_registry and the method that registers
tasks), update lookups that read task results/status (wherever
_task_results[task_name] is used), and update clear_tasks(source_file) to remove
only entries whose composite key matches the given source_file; ensure
TaskResult creation, updates, and any status.return_value references use the new
composite key consistently to avoid cross-file replacement.
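A minimal sketch of the composite-key idea follows; the class and method names are illustrative, not the actual `ScriptManager` API.

```python
class TaskResultRegistry:
    # Results are keyed by (source_file, task_name), so two scripts that
    # define a task with the same function name never overwrite each other.
    def __init__(self):
        self._results = {}

    def register(self, source_file, task_name, result):
        self._results[(source_file, task_name)] = result

    def get(self, source_file, task_name):
        return self._results[(source_file, task_name)]

    def clear_tasks(self, source_file):
        # Remove only the entries belonging to the given file.
        for key in [k for k in self._results if k[0] == source_file]:
            del self._results[key]

reg = TaskResultRegistry()
reg.register("a.py", "build", "result-a")
reg.register("b.py", "build", "result-b")  # no collision with a.py's "build"
reg.clear_tasks("a.py")
print(reg.get("b.py", "build"))  # result-b
```

A tuple key avoids the separator-ambiguity an `f"{source_file}:{task_name}"` string key would have when file paths themselves contain colons.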
merging this before working on
Summary by CodeRabbit
Release Notes
Bug Fixes
Refactor