
[core] Improve the toolbox & dsl #29

Merged
kpouget merged 3 commits into openshift-psap:main from kpouget:core
Apr 16, 2026

Conversation

@kpouget
Contributor

@kpouget kpouget commented Apr 15, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Task execution now fails immediately when no tasks are found for a file, providing faster feedback.
    • Shell command output is now consistently captured and written to specified destinations.
  • Refactor

    • Improved task and execution state management through centralized registry system.
    • Updated logging configuration for clearer, consistent output formatting across DSL components.

@openshift-ci

openshift-ci bot commented Apr 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign avasilevskii for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai bot commented Apr 15, 2026

📝 Walkthrough

Walkthrough

This change refactors the DSL task management system from relying on global mutable state to using a centralized ScriptManager class. Task registration, execution tracking, and results are now managed through a manager instance rather than module-level registries. Logging is standardized across DSL modules to use a dedicated 'DSL' logger.

Changes

  • Script Manager Introduction (projects/core/dsl/script_manager.py)
    New module introducing TaskResult and ScriptManager classes to manage task registry and execution state per source file, replacing global mutable structures. Provides get_script_manager() and reset_script_manager() for centralized access.
  • Task Management Refactoring (projects/core/dsl/task.py, projects/core/dsl/runtime.py)
    Migrated from global registries to ScriptManager-based task tracking. The task decorator now calls script_manager.register_task() and retrieves result containers from the manager. Runtime execution changed to fetch tasks via script_manager.get_tasks_from_file() and handle "always" tasks within the same execution flow.
  • Logging Standardization (projects/core/dsl/log.py, projects/core/dsl/runner.py, projects/core/dsl/shell.py, projects/core/dsl/toolbox.py)
    Updated logger initialization across modules to use the 'DSL' logger name with propagation disabled. Added a setup_clean_logger() helper in log.py for clean, message-only formatting.
  • API Exposure and Shell Updates (projects/core/dsl/__init__.py, projects/core/dsl/shell.py)
    Updated __all__ to export get_script_manager, reset_script_manager, and toolbox. Removed the capture_output parameter from shell.run() (it always captures); adjusted stdout writing logic to write whenever a destination is provided, regardless of content.
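The manager pattern described in the walkthrough can be sketched in miniature. The class and function names (ScriptManager, TaskResult, get_script_manager, reset_script_manager, register_task, get_tasks_from_file, clear_tasks) follow the summary above; the field names, signatures, and the file-scoped result keys are assumptions, not the PR's actual code:

```python
# Hypothetical sketch of the ScriptManager pattern; signatures and fields
# are assumptions based on the walkthrough, not the project's real code.
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class TaskResult:
    """Holds the outcome of one task execution."""
    return_value: Any = None
    succeeded: Optional[bool] = None


class ScriptManager:
    """Central registry replacing module-level mutable state."""

    def __init__(self) -> None:
        # Tasks organized per source file
        self._task_registry: Dict[str, List[dict]] = {}
        # Results keyed by (source_file, task_name) to avoid collisions
        self._task_results: Dict[Tuple[str, str], TaskResult] = {}

    def register_task(self, task_info: dict, source_file: str) -> TaskResult:
        self._task_registry.setdefault(source_file, []).append(task_info)
        result = TaskResult()
        self._task_results[(source_file, task_info["name"])] = result
        return result

    def get_tasks_from_file(self, source_file: str) -> List[dict]:
        return list(self._task_registry.get(source_file, []))

    def clear_tasks(self, source_file: str) -> None:
        self._task_registry.pop(source_file, None)
        for key in [k for k in self._task_results if k[0] == source_file]:
            del self._task_results[key]


_manager: Optional[ScriptManager] = None


def get_script_manager() -> ScriptManager:
    """Return the process-wide manager, creating it lazily."""
    global _manager
    if _manager is None:
        _manager = ScriptManager()
    return _manager


def reset_script_manager() -> None:
    """Drop the current manager so the next access starts fresh."""
    global _manager
    _manager = None
```

A reset-then-get pair gives tests a clean registry without touching module globals directly.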

Sequence Diagram

sequenceDiagram
    participant TaskDec as Task Decorator
    participant ScriptMgr as ScriptManager
    participant Runtime as Runtime Executor
    participant Shell as Shell Executor
    participant Logger as DSL Logger

    TaskDec->>ScriptMgr: register_task(task_info, source_file)
    ScriptMgr->>ScriptMgr: Store task in _task_registry
    ScriptMgr->>ScriptMgr: Create TaskResult for task
    TaskDec->>TaskDec: Assign result container to wrapper.status

    Runtime->>ScriptMgr: get_tasks_from_file(rel_filename)
    ScriptMgr-->>Runtime: Return list of tasks for file
    
    Runtime->>Runtime: Execute normal phase tasks
    Runtime->>Shell: Execute each task
    Shell->>Shell: Run subprocess with capture_output=True
    Shell->>Logger: Log output if enabled
    Shell-->>Runtime: Return execution result
    
    Runtime->>ScriptMgr: get_task_result(task_name)
    ScriptMgr-->>Runtime: Return TaskResult
    Runtime->>ScriptMgr: _set_result(result)
    
    Runtime->>Runtime: Execute always phase tasks
    Runtime->>Shell: Execute each always task
    Shell-->>Runtime: Return execution result
    Runtime->>ScriptMgr: _set_result(result)
    
    Runtime->>ScriptMgr: clear_tasks(file_path)
    ScriptMgr->>ScriptMgr: Clear tasks and results for file

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


A hop, a skip, a manager's grace,
No globals now in this new place, 🐰
Tasks are tracked with DSL's name,
Registry refactored, logs aflame,
Clean and bright, a scripted dream!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name | Status | Explanation | Resolution
Title check | ❓ Inconclusive | The title is vague and generic, using the non-descriptive term 'Improve', which doesn't convey the specific nature of the substantial architectural changes made to the DSL system. | Consider a more specific title that reflects the main change, such as '[core] Refactor DSL task management to use centralized ScriptManager' or '[core] Move DSL task registry from global state to ScriptManager'.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage | ✅ Passed | Docstring coverage is 85.19%, which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
projects/core/dsl/task.py (1)

180-184: ⚠️ Potential issue | 🔴 Critical

Register tasks with the same file key that runtime uses.

task() stores tasks under os.path.relpath(definition_filename), but execute_tasks() looks them up with Path(filename).relative_to(env.FORGE_HOME) in projects/core/dsl/runtime.py Lines 69-73. When the process CWD is not FORGE_HOME, those keys diverge and Line 105 in projects/core/dsl/runtime.py raises No tasks found for file ....

📍 Suggested normalization
-    # Make filename relative to current working directory
-    try:
-        rel_definition_filename = os.path.relpath(definition_filename)
-    except ValueError:
-        rel_definition_filename = definition_filename
+    # Make filename relative to FORGE_HOME to match runtime lookup
+    try:
+        rel_definition_filename = str(Path(definition_filename).relative_to(env.FORGE_HOME))
+    except ValueError:
+        rel_definition_filename = definition_filename

Also add from pathlib import Path near the imports.

Also applies to: 223-224
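The mismatch can be illustrated in a self-contained way: os.path.relpath is anchored at the process CWD, while Path.relative_to(FORGE_HOME) is anchored at the repo root, so the two keys disagree whenever CWD differs from FORGE_HOME. The FORGE_HOME value and the path below are hypothetical stand-ins, not the project's real layout:

```python
# Illustration of the key divergence described above; FORGE_HOME and the
# file path are hypothetical stand-ins.
import os
from pathlib import Path

FORGE_HOME = Path("/opt/forge")
definition_filename = FORGE_HOME / "projects/core/run.py"

# Runtime lookup key: stable, anchored at FORGE_HOME.
runtime_key = str(definition_filename.relative_to(FORGE_HOME))

# Registration key: depends on wherever the process happens to run
# (here pinned to /tmp to make the divergence deterministic).
registration_key = os.path.relpath(definition_filename, start="/tmp")

assert runtime_key == "projects/core/run.py"
assert registration_key != runtime_key  # lookup fails: "No tasks found"
```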

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/dsl/task.py` around lines 180 - 184, task registration uses
os.path.relpath(definition_filename) which doesn't match execute_tasks() lookup
(Path(filename).relative_to(env.FORGE_HOME)), causing missing-task errors;
change task() to normalize the key using
Path(definition_filename).relative_to(env.FORGE_HOME) (wrapped in try/except to
fall back to Path(definition_filename).resolve().relative_to(Path.cwd()) or the
original string if relative_to fails) and convert to str before storing, and
make the same change for the second occurrence around lines 223-224; also add
"from pathlib import Path" to the imports so Path is available.
projects/core/dsl/shell.py (1)

27-34: ⚠️ Potential issue | 🟠 Major

Don't force every command through in-memory capture.

Hardcoding capture_output=True means every command now buffers stdout/stderr in RAM, including the stdout_dest case where the payload is written to disk afterward. That removes the previous escape hatch for large or streaming commands and can turn artifact captures into a memory spike.

♻️ Suggested fix
 def run(
         command: str,
         check: bool = True,
+        capture_output: bool = True,
         shell: bool = True,
         stdout_dest: Optional[Union[str, Path]] = None,
         log_stdout: bool = True,
         log_stderr: bool = True,
 ) -> CommandResult:
@@
         result = subprocess.run(
             command,
             shell=shell,
             check=False,  # We handle check ourselves
-            capture_output=True,
+            capture_output=capture_output,
             text=True
         )
+
+        stdout = result.stdout or ""
+        stderr = result.stderr or ""
 
         cmd_result = CommandResult(
-            stdout=result.stdout or "",
-            stderr=result.stderr or "",
+            stdout=stdout,
+            stderr=stderr,
             returncode=result.returncode,
             command=command
         )
@@
-        if result.stdout and result.stdout.strip():
+        if stdout.strip():
             if stdout_dest:
                 logger.info(f"| <stdout saved into {stdout_dest}>")
             elif log_stdout:
-                logger.info(f"| <stdout> {result.stdout.strip()}")
+                logger.info(f"| <stdout> {stdout.strip()}")
             else:
                 logger.info("| <stdout logging skipped>")
 
-        if result.stderr and result.stderr.strip():
+        if stderr.strip():
             if log_stderr:
-                logger.info(f"| <stderr> {result.stderr.strip()}")
+                logger.info(f"| <stderr> {stderr.strip()}")
             else:
                 logger.info("| <stderr logging skipped>")
 
-        if not (result.stdout.strip() or result.stderr.strip()):
+        if not (stdout.strip() or stderr.strip()):
             logger.info("| <no output>")

If stdout_dest is the main large-output path, an even better follow-up is to stream directly to the file instead of capturing first.

Also applies to: 53-90
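The streaming follow-up mentioned above could look roughly like this: when a destination is given, hand the subprocess an open file handle so nothing accumulates in RAM. run_streaming is a hypothetical helper sketched for illustration, not the project's shell.run():

```python
# Sketch of streaming stdout straight to disk instead of buffering it;
# run_streaming and its signature are assumptions, not the real API.
import subprocess
from pathlib import Path
from typing import Optional, Union


def run_streaming(command: str,
                  stdout_dest: Optional[Union[str, Path]] = None) -> int:
    if stdout_dest:
        # Stream directly to the file; the payload never sits in memory.
        with open(stdout_dest, "w") as dest:
            proc = subprocess.run(command, shell=True, stdout=dest,
                                  stderr=subprocess.PIPE, text=True)
    else:
        # Small or log-worthy output: capture both streams in memory.
        proc = subprocess.run(command, shell=True, capture_output=True,
                              text=True)
    return proc.returncode
```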

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/dsl/shell.py` around lines 27 - 34, The run function currently
forces capture_output=True which buffers all stdout/stderr in memory; change it
to decide how to wire subprocess stdout/stderr based on the parameters: if
stdout_dest is provided, open that file and pass the file handle as stdout to
subprocess (and do not set capture_output), otherwise if log_stdout or
log_stderr is true use subprocess.PIPE to capture those streams; ensure stderr
is similarly routed (PIPE or file/None) rather than always capturing both.
Update the logic in run (and the corresponding block around lines 53-90) to set
stdout/stderr args dynamically and close any opened file handles; consider
recommending a streaming path (directly writing to stdout_dest) as a follow-up
for very large outputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@projects/core/dsl/runtime.py`:
- Around line 100-114: The first execution loop should skip tasks marked with
always_execute and only run non-always tasks: in the loop that pops from
file_tasks (using get_script_manager().get_tasks_from_file and variable
current_task_info), if current_task_info.always_execute is True then defer it
(e.g., append to a separate always_tasks list) instead of executing it; call
_execute_single_task(current_task_info, args, shared_context) only for
non-always tasks. On failure, the recovery loop must iterate over that
always_tasks list (not all remaining file_tasks) and call
_execute_single_task(task_info, args, shared_context) for each, ensuring the
loop variable is task_info (not the undefined name `task`) so no NameError
occurs.

In `@projects/core/dsl/script_manager.py`:
- Around line 41-43: The _task_results dict currently keys only by task_name
causing collisions across scripts; change the keying to a file-scoped composite
key (e.g. f"{source_file}:{task_name}" or a tuple (source_file, task_name))
wherever TaskResult entries are created, accessed, or deleted: update the
registration logic that populates self._task_results (refer to _task_registry
and the method that registers tasks), update lookups that read task
results/status (wherever _task_results[task_name] is used), and update
clear_tasks(source_file) to remove only entries whose composite key matches the
given source_file; ensure TaskResult creation, updates, and any
status.return_value references use the new composite key consistently to avoid
cross-file replacement.

---

Outside diff comments:
In `@projects/core/dsl/shell.py`:
- Around line 27-34: The run function currently forces capture_output=True which
buffers all stdout/stderr in memory; change it to decide how to wire subprocess
stdout/stderr based on the parameters: if stdout_dest is provided, open that
file and pass the file handle as stdout to subprocess (and do not set
capture_output), otherwise if log_stdout or log_stderr is true use
subprocess.PIPE to capture those streams; ensure stderr is similarly routed
(PIPE or file/None) rather than always capturing both. Update the logic in run
(and the corresponding block around lines 53-90) to set stdout/stderr args
dynamically and close any opened file handles; consider recommending a streaming
path (directly writing to stdout_dest) as a follow-up for very large outputs.

In `@projects/core/dsl/task.py`:
- Around line 180-184: task registration uses
os.path.relpath(definition_filename) which doesn't match execute_tasks() lookup
(Path(filename).relative_to(env.FORGE_HOME)), causing missing-task errors;
change task() to normalize the key using
Path(definition_filename).relative_to(env.FORGE_HOME) (wrapped in try/except to
fall back to Path(definition_filename).resolve().relative_to(Path.cwd()) or the
original string if relative_to fails) and convert to str before storing, and
make the same change for the second occurrence around lines 223-224; also add
"from pathlib import Path" to the imports so Path is available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5bf2d60a-9767-484a-b5a3-0cfbca2ebe84

📥 Commits

Reviewing files that changed from the base of the PR and between 9026197 and f3eadfc.

📒 Files selected for processing (8)
  • projects/core/dsl/__init__.py
  • projects/core/dsl/log.py
  • projects/core/dsl/runner.py
  • projects/core/dsl/runtime.py
  • projects/core/dsl/script_manager.py
  • projects/core/dsl/shell.py
  • projects/core/dsl/task.py
  • projects/core/dsl/toolbox.py

Comment on lines +100 to +114
            # Execute tasks only from the calling file
            script_manager = get_script_manager()
            file_tasks = list(script_manager.get_tasks_from_file(rel_filename))

            if not file_tasks:
                logger.error(f"No tasks found for file: {rel_filename}")
                log_completion_banner(function_args, status="NO_TASKS")
                raise RuntimeError(f"No tasks found for file: {rel_filename}")

            execution_error = None

            # Execute normal tasks
            try:
                for task_info in normal_tasks:
                    _execute_single_task(task_info, args, shared_context)
                while file_tasks:
                    current_task_info = file_tasks.pop(0)
                    _execute_single_task(current_task_info, args, shared_context)

⚠️ Potential issue | 🔴 Critical

@always handling is broken in both execution phases.

The first loop never checks always_execute, so @always tasks run like normal tasks. After a failure, the second loop then runs every remaining task as if it were @always, and Line 139 calls _execute_single_task(task_info, ...) with an undefined name. A normal task failure will either execute non-@always tasks unexpectedly or crash with NameError.

🛠️ Suggested structure
-            file_tasks = list(script_manager.get_tasks_from_file(rel_filename))
-
-            if not file_tasks:
+            registered_tasks = list(script_manager.get_tasks_from_file(rel_filename))
+
+            if not registered_tasks:
                 logger.error(f"No tasks found for file: {rel_filename}")
                 log_completion_banner(function_args, status="NO_TASKS")
                 raise RuntimeError(f"No tasks found for file: {rel_filename}")
+
+            file_tasks = [task for task in registered_tasks if not task.get('always_execute')]
+            always_tasks = [task for task in registered_tasks if task.get('always_execute')]
@@
-            if file_tasks:
-                logging.warning("Executing the `@always` tasks ...")
-                while file_tasks:
+            if always_tasks:
+                logger.warning("Executing the `@always` tasks ...")
+                while always_tasks:
                     try:
-                        current_task_info = file_tasks.pop(0)
-                        _execute_single_task(task_info, args, shared_context)
+                        current_task_info = always_tasks.pop(0)
+                        _execute_single_task(current_task_info, args, shared_context)

Also applies to: 133-140
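The control flow the suggested diff aims for can be sketched independently of the runtime module. Here execute is a stand-in for _execute_single_task, and the task dicts carry the always_execute flag from the diff; everything else is an assumption for illustration:

```python
# Sketch of the normal/@always split; `execute` stands in for the real
# _execute_single_task, and the task-dict shape is an assumption.
from typing import Callable, List


def execute_tasks(registered_tasks: List[dict],
                  execute: Callable[[dict], None]) -> None:
    normal_tasks = [t for t in registered_tasks
                    if not t.get("always_execute")]
    always_tasks = [t for t in registered_tasks
                    if t.get("always_execute")]

    execution_error = None
    try:
        for task_info in normal_tasks:
            execute(task_info)
    except Exception as exc:
        execution_error = exc

    # @always tasks run whether or not a normal task failed.
    for task_info in always_tasks:
        try:
            execute(task_info)
        except Exception:
            pass  # a failing @always task must not mask the original error

    if execution_error:
        raise execution_error
```

Partitioning up front keeps the failure path from accidentally running non-@always tasks, and using the loop variable consistently avoids the NameError flagged above.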

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/dsl/runtime.py` around lines 100 - 114, The first execution
loop should skip tasks marked with always_execute and only run non-always tasks:
in the loop that pops from file_tasks (using
get_script_manager().get_tasks_from_file and variable current_task_info), if
current_task_info.always_execute is True then defer it (e.g., append to a
separate always_tasks list) instead of executing it; call
_execute_single_task(current_task_info, args, shared_context) only for
non-always tasks. On failure, the recovery loop must iterate over that
always_tasks list (not all remaining file_tasks) and call
_execute_single_task(task_info, args, shared_context) for each, ensuring the
loop variable is task_info (not the undefined name `task`) so no NameError
occurs.

Comment on lines +41 to +43
        self._task_registry: Dict[str, List[dict]] = {}
        # Task results organized by task name
        self._task_results: Dict[str, TaskResult] = {}

⚠️ Potential issue | 🟠 Major

Task results need a file-scoped key, not just task_name.

_task_results[task_name] collides as soon as two scripts define the same task function name. The later registration replaces the earlier TaskResult, so some_task.status.return_value can point at the wrong execution, and clear_tasks(source_file) can delete another file's status object.

Also applies to: 45-60, 81-107
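A tiny demonstration of the collision (the script names here are hypothetical):

```python
# Keying results by task name alone lets a later registration silently
# clobber an earlier one; file-scoped composite keys keep them distinct.
flat = {}
flat["deploy"] = "result-from-script_a"
flat["deploy"] = "result-from-script_b"  # script_a's result is gone
assert len(flat) == 1

scoped = {}
scoped[("script_a.py", "deploy")] = "result-from-script_a"
scoped[("script_b.py", "deploy")] = "result-from-script_b"
assert len(scoped) == 2  # both executions remain addressable
```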

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/dsl/script_manager.py` around lines 41 - 43, The _task_results
dict currently keys only by task_name causing collisions across scripts; change
the keying to a file-scoped composite key (e.g. f"{source_file}:{task_name}" or
a tuple (source_file, task_name)) wherever TaskResult entries are created,
accessed, or deleted: update the registration logic that populates
self._task_results (refer to _task_registry and the method that registers
tasks), update lookups that read task results/status (wherever
_task_results[task_name] is used), and update clear_tasks(source_file) to remove
only entries whose composite key matches the given source_file; ensure
TaskResult creation, updates, and any status.return_value references use the new
composite key consistently to avoid cross-file replacement.

@kpouget
Contributor Author

kpouget commented Apr 16, 2026

merging this before working on ruff
I'll finalize this in a subsequent PR

@kpouget kpouget merged commit 53eb5a2 into openshift-psap:main Apr 16, 2026
3 of 4 checks passed
@kpouget kpouget deleted the core branch April 16, 2026 07:52
