Skip to content

Conversation

@mashraf-222
Copy link
Contributor

@mashraf-222 mashraf-222 commented Nov 28, 2025

PR Type

Bug fix, Enhancement


Description

  • Normalize paths for Windows compatibility

  • Use worktree paths in LSP initialization

  • Robust relative path computation with fallbacks

  • Case-insensitive filtering of files


Diagram Walkthrough

flowchart LR
  A["Discovery: filter_functions"] -- "normcase comparisons" --> B["Accurate test/ignore/submodule filtering"]
  C["LSP: initialize_function_optimization"] -- "use worktree paths" --> D["Correct files_inside_context on Windows"]
  E["Optimizer: mirror_path"] -- "resolve + normalized fallback" --> F["Robust relative path mapping"]
Loading

File Walkthrough

Relevant files
Bug fix
functions_to_optimize.py
Case-insensitive path filtering for Windows                           

codeflash/discovery/functions_to_optimize.py

  • Normalize roots and file paths with normcase.
  • Case-insensitive startswith checks for tests, ignore, submodules.
  • Ensure module root filtering uses normalized paths.
+8/-4     
beta.py
Use resolved worktree paths in LSP init                                   

codeflash/lsp/beta.py

  • Use worktree file path instead of document.path.
  • Resolve helper paths before returning file list.
  • Prevent malformed Windows paths (missing drive letter).
+5/-2     
Enhancement
optimizer.py
Robust mirror_path with Windows-safe normalization             

codeflash/optimization/optimizer.py

  • Resolve paths and normalize case before relative_to.
  • Add robust fallback when relative_to fails.
  • Preserve original formatting when extracting relative path.
+42/-1   

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Logic Change

The path checks now require a trailing separator for startswith comparisons. Verify behavior for files exactly equal to tests_root, ignore_path, submodule_path, and module_root to ensure they are still classified correctly.

if file_path_normalized.startswith(tests_root_normalized + os.sep):
    test_functions_removed_count += len(_functions)
    continue
if file_path in ignore_paths or any(
    file_path_normalized.startswith(os.path.normcase(str(ignore_path)) + os.sep) for ignore_path in ignore_paths
):
    ignore_paths_removed_count += 1
    continue
if file_path in submodule_paths or any(
    file_path_normalized.startswith(os.path.normcase(str(submodule_path)) + os.sep) for submodule_path in submodule_paths
):
    submodule_ignored_paths_count += 1
    continue
if path_belongs_to_site_packages(Path(file_path)):
    site_packages_removed_count += len(_functions)
    continue
if not file_path_normalized.startswith(module_root_normalized + os.sep):
    non_modules_removed_count += len(_functions)
    continue
Path Source

Switching from document.path to server.optimizer.args.file.resolve() changes which file is reported in files_inside_context. Confirm this aligns with client expectations and multi-file buffers, and that helpers are resolved to existing paths on all platforms.

# Use the worktree file path (which is normalized) instead of document.path
# document.path might be malformed on Windows (missing drive letter)
worktree_file_path = str(server.optimizer.args.file.resolve())
files = [worktree_file_path]

_, _, original_helpers = server.current_optimization_init_result
files.extend([str(helper_path.resolve()) for helper_path in original_helpers])

return {"functionName": params.functionName, "status": "success", "files_inside_context": files}
Error Handling

mirror_path now raises ValueError when paths are not under src_root. Ensure callers handle this exception; consider logging context or returning the original path when mirroring is impossible if that is acceptable.

# Resolve both paths to absolute paths to handle Windows path normalization issues
# This ensures paths with/without drive letters are handled correctly
try:
    path_resolved = path.resolve()
except (OSError, RuntimeError):
    # If resolve fails (e.g., path doesn't exist or is malformed), try absolute()
    path_resolved = path.absolute() if not path.is_absolute() else path

try:
    src_root_resolved = src_root.resolve()
except (OSError, RuntimeError):
    src_root_resolved = src_root.absolute() if not src_root.is_absolute() else src_root

# Normalize paths using os.path.normpath and normcase for cross-platform compatibility
path_str = os.path.normpath(str(path_resolved))
src_root_str = os.path.normpath(str(src_root_resolved))

# Convert to lowercase for case-insensitive comparison on Windows
path_normalized = os.path.normcase(path_str)
src_root_normalized = os.path.normcase(src_root_str)

# Try using Path.relative_to with resolved paths first
try:
    relative_path = path_resolved.relative_to(src_root_resolved)
except ValueError:
    # If relative_to fails, manually extract the relative path using normalized strings
    if path_normalized.startswith(src_root_normalized):
        # Extract relative path manually
        # Use the original path_str to preserve proper path format
        if path_str.startswith(src_root_str):
            relative_str = path_str[len(src_root_str) :].lstrip(os.sep)
            relative_path = Path(relative_str)
        else:
            # Fallback: use normalized paths
            relative_str = path_normalized[len(src_root_normalized) :].lstrip(os.sep)
            relative_path = Path(relative_str)
    else:
        raise ValueError(
            f"Path {path_resolved} (normalized: {path_normalized}) is not relative to "
            f"{src_root_resolved} (normalized: {src_root_normalized})"
        )

return dest_root / relative_path

@github-actions
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent prefix false positives

Guard against missing or extra separators when slicing; without ensuring a trailing
separator on the root, parents like "C:\proj" will match "C:\projectX". Append
os.sep to the normalized roots before startswith/slicing.

codeflash/optimization/optimizer.py [514-528]

-if path_normalized.startswith(src_root_normalized):
-    # Extract relative path manually
-    # Use the original path_str to preserve proper path format
-    if path_str.startswith(src_root_str):
-        relative_str = path_str[len(src_root_str) :].lstrip(os.sep)
+root_norm_with_sep = src_root_normalized.rstrip(os.sep) + os.sep
+path_norm_with_sep = path_normalized
+if path_norm_with_sep.startswith(root_norm_with_sep):
+    if path_str.startswith(src_root_str.rstrip(os.sep) + os.sep):
+        relative_str = path_str[len(src_root_str.rstrip(os.sep) + os.sep) :]
         relative_path = Path(relative_str)
     else:
-        # Fallback: use normalized paths
-        relative_str = path_normalized[len(src_root_normalized) :].lstrip(os.sep)
+        relative_str = path_norm_with_sep[len(root_norm_with_sep) :]
         relative_path = Path(relative_str)
Suggestion importance[1-10]: 8

__

Why: Ensuring a trailing separator when performing prefix checks avoids false positives like 'C:\proj' vs 'C:\projectX'; this is a precise fix in the new manual relative-path logic and meaningfully hardens path handling.

Medium
Normalize path separators too

Normalize separators as well to avoid false negatives when comparing paths that may
contain mixed slashes on Windows. Use os.path.normpath before os.path.normcase for
every path string involved in startswith checks.

codeflash/discovery/functions_to_optimize.py [675-684]

-tests_root_normalized = os.path.normcase(tests_root_str)
-module_root_normalized = os.path.normcase(module_root_str)
+tests_root_normalized = os.path.normcase(os.path.normpath(tests_root_str))
+module_root_normalized = os.path.normcase(os.path.normpath(module_root_str))
 ...
-file_path_normalized = os.path.normcase(file_path)
+file_path_normalized = os.path.normcase(os.path.normpath(file_path))
 if file_path_normalized.startswith(tests_root_normalized + os.sep):
Suggestion importance[1-10]: 7

__

Why: Using os.path.normpath before normcase is a correct, context-aware improvement that prevents mixed-separator false negatives on Windows; it directly applies to the new normalization and startswith checks around lines 675-684.

Medium
Fully normalize ignore path prefixes

Ensure every candidate prefix uses both normpath and normcase; otherwise mixed
separators in ignore_paths can break the prefix check. Apply the same normalization
used for file_path_normalized.

codeflash/discovery/functions_to_optimize.py [686-688]

 if file_path in ignore_paths or any(
-    file_path_normalized.startswith(os.path.normcase(str(ignore_path)) + os.sep) for ignore_path in ignore_paths
+    file_path_normalized.startswith(os.path.normcase(os.path.normpath(str(ignore_path))) + os.sep)
+    for ignore_path in ignore_paths
 ):
Suggestion importance[1-10]: 7

__

Why: Extending normalization to the ignore_paths prefixes aligns with the newly added file_path normalization and reduces Windows path mismatch risks, improving correctness with minimal change.

Medium

@mashraf-222 mashraf-222 requested a review from KRRT7 December 1, 2025 18:59
@KRRT7
Copy link
Contributor

KRRT7 commented Dec 3, 2025

@mashraf-222 tests are failing

@KRRT7
Copy link
Contributor

KRRT7 commented Dec 8, 2025

this feels extremely complicated for what should be a simple fix, can you fix the merge conflicts and I'll review it again.

# Attempt to remove worktree using git command with retries
for attempt in range(max_retries):
try:
attempt_num = attempt + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Not sure if this will execute successfully on win for multiple retries. If this then as discussed , this might also require some time delay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the loop starts with for attempt in range(max_retries):, which tries up to 3 times for windows, each iteration calls repository.git.worktree("remove", "--force", str(worktree_dir)). If it succeeds, it returns immediately.

If it fails, the code catches git.exc.GitCommandError and checks if the error message contains "permission denied" or "access is denied" to determine if it's a permission error. The condition if is_permission_error and attempt < max_retries - 1: decides whether to retry: it must be a permission error and not the last attempt.

When retrying, it logs the retry, then calls time.sleep(retry_delay), which starts at 0.5 seconds and doubles each time (0.5s then 1.0s then 2.0s). This gives Windows time to release file locks. After the sleep, retry_delay *= 2 increases the delay for the next retry, and the loop continues.

@mashraf-222
Copy link
Contributor Author

this feels extremely complicated for what should be a simple fix, can you fix the merge conflicts and I'll review it again.

I fixed the conflicts and the workflow tests, it looks extremely complicated because this PR is fixing lots of bugs for windows vsc.

@mashraf-222
Copy link
Contributor Author

Windows Compatibility Fixes for CodeFlash Core

This PR addresses critical Windows compatibility issues that were preventing the CodeFlash extension from working correctly on Windows systems. These fixes ensure reliable subprocess execution, proper file handling, and consistent path resolution across all platforms.

Overview

The changes in this PR focus on fixing Windows-specific bugs related to subprocess management, file locking, path resolution, and LSP communication. These issues were causing hangs, crashes, and incorrect behavior when running CodeFlash on Windows.

Bugs Fixed

1. Subprocess Deadlocks on Windows

  • Location: verification/test_runner.py, benchmarking/trace_benchmarks.py
  • Issue: Using subprocess.run() with capture_output=True on Windows caused deadlocks when child processes produced output faster than the parent could read it, especially with pytest and coverage.
  • Fix: Implemented Windows-specific subprocess handling using Popen.communicate() which properly drains stdout/stderr concurrently using threads. Added CREATE_NEW_PROCESS_GROUP flag for proper process tree termination and stdin=subprocess.DEVNULL to prevent child processes from waiting for console input.

2. File Locking Issues in Git Worktrees

  • Location: code_utils/git_worktree_utils.py
  • Issue: Windows file locking prevented proper cleanup of git worktrees, causing "Permission denied" errors when removing worktree directories.
  • Fix: Implemented robust retry logic with exponential backoff for Windows, added manual directory cleanup fallback, and created a Windows-specific error handler that removes read-only attributes during directory deletion.

3. Inconsistent Path Resolution

  • Location: discovery/functions_to_optimize.py, optimization/optimizer.py
  • Issue: Mixed use of strict and non-strict path resolution caused test failures on Windows where paths with/without drive letters or different case sensitivity weren't handled consistently.
  • Fix: Introduced _resolve_path_consistent() helper that uses strict resolution for existing paths and non-strict for non-existent paths. Updated mirror_path() to properly normalize Windows paths using os.path.normpath and os.path.normcase for case-insensitive comparison.

4. LSP Message Delimiter Parsing Error

  • Location: lsp/lsp_message.py
  • Issue: The message delimiter was incorrectly set as a raw Unicode character instead of an escape sequence, causing parsing failures.
  • Fix: Changed delimiter from "\u241f" to "\\u241F" (proper escape sequence).

5. Malformed Windows Paths in LSP

  • Location: lsp/beta.py
  • Issue: Using document.path directly could result in malformed paths on Windows (missing drive letters or incorrect separators).
  • Fix: Use resolved worktree file paths instead of document.path, ensuring all paths are properly normalized.

6. Coverage Database File Locking

  • Location: verification/test_runner.py
  • Issue: Running coverage erase as a subprocess on Windows could cause file locking issues.
  • Fix: On Windows, directly delete the coverage database file instead of using the coverage erase subprocess command.

7. Console Window Spawning in LSP Mode

  • Location: discovery/discover_unit_tests.py
  • Issue: Subprocess calls on Windows could spawn console windows that hang the LSP server.
  • Fix: Added CREATE_NO_WINDOW flag for Windows subprocess calls to prevent console window spawning.

8. LSP Progress Bar Compatibility

  • Location: cli_cmds/console.py
  • Issue: Progress bars were incompatible with LSP mode, causing UI issues.
  • Fix: Return a dummy progress object when LSP is enabled, allowing the code to run without actual progress bar rendering.

Code Changes Summary

Core Subprocess Handling

  • verification/test_runner.py: Complete rewrite of execute_test_subprocess() with Windows-specific handling using Popen.communicate(), process groups, and proper timeout handling. The communicate() method uses internal threads to read from stdout and stderr simultaneously, preventing buffer overflow deadlocks that occur with subprocess.run().
  • benchmarking/trace_benchmarks.py: Similar Windows-safe subprocess implementation for benchmark tracing. Ensures benchmark subprocesses can produce large amounts of output without hanging the parent process.

Git Worktree Management

  • code_utils/git_worktree_utils.py: Major refactoring of remove_worktree() with:
    • Retry logic with exponential backoff (3 retries on Windows)
    • Manual cleanup fallback with _manual_cleanup_worktree_directory()
    • Path safety validation to prevent accidental deletions
    • Windows-specific error handler for removing read-only files
    • All cleanup operations include multiple safety checks to ensure only worktree directories under the cache location can be deleted, preventing accidental removal of user files.

Path Resolution

  • discovery/functions_to_optimize.py: Consistent path resolution using _resolve_path_consistent() helper for all path comparisons. Replaced string-based startswith() operations with Path.relative_to() for more robust subdirectory checking that handles edge cases automatically.
  • optimization/optimizer.py: Enhanced mirror_path() with proper Windows path normalization and case-insensitive comparison. Uses os.path.normcase() to handle Windows case-insensitive file systems and provides detailed error messages when path resolution fails.

LSP Integration

  • lsp/beta.py: Use resolved worktree paths instead of document paths. Ensures all file paths sent to the frontend are absolute and include drive letters on Windows, preventing file reading errors.
  • lsp/lsp_message.py: Fixed message delimiter escape sequence. Changed from raw Unicode character to proper escape sequence string format to ensure consistent parsing across the LSP protocol boundary.
  • cli_cmds/console.py: LSP-compatible progress bar handling. Returns a no-op dummy progress object in LSP mode to avoid console rendering errors while maintaining API compatibility.

Test Discovery

  • discovery/discover_unit_tests.py: Added CREATE_NO_WINDOW flag and improved error handling for subprocess failures. Validates pickle file existence before reading and provides detailed error messages with subprocess output when discovery fails.

Related Changes

This PR complements our internal PR that fixes Windows path handling in the VSCode extension UI components. Together, these changes ensure full Windows compatibility across the entire CodeFlash ecosystem.

@aseembits93
Copy link
Contributor

@mashraf-222 status on this?

@mashraf-222
Copy link
Contributor Author

mashraf-222 commented Dec 15, 2025

@mashraf-222 status on this?

waiting for a review.
needs to be tested on another OS (run one optimization using vsc and ensure it completes + run optimization and try to stop it and see if there are any error logs in the output ) to make sure the resolved windows bugs did not break the functionality for other OS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants