
Conversation

@KRRT7
Contributor

@KRRT7 commented Dec 14, 2025

PR Type

Enhancement, Tests


Description

  • Merge rank/filter into rank_functions

  • File-relative importance filtering added

  • Simplify replay test generation API

  • Minor compatibility and safety tweaks


Diagram Walkthrough

flowchart LR
  ranker["FunctionRanker: consolidated ranking/filtering"] -- "uses" --> fileRel["File-relative importance calc"]
  ranker -- "sort by" --> ttx["ttX score accessor"]
  tracingReplay["Tracing replay test generator"] -- "drop unittest support" --> pytestOnly["Pytest-only tests"]
  tracingProcess["Tracer new process"] -- "derive output path" --> testsDir["Tests directory (replay)"]
  tracingProcess -- "generate tests" --> tracingReplay
  testsUpdated["Tests updated"] -- "assert filtered+ranked" --> ranker

File Walkthrough

Relevant files
Enhancement
function_ranker.py
Consolidate ranking and add file-relative importance         

codeflash/benchmarking/function_ranker.py

  • Rename _get_function_stats to get_function_stats_summary.
  • Consolidate filter+rank into rank_functions.
  • Compute importance relative to target files.
  • Improve logging and keep ttX-based sorting.
+59/-48 
replay_test.py
Simplify replay test generation to pytest-only                     

codeflash/benchmarking/replay_test.py

  • Remove unittest/pytest switch; default to pytest style.
  • Simplify imports and test template generation.
  • Adjust function signatures and return docs.
  • Uniform 4-space indentation for tests.
+7/-21   
replay_test.py
Streamline tracing replay to pytest templates                       

codeflash/tracing/replay_test.py

  • Remove unittest option and assertions.
  • Simplify imports and test scaffolding.
  • Standardize function signature and indentation.
  • Generate standalone pytest tests.
+5/-15   
tracing_new_process.py
Align tracer outputs with replay tests and safety tweaks 

codeflash/tracing/tracing_new_process.py

  • Derive trace output path beside replay tests.
  • Remove reliance on config test_framework.
  • Pass simplified params to replay generator.
  • Safeguard dict iteration in pstats conversion.
+10/-5   
Tests
test_function_ranker.py
Update tests for consolidated ranking behavior                     

tests/test_function_ranker.py

  • Update tests to new rank_functions semantics.
  • Remove rerank_and_filter_functions test.
  • Add assertions for filtering and ordering.
+7/-12   

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

In file-relative total time calculation, membership check uses substring matching against s["filename"]; this can misattribute functions when filenames overlap (e.g., foo.py vs myfoo.py). Consider stricter path matching to avoid incorrect importance filtering.

target_files = {func.file_path.name for func in functions_to_optimize}
# Calculate total time only from functions in these files
total_program_time = sum(
    s["own_time_ns"]
    for s in self._function_stats.values()
    if s.get("own_time_ns", 0) > 0 and any(target_file in s["filename"] for target_file in target_files)
)
logger.debug(
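
A minimal sketch of the stricter matching suggested here, assuming the stats filenames are plain path strings; it compares the exact final path component instead of substring containment, and the helper name is illustrative rather than taken from the PR:

from pathlib import Path

def stats_entry_matches_target(stats_filename: str, target_files: set[str]) -> bool:
    # Compare the basename exactly instead of using substring containment,
    # so "foo.py" does not also match "myfoo.py".
    return Path(stats_filename).name in target_files

The generator expression above could then call stats_entry_matches_target(s["filename"], target_files) in place of the any(...) substring check; comparing fully resolved paths would be stricter still if both sides are available.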
Output Path Logic

Tracer now derives output_file from tests_root via get_test_file_path; if config["tests_root"] is missing or invalid, or the joined functions list produces a very long name, path resolution could fail or yield overly long filenames. Validate that the directory exists, handle exceptions, and consider truncating/sanitizing function_path.

# Place trace file next to replay tests in the tests directory
from codeflash.verification.verification_utils import get_test_file_path
function_path = "_".join(functions) if functions else self.sanitized_filename
test_file_path = get_test_file_path(
    test_dir=Path(config["tests_root"]), function_name=function_path, test_type="replay"
)
trace_filename = test_file_path.stem + ".trace"
self.output_file = test_file_path.parent / trace_filename
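
A hedged sketch of the validation and truncation suggested above, reusing the get_test_file_path call shown in the snippet; the helper name and the 100-character cap are assumptions, not part of the PR:

from pathlib import Path

from codeflash.verification.verification_utils import get_test_file_path

MAX_FUNCTION_PATH_LEN = 100  # assumed cap; pick a limit safe for the target filesystem

def derive_trace_output(tests_root: Path, functions: list[str], fallback_name: str) -> Path:
    # Fail early if the configured tests directory is missing, rather than during the write.
    if not tests_root.is_dir():
        raise FileNotFoundError(f"tests_root does not exist: {tests_root}")
    function_path = "_".join(functions) if functions else fallback_name
    function_path = function_path[:MAX_FUNCTION_PATH_LEN]  # avoid overly long filenames
    test_file_path = get_test_file_path(
        test_dir=tests_root, function_name=function_path, test_type="replay"
    )
    test_file_path.parent.mkdir(parents=True, exist_ok=True)
    return test_file_path.parent / (test_file_path.stem + ".trace")

The tracer could then assign self.output_file = derive_trace_output(Path(config["tests_root"]), functions, self.sanitized_filename).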
Iteration Safety

make_pstats_compatible now wraps dict iteration with list(...), which is good; ensure large dicts don’t cause memory spikes and consider using tuple(...) or copying keys only if size is significant.

def make_pstats_compatible(self) -> None:
    # delete the extra class_name item from the function tuple
    self.files = []
    self.top_level = []
    new_stats = {}
    for func, (cc, ns, tt, ct, callers) in list(self.stats.items()):
        new_callers = {(k[0], k[1], k[2]): v for k, v in callers.items()}
        new_stats[(func[0], func[1], func[2])] = (cc, ns, tt, ct, new_callers)
    new_timings = {}
    for func, (cc, ns, tt, ct, callers) in list(self.timings.items()):
        new_callers = {(k[0], k[1], k[2]): v for k, v in callers.items()}
        new_timings[(func[0], func[1], func[2])] = (cc, ns, tt, ct, new_callers)
    self.stats = new_stats
    self.timings = new_timings
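
If the memory concern above matters in practice, one hedged variant (not the PR's code) is to drain the old mapping while rebuilding, so roughly one dict's worth of entries stays alive at a time instead of two:

from typing import Any

def rebuild_without_class_name(stats: dict[tuple, Any]) -> dict[tuple, Any]:
    # popitem() removes entries from the source as they are converted, so peak
    # memory stays near one mapping's worth instead of old + new copies.
    rebuilt: dict[tuple, Any] = {}
    while stats:
        func, (cc, ns, tt, ct, callers) = stats.popitem()
        new_callers = {(k[0], k[1], k[2]): v for k, v in callers.items()}
        rebuilt[(func[0], func[1], func[2])] = (cc, ns, tt, ct, new_callers)
    return rebuilt

make_pstats_compatible could apply this to both self.stats and self.timings; whether the saving justifies the extra indirection depends on the profile sizes seen in practice.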

@github-actions

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Create output directory proactively

Ensure the output directory exists before assigning self.output_file. Without
creating the parent directory, subsequent writes can fail with FileNotFoundError
when the tests directory tree is missing.

codeflash/tracing/tracing_new_process.py [131-135]

 self.sanitized_filename = self.sanitize_to_filename(command)
 # Place trace file next to replay tests in the tests directory
 from codeflash.verification.verification_utils import get_test_file_path
 function_path = "_".join(functions) if functions else self.sanitized_filename
 test_file_path = get_test_file_path(
     test_dir=Path(config["tests_root"]), function_name=function_path, test_type="replay"
 )
+test_file_path.parent.mkdir(parents=True, exist_ok=True)
 trace_filename = test_file_path.stem + ".trace"
 self.output_file = test_file_path.parent / trace_filename
Suggestion importance[1-10]: 8


Why: Ensuring the parent directory exists before writing avoids a likely FileNotFoundError in new setups; this is a practical reliability fix with high impact and aligns with the new path logic.

Medium
Safely handle missing filenames

Guard against a missing or None filename in stats to avoid a TypeError from the in membership check. Use str(...) or a .get() default so the substring check stays safe. This prevents crashes when profile entries lack a filename.

codeflash/benchmarking/function_ranker.py [128-132]

 total_program_time = sum(
     s["own_time_ns"]
     for s in self._function_stats.values()
-    if s.get("own_time_ns", 0) > 0 and any(target_file in s["filename"] for target_file in target_files)
+    if s.get("own_time_ns", 0) > 0
+    and any(target_file in str(s.get("filename", "")) for target_file in target_files)
 )
Suggestion importance[1-10]: 7


Why: Correctly guards against missing filename by coercing to string, preventing a potential TypeError during substring checks; impact is moderate as it improves robustness without changing behavior otherwise.

Medium
General
Remove unintended console output

Avoid using console.rule() unconditionally, as it performs I/O during ranking and
can break in non-console environments. Remove it or guard behind a verbosity flag to
prevent unintended side effects in library code.

codeflash/benchmarking/function_ranker.py [163-164]

 if total_program_time == 0:
     logger.warning("Total program time is zero, cannot determine function importance.")
     functions_to_rank = functions_to_optimize
 else:
     functions_to_rank = []
     for func in functions_to_optimize:
         func_stats = self.get_function_stats_summary(func)
         if func_stats and func_stats.get("own_time_ns", 0) > 0:
             importance = func_stats["own_time_ns"] / total_program_time
             if importance >= DEFAULT_IMPORTANCE_THRESHOLD:
                 functions_to_rank.append(func)
             else:
                 logger.debug(
                     f"Filtering out function {func.qualified_name} with importance "
                     f"{importance:.2%} (below threshold {DEFAULT_IMPORTANCE_THRESHOLD:.2%})"
                 )
 
     logger.info(
         f"Filtered down to {len(functions_to_rank)} important functions "
         f"from {len(functions_to_optimize)} total functions"
     )
-    console.rule()
Suggestion importance[1-10]: 5


Why: Removing console.rule() reduces unnecessary I/O in library code; it's a minor quality improvement and context-dependent, so moderate impact.

Low

@codeflash-ai
Contributor

codeflash-ai bot commented Dec 14, 2025

⚡️ Codeflash found optimizations for this PR

📄 2,062% (20.62x) speedup for FunctionRanker.get_function_stats_summary in codeflash/benchmarking/function_ranker.py

⏱️ Runtime: 1.48 milliseconds → 68.6 microseconds (best of 115 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch ranking-changes).


The optimization replaces an O(N) linear search through all functions with an O(1) hash table lookup followed by iteration over only matching function names.

**Key Changes:**
- Added `_function_stats_by_name` index in `__init__` that maps function names to lists of (key, stats) tuples
- Modified `get_function_stats_summary` to first lookup candidates by function name, then iterate only over those candidates

**Why This is Faster:**
The original code iterates through ALL function stats (22,603 iterations in the profiler results) for every lookup. The optimized version uses a hash table to instantly find only the functions with matching names, then iterates through just those candidates (typically 1-2 functions).
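
A sketch of the indexing idea described above, with assumed field and key names (the stats values are taken to carry a "function_name" entry; the PR's actual code may differ):

from collections import defaultdict
from typing import Any

def build_stats_index(function_stats: dict[Any, dict]) -> dict[str, list]:
    # One-time pass (e.g., in __init__): group (key, stats) pairs by function name.
    index: dict[str, list] = defaultdict(list)
    for key, stats in function_stats.items():
        index[stats["function_name"]].append((key, stats))
    return index

def candidates_for(index: dict[str, list], function_name: str) -> list:
    # O(1) hash lookup replaces the full scan; the caller then checks only the
    # few entries sharing this name (typically one or two).
    return index.get(function_name, [])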

**Performance Impact:**
- **Small datasets**: 15-30% speedup as shown in basic test cases
- **Large datasets**: Dramatic improvement - the `test_large_scale_performance` case with 900 functions shows **3085% speedup** (66.7μs → 2.09μs)
- **Overall benchmark**: 2061% speedup demonstrates the optimization scales excellently with dataset size

**When This Optimization Shines:**
- Large codebases with many profiled functions (where the linear search becomes expensive)
- Repeated function lookups (if this method is called frequently)
- Cases with many unique function names but few duplicates per name

The optimization maintains identical behavior while transforming the algorithm from O(N) per lookup to O(average functions per name) per lookup, which is typically O(1) in practice.

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
@codeflash-ai
Contributor

codeflash-ai bot commented Dec 14, 2025

This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:

@codeflash-ai
Contributor

codeflash-ai bot commented Dec 14, 2025

This PR is now faster! 🚀 codeflash-ai[bot] accepted my code suggestion above.

@codeflash-ai
Contributor

codeflash-ai bot commented Dec 14, 2025

⚡️ Codeflash found optimizations for this PR

📄 26% (0.26x) speedup for FunctionRanker.rank_functions in codeflash/benchmarking/function_ranker.py

⏱️ Runtime: 42.5 milliseconds → 33.8 milliseconds (best of 57 runs)

A new Optimization Review has been created.

🔗 Review here


def is_pytest_infrastructure(filename: str, function_name: str) -> bool:
"""Check if a function is part of pytest infrastructure that should be excluded from ranking.
This filters out pytest internal functions, hooks, and test framework code that
Contributor


We shouldn't use this function for filtering. We already have a filter_files_optimized function in codeflash/tracing/tracing_new_process.py. We only want to rank the functions that we can eventually trace and optimize; that filter function gets us that set, and it will also filter out the pytest function noise.

