tracer improvements #970
…gle rank_functions
remove unittests remnants
⚡️ Codeflash found optimizations for this PR
📄 2,062% (20.62x) speedup for

The optimization replaces an O(N) linear search through all functions with an O(1) hash table lookup followed by iteration over only matching function names.

**Key Changes:**
- Added `_function_stats_by_name` index in `__init__` that maps function names to lists of (key, stats) tuples
- Modified `get_function_stats_summary` to first look up candidates by function name, then iterate only over those candidates

**Why This is Faster:**
The original code iterates through ALL function stats (22,603 iterations in the profiler results) for every lookup. The optimized version uses a hash table to instantly find only the functions with matching names, then iterates through just those candidates (typically 1-2 functions).

**Performance Impact:**
- **Small datasets**: 15-30% speedup as shown in basic test cases
- **Large datasets**: Dramatic improvement - the `test_large_scale_performance` case with 900 functions shows **3085% speedup** (66.7μs → 2.09μs)
- **Overall benchmark**: 2061% speedup demonstrates the optimization scales excellently with dataset size

**When This Optimization Shines:**
- Large codebases with many profiled functions (where the linear search becomes expensive)
- Repeated function lookups (if this method is called frequently)
- Cases with many unique function names but few duplicates per name

The optimization maintains identical behavior while transforming the algorithm from O(N) per lookup to O(average functions per name) per lookup, which is typically O(1) in practice.

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
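A minimal sketch of the name-indexed lookup described in this comment, not the PR's actual code. Only `_function_stats_by_name` and `get_function_stats_summary` come from the text above; the class name `FunctionStatsIndex`, the `function_name` and `filename` keys, and the suffix match on the file name are assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


class FunctionStatsIndex:
    """Hypothetical stand-in for the ranker; only the lookup path is sketched."""

    def __init__(self, function_stats: dict[str, dict[str, Any]]) -> None:
        self._function_stats = function_stats
        # Built once: function name -> list of (key, stats) tuples.
        self._function_stats_by_name: dict[str, list[tuple[str, dict[str, Any]]]] = defaultdict(list)
        for key, stats in function_stats.items():
            # Assumption: each stats dict carries a "function_name" entry.
            self._function_stats_by_name[stats["function_name"]].append((key, stats))

    def get_function_stats_summary(self, function_name: str, file_name: str) -> dict[str, Any] | None:
        # The hash lookup narrows the search to functions sharing this name.
        # .get() is used so a miss does not insert an empty list into the index.
        for _key, stats in self._function_stats_by_name.get(function_name, []):
            # Assumption: same-named functions are told apart by file name suffix.
            if stats["filename"].endswith(file_name):
                return stats
        return None


# Usage with made-up profiler entries.
example_stats = {
    "/src/a.py:foo": {"function_name": "foo", "filename": "/src/a.py"},
    "/src/b.py:foo": {"function_name": "foo", "filename": "/src/b.py"},
    "/src/a.py:bar": {"function_name": "bar", "filename": "/src/a.py"},
}
index = FunctionStatsIndex(example_stats)
match = index.get_function_stats_summary("foo", "b.py")
```

The index costs one extra pass over the stats at construction time, so it pays off only when lookups are repeated, which matches the "repeated function lookups" case listed above.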
This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:

This PR is now faster! 🚀 codeflash-ai[bot] accepted my code suggestion above.

⚡️ Codeflash found optimizations for this PR
📄 26% (0.26x) speedup for
def is_pytest_infrastructure(filename: str, function_name: str) -> bool:
    """Check if a function is part of pytest infrastructure that should be excluded from ranking.

    This filters out pytest internal functions, hooks, and test framework code that
We shouldn't use this function for filtering. We have a `filter_files_optimized` function in `codeflash/tracing/tracing_new_process.py`. We only want to rank the functions that we can eventually trace and optimize, and that filter function gives us exactly that set. It will also filter out the pytest function noise.
PR Type
Enhancement, Tests

Description
- Merge rank/filter into `rank_functions`
- File-relative importance filtering added
- Simplify replay test generation API
- Minor compatibility and safety tweaks
File Walkthrough

function_ranker.py: Consolidate ranking and add file-relative importance
codeflash/benchmarking/function_ranker.py
- `_get_function_stats` to `get_function_stats_summary`.
- `rank_functions`.

replay_test.py: Simplify replay test generation to pytest-only
codeflash/benchmarking/replay_test.py

replay_test.py: Streamline tracing replay to pytest templates
codeflash/tracing/replay_test.py

tracing_new_process.py: Align tracer outputs with replay tests and safety tweaks
codeflash/tracing/tracing_new_process.py

test_function_ranker.py: Update tests for consolidated ranking behavior
tests/test_function_ranker.py
- `rank_functions` semantics.
- `rerank_and_filter_functions` test.