PR 6/7: Benchmarking infrastructure for scipy vs Pyomo#11
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
48a565d to
bd79668
Compare
76a65e5 to
fa20d81
Compare
4367219 to
f5b43a7
Compare
f5b43a7 to
ad70a69
Compare
ad70a69 to
f2ae241
Compare
df516f5 to
d2c8ede
Compare
f2ae241 to
c20f8aa
Compare
c20f8aa to
37d3c04
Compare
37d3c04 to
b2fd8fc
Compare
457cd46 to
885a54d
Compare
e68f7be to
d363ddd
Compare
bernalde
left a comment
There was a problem hiding this comment.
Review summary:
Blocking issues:
- Documented CLI invocation fails in script mode.
- Benchmark metrics serialize
dryness_target_metas a JSON string instead of a boolean. - Tracked reference result files include full trajectory data and add about 27 MB of generated data, contrary to the stated policy.
- The analysis notebook defaults to a missing, gitignored input file.
Nonblocking issues:
- README documents Make shortcuts that do not exist.
- The CLI eagerly requires optional Pyomo dependencies even for help/scipy-only use.
hash.inputsis documented but not emitted by generated records.
Questions: None.
Tests run and outcomes:
python benchmarks/grid_cli.py --helpfailed withModuleNotFoundError: No module named 'benchmarks'.python -m benchmarks.grid_cli --helppassed.python -m benchmarks.grid_cli generate --task Tsh --scenario baseline --vary product.A1=16 --vary ht.KC=2.75e-4 --methods scipy --out /tmp/lyopronto-pr11-scipy-smoke.jsonl --forcepassed and confirmed the serialized metric type issue.python -m ruff check benchmarkspassed.python -m pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_modelspassed: 237 passed in 577.15s. An initial run from a hyphenated temporary worktree failed during collection because this repo has a top-level__init__.pywith a relative import; rerunning from/tmp/LyoPRONTO_pr11_reviewavoided that path artifact.
I would not merge this until the blocking issues above are addressed.
|
ReviewNB bot comment: Not addressed; this was an informational notebook-diff link and did not request a repository change. |
bernalde
left a comment
There was a problem hiding this comment.
Review summary:
Blocking issues:
- The updated PR fixes the earlier review comments, but the new benchmark test module breaks the GitHub doctests workflow during collection because
benchmarksis not importable in the installed CI environment.
Nonblocking issues: None.
Questions: None.
Tests run and outcomes:
python -m ruff check benchmarks tests/test_benchmarks.pypassed.python -m pytest tests/test_benchmarks.py -vpassed: 4 passed.python -c 'import json; json.load(open("benchmarks/grid_analysis.ipynb")); print("notebook json ok")'passed.python benchmarks/grid_cli.py --helppassed.python -m benchmarks.grid_cli --helppassed.python benchmarks/grid_cli.py generate --task Tsh --scenario baseline --vary product.A1=16 --vary ht.KC=2.75e-4 --methods scipy --out /tmp/lyopronto-pr11-review-scipy.jsonl --forcepassed and produced native JSON booleans,hash.inputs, no trajectory payload, andproduct_temp_ok=true.python benchmarks/grid_cli.py generate --task Tsh --scenario baseline --vary product.A1=16 --vary ht.KC=2.75e-4 --methods fd --n-elements 4 --out /tmp/lyopronto-pr11-review-fd.jsonl --forcepassed and produced native JSON booleans,hash.inputs, no trajectory payload, andproduct_temp_ok=true.python -m pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_modelspassed: 241 passed.python -m pytest tests/ -n auto -v -m "notebook" --ignore=tests/test_pyomo_models --cov=lyopronto --cov-report=xml:/tmp/pr11-notebook-coverage.xml --cov-report=term-missingcould not complete locally because this sandbox disallows Jupyter kernel socket creation (PermissionError: [Errno 1] Operation not permitted). The GitHubdoctestscheck on head070c68276191fbb6670ab97754f9bcf09e2338effailed in the same workflow before notebook execution due toModuleNotFoundError: No module named 'benchmarks'while collectingtests/test_benchmarks.py.
The PR should not be merged as-is. I would not merge this until the blocking issue above is addressed.
I attempted to submit this as REQUEST_CHANGES, but GitHub rejected that state because this account is the PR author, so I am submitting the review as COMMENT.
|
Pushed commits:
Main changes made:
Tests run:
Comments intentionally not addressed:
Remaining risks or follow-up:
|
33f6c28 to
1fe6d95
Compare
This PR adds the benchmarking infrastructure for comparing scipy and Pyomo solvers: benchmarks/ directory contents: - adapters.py: Adapter classes for scipy and Pyomo solvers - grid_cli.py: CLI tool for running grid comparison benchmarks - scenarios.py: Test scenario definitions (standard, high resistance, etc.) - schema.py: Result data schema for consistent output - validate.py: Validation utilities for comparing results - grid_analysis.ipynb: Jupyter notebook for analyzing benchmark results - README.md: Documentation for running benchmarks benchmarks/results/: - .gitignore: Ignore large result files - README.md: Documentation for result files - archive/: Historical benchmark results - baseline_*.jsonl: Baseline comparison results Usage: python -m benchmarks.grid_cli --scenario baseline_Tsh --grid 3x3 python -m benchmarks.grid_cli --scenario baseline_Pch --grid 5x5 --solver pyomo This enables systematic comparison of scipy and Pyomo optimizer performance across different scenarios and parameter grids.
…chmarks # Conflicts: # lyopronto/pyomo_models/optimizers.py
bernalde
left a comment
There was a problem hiding this comment.
- Blocking issues: None.
- Nonblocking issues:
- The tracked benchmark summary JSONL files are stale relative to the current metrics schema.
- The analysis notebook reports 100% success without checking record-level failure flags.
- Questions: None.
- Tests run and outcomes:
- Fetched base
pr/pyomo-optimizersand headpr/benchmarks; merge-base diff matched GitHub totals: 87 files, 4,813 additions, 275 deletions. python -m ruff check benchmarks tests/test_benchmarks.py: passed.PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/test_benchmarks.py -v: 17 passed.PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models: 257 passed.PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/ -n auto -v -m "notebook" --ignore=tests/test_pyomo_models: 2 passed after rerunning outside the sandbox socket restriction.PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/test_pyomo_models/test_optimizer_framework.py -v: 20 passed.- CLI help checks and a one-case SciPy
grid_cli.py generatesmoke run: passed.
- Merge-readiness: No blocking issues found. Since I am authenticated as
bernalde, the PR author, I am submitting this as COMMENT; any formal approval must come from another maintainer.
| @@ -0,0 +1,27 @@ | |||
| {"timestamp":"2026-05-08T12:42:03.725801Z","environment":{"python":"3.13.5","platform":"Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39","numpy":"2.4.2","pyomo":"6.9.5"},"version":2,"task":"Tsh","scenario":"baseline","params":{"product.A1":16.0,"ht.KC":0.000275},"grid":{"param1":{"path":"product.A1","value":16.0},"param2":{"path":"ht.KC","value":0.000275}},"scipy":{"success":true,"wall_time_s":13.709681862965226,"objective_time_hr":12.193457634926368,"solver":{"status":"n/a","termination_condition":"n/a"},"metrics":{"n_points":1221,"final_percent_dried":100.0,"max_Tbot":-24.99999999999993,"monotonic_dried":true,"tsh_bounds_ok":true,"pch_positive":true,"flux_nonnegative":true,"dryness_target_met":true,"product_temp_ok":true,"product_critical_temp":-25.0}},"pyomo":null,"failed":false,"hash.inputs":"272cfbac930314f1","hash.record":"d24b9be03fc7c3f6"} | |||
There was a problem hiding this comment.
Nonblocking: These tracked reference summaries look stale relative to the current compute_residuals() schema. A fresh one-case CLI run now emits dryness_target_percent, final_dryness_shortfall_percent, max_product_temp_violation_C, and the ramp metric keys, but both checked-in summary files omit them. Because the README and notebook use these as reference data, please regenerate both summaries with the current CLI or update the docs/tests to define them as legacy samples.
There was a problem hiding this comment.
Addressed in d11a211. Regenerated both tracked summary JSONL files with the current CLI schema and added test_tracked_reference_summaries_use_current_metric_schema to prevent the reference files from drifting again.
| "print(\n", | ||
| " f\"- Pyomo methods are {df['fd_speedup'].mean():.1f}\u00d7 faster on average (simultaneous vs sequential)\"\n", | ||
| ")\n", | ||
| "print(f\"- All {len(grid_recs)} runs converged successfully (100% success rate)\")\n", |
There was a problem hiding this comment.
Nonblocking: This prints 100% success unconditionally, even though grid records now carry failed, scipy.success, and pyomo.success. If a benchmark file contains solver failures or post-solve validation failures, the notebook will still report all runs converged. Please compute the success rate from the record flags and list failed parameter combinations.
There was a problem hiding this comment.
Addressed in d11a211. Added benchmarks.analysis success-summary helpers, updated the notebook to compute the success rate from record flags and list failed parameter combinations, and covered the behavior with test_success_summary_uses_record_failure_flags.
|
Summary
Systematic comparison framework for scipy vs Pyomo optimizers. CLI tool (
grid_cli.py) generates parameter sweeps across product resistance, heat transfer coefficients, and operating conditions. Compares objective values, solve times, and constraint satisfaction. Includes Jupyter notebook for visualization and analysis.PR 6 of 7 in the Pyomo integration series.
Changes
benchmarks/adapters.py— Scipy/Pyomo adapter layerbenchmarks/grid_cli.py— CLI for running benchmark gridsbenchmarks/scenarios.py— Predefined test scenariosbenchmarks/schema.py— Result schema definitionbenchmarks/validate.py— Result validationbenchmarks/grid_analysis.ipynb— Analysis notebookbenchmarks/README.md— Benchmark documentationbenchmarks/results/— Reference results and .gitignoreUsage
PR Chain
Testing
No new pytest tests — benchmarks are standalone scripts. Results directory is gitignored.