PR 6/7: Benchmarking infrastructure for scipy vs Pyomo by bernalde · Pull Request #11 · SECQUOIA/LyoPRONTO

bernalde · 2026-03-03T00:42:35Z

Summary

Systematic comparison framework for scipy vs Pyomo optimizers. CLI tool (grid_cli.py) generates parameter sweeps across product resistance, heat transfer coefficients, and operating conditions. Compares objective values, solve times, and constraint satisfaction. Includes Jupyter notebook for visualization and analysis.

PR 6 of 7 in the Pyomo integration series.

Changes

benchmarks/adapters.py — Scipy/Pyomo adapter layer
benchmarks/grid_cli.py — CLI for running benchmark grids
benchmarks/scenarios.py — Predefined test scenarios
benchmarks/schema.py — Result schema definition
benchmarks/validate.py — Result validation
benchmarks/grid_analysis.ipynb — Analysis notebook
benchmarks/README.md — Benchmark documentation
benchmarks/results/ — Reference results and .gitignore

Usage

# Run default benchmark
python benchmarks/grid_cli.py

# Run specific scenario
python benchmarks/grid_cli.py --scenario sucrose_5pct

PR Chain

#	PR	Status
0	Sync upstream (#5)	Merged
1	CI/CD for Pyomo (#6)	Open
2	Pyomo dependencies (#7)	Open
3	Utils & single-step (#8)	Open
4	Multi-period model (#9)	Open
5	Optimizer functions (#10)	Open
6	Benchmarks (this PR)
7	Docs & examples	Pending

Testing

No new pytest tests — benchmarks are standalone scripts. Results directory is gitignored.

review-notebook-app · 2026-03-03T00:42:41Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

bernalde

Checked the PR #7 follow-up comments against this stacked PR. One copyright-header issue carries forward.

bernalde

Review summary:

Blocking issues:

Documented CLI invocation fails in script mode.
Benchmark metrics serialize dryness_target_met as a JSON string instead of a boolean.
Tracked reference result files include full trajectory data and add about 27 MB of generated data, contrary to the stated policy.
The analysis notebook defaults to a missing, gitignored input file.

Nonblocking issues:

README documents Make shortcuts that do not exist.
The CLI eagerly requires optional Pyomo dependencies even for help/scipy-only use.
hash.inputs is documented but not emitted by generated records.

Questions: None.

Tests run and outcomes:

python benchmarks/grid_cli.py --help failed with ModuleNotFoundError: No module named 'benchmarks'.
python -m benchmarks.grid_cli --help passed.
python -m benchmarks.grid_cli generate --task Tsh --scenario baseline --vary product.A1=16 --vary ht.KC=2.75e-4 --methods scipy --out /tmp/lyopronto-pr11-scipy-smoke.jsonl --force passed and confirmed the serialized metric type issue.
python -m ruff check benchmarks passed.
python -m pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models passed: 237 passed in 577.15s. An initial run from a hyphenated temporary worktree failed during collection because this repo has a top-level __init__.py with a relative import; rerunning from /tmp/LyoPRONTO_pr11_review avoided that path artifact.

I would not merge this until the blocking issues above are addressed.

bernalde · 2026-05-08T06:05:16Z

ReviewNB bot comment: Not addressed; this was an informational notebook-diff link and did not request a repository change.

bernalde

Review summary:

Blocking issues:

The updated PR fixes the earlier review comments, but the new benchmark test module breaks the GitHub doctests workflow during collection because benchmarks is not importable in the installed CI environment.

Nonblocking issues: None.

Questions: None.

Tests run and outcomes:

python -m ruff check benchmarks tests/test_benchmarks.py passed.
python -m pytest tests/test_benchmarks.py -v passed: 4 passed.
python -c 'import json; json.load(open("benchmarks/grid_analysis.ipynb")); print("notebook json ok")' passed.
python benchmarks/grid_cli.py --help passed.
python -m benchmarks.grid_cli --help passed.
python benchmarks/grid_cli.py generate --task Tsh --scenario baseline --vary product.A1=16 --vary ht.KC=2.75e-4 --methods scipy --out /tmp/lyopronto-pr11-review-scipy.jsonl --force passed and produced native JSON booleans, hash.inputs, no trajectory payload, and product_temp_ok=true.
python benchmarks/grid_cli.py generate --task Tsh --scenario baseline --vary product.A1=16 --vary ht.KC=2.75e-4 --methods fd --n-elements 4 --out /tmp/lyopronto-pr11-review-fd.jsonl --force passed and produced native JSON booleans, hash.inputs, no trajectory payload, and product_temp_ok=true.
python -m pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models passed: 241 passed.
python -m pytest tests/ -n auto -v -m "notebook" --ignore=tests/test_pyomo_models --cov=lyopronto --cov-report=xml:/tmp/pr11-notebook-coverage.xml --cov-report=term-missing could not complete locally because this sandbox disallows Jupyter kernel socket creation (PermissionError: [Errno 1] Operation not permitted). The GitHub doctests check on head 070c68276191fbb6670ab97754f9bcf09e2338ef failed in the same workflow before notebook execution due to ModuleNotFoundError: No module named 'benchmarks' while collecting tests/test_benchmarks.py.

The PR should not be merged as-is. I would not merge this until the blocking issue above is addressed.

I attempted to submit this as REQUEST_CHANGES, but GitHub rejected that state because this account is the PR author, so I am submitting the review as COMMENT.

bernalde · 2026-05-08T12:05:07Z

Pushed commits:

828fd78 (Package benchmark tooling for CI)
Earlier review-fix commit already on this branch: 070c682 (Address benchmark review feedback)

Main changes made:

Added benchmarks/__init__.py so benchmark modules are an importable package.
Updated [tool.setuptools.packages.find] to include benchmarks*, matching the CI installed-package environment.
Verified the built wheel contains benchmarks/__init__.py, benchmarks/grid_cli.py, and benchmarks/adapters.py.

Tests run:

python -m ruff check benchmarks tests/test_benchmarks.py: passed.
python -m pytest tests/test_benchmarks.py -v: passed, 4 passed.
python -m pytest tests/ -n auto -v -m "notebook" --ignore=tests/test_pyomo_models --collect-only: passed collection, 2 selected and 241 deselected.
python -c 'from setuptools import find_packages; print(find_packages(include=["lyopronto*", "benchmarks*"]))': confirmed benchmarks is discovered.
python -m pip wheel . --no-deps --no-build-isolation -w /tmp/pr11-wheel-check: passed.
Wheel inspection confirmed benchmark package files are included.
python -m pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models: passed, 241 passed.

Comments intentionally not addressed:

The ReviewNB bot comment is informational and does not request a repository change.
Older unresolved review threads were already addressed by 070c682; no additional code change was needed for those threads in this pass.

Remaining risks or follow-up:

Full local notebook execution could not be completed in this sandbox because Jupyter kernel startup needs socket creation and fails with PermissionError: [Errno 1] Operation not permitted. The relevant CI collection failure was reproduced through GitHub logs and covered locally with notebook-marker collection plus wheel/package verification. The GitHub doctests rerun for 828fd78 is pending.

This PR adds the benchmarking infrastructure for comparing scipy and Pyomo solvers: benchmarks/ directory contents: - adapters.py: Adapter classes for scipy and Pyomo solvers - grid_cli.py: CLI tool for running grid comparison benchmarks - scenarios.py: Test scenario definitions (standard, high resistance, etc.) - schema.py: Result data schema for consistent output - validate.py: Validation utilities for comparing results - grid_analysis.ipynb: Jupyter notebook for analyzing benchmark results - README.md: Documentation for running benchmarks benchmarks/results/: - .gitignore: Ignore large result files - README.md: Documentation for result files - archive/: Historical benchmark results - baseline_*.jsonl: Baseline comparison results Usage: python -m benchmarks.grid_cli --scenario baseline_Tsh --grid 3x3 python -m benchmarks.grid_cli --scenario baseline_Pch --grid 5x5 --solver pyomo This enables systematic comparison of scipy and Pyomo optimizer performance across different scenarios and parameter grids.

…chmarks # Conflicts: # lyopronto/pyomo_models/optimizers.py

bernalde

Blocking issues: None.
Nonblocking issues:

The tracked benchmark summary JSONL files are stale relative to the current metrics schema.
The analysis notebook reports 100% success without checking record-level failure flags.

Questions: None.
Tests run and outcomes:

Fetched base pr/pyomo-optimizers and head pr/benchmarks; merge-base diff matched GitHub totals: 87 files, 4,813 additions, 275 deletions.
python -m ruff check benchmarks tests/test_benchmarks.py: passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/test_benchmarks.py -v: 17 passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models: 257 passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/ -n auto -v -m "notebook" --ignore=tests/test_pyomo_models: 2 passed after rerunning outside the sandbox socket restriction.
PYTHONPATH=/tmp/LyoPRONTO_pr11_review pytest tests/test_pyomo_models/test_optimizer_framework.py -v: 20 passed.
CLI help checks and a one-case SciPy grid_cli.py generate smoke run: passed.

Merge-readiness: No blocking issues found. Since I am authenticated as bernalde, the PR author, I am submitting this as COMMENT; any formal approval must come from another maintainer.

bernalde · 2026-05-31T16:19:34Z

@@ -0,0 +1,27 @@
+{"timestamp":"2026-05-08T12:42:03.725801Z","environment":{"python":"3.13.5","platform":"Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39","numpy":"2.4.2","pyomo":"6.9.5"},"version":2,"task":"Tsh","scenario":"baseline","params":{"product.A1":16.0,"ht.KC":0.000275},"grid":{"param1":{"path":"product.A1","value":16.0},"param2":{"path":"ht.KC","value":0.000275}},"scipy":{"success":true,"wall_time_s":13.709681862965226,"objective_time_hr":12.193457634926368,"solver":{"status":"n/a","termination_condition":"n/a"},"metrics":{"n_points":1221,"final_percent_dried":100.0,"max_Tbot":-24.99999999999993,"monotonic_dried":true,"tsh_bounds_ok":true,"pch_positive":true,"flux_nonnegative":true,"dryness_target_met":true,"product_temp_ok":true,"product_critical_temp":-25.0}},"pyomo":null,"failed":false,"hash.inputs":"272cfbac930314f1","hash.record":"d24b9be03fc7c3f6"}


Nonblocking: These tracked reference summaries look stale relative to the current compute_residuals() schema. A fresh one-case CLI run now emits dryness_target_percent, final_dryness_shortfall_percent, max_product_temp_violation_C, and the ramp metric keys, but both checked-in summary files omit them. Because the README and notebook use these as reference data, please regenerate both summaries with the current CLI or update the docs/tests to define them as legacy samples.

Addressed in d11a211. Regenerated both tracked summary JSONL files with the current CLI schema and added test_tracked_reference_summaries_use_current_metric_schema to prevent the reference files from drifting again.

bernalde · 2026-05-31T16:19:34Z

+    "print(\n",
+    "    f\"- Pyomo methods are {df['fd_speedup'].mean():.1f}\u00d7 faster on average (simultaneous vs sequential)\"\n",
+    ")\n",
+    "print(f\"- All {len(grid_recs)} runs converged successfully (100% success rate)\")\n",


Nonblocking: This prints 100% success unconditionally, even though grid records now carry failed, scipy.success, and pyomo.success. If a benchmark file contains solver failures or post-solve validation failures, the notebook will still report all runs converged. Please compute the success rate from the record flags and list failed parameter combinations.

Addressed in d11a211. Added benchmarks.analysis success-summary helpers, updated the notebook to compute the success rate from record flags and list failed parameter combinations, and covered the behavior with test_success_summary_uses_record_failure_flags.

bernalde · 2026-05-31T16:44:43Z

Commits pushed:

d11a211 (Address benchmark review comments)

Main changes made:

Added benchmarks.analysis helpers to compute record-level success from failed, scipy.success, and pyomo.success.
Updated benchmarks/grid_analysis.ipynb to report computed success rate and list failed parameter combinations instead of assuming 100% success.
Regenerated baseline_Tsh_3x3_summary.jsonl and baseline_Pch_3x3_summary.jsonl with the current CLI schema.
Added tests covering current reference-summary metric keys and success-summary behavior.

Tests run and results:

PYTHONPATH=/tmp/LyoPRONTO_pr11_fix pytest tests/test_benchmarks.py::test_tracked_reference_summaries_use_current_metric_schema tests/test_benchmarks.py::test_success_summary_uses_record_failure_flags -q: failed before the fix, then passed after the fix.
python -m ruff check benchmarks tests/test_benchmarks.py: passed.
python -m ruff format --check benchmarks/analysis.py tests/test_benchmarks.py benchmarks/grid_analysis.ipynb: passed.
python - <<'PY' ... json.loads(...): notebook JSON parse passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_fix pytest tests/test_benchmarks.py -v: 19 passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_fix MPLCONFIGDIR=/tmp/mpl-pr11-fix pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models: 259 passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_fix MPLCONFIGDIR=/tmp/mpl-pr11-fix pytest tests/ -n auto -v -m "notebook" --ignore=tests/test_pyomo_models: failed in the sandbox with PermissionError: [Errno 1] Operation not permitted during Jupyter socket creation; rerun outside the sandbox passed, 2 passed.
PYTHONPATH=/tmp/LyoPRONTO_pr11_fix MPLCONFIGDIR=/tmp/mpl-pr11-fix pytest tests/test_pyomo_models/test_optimizer_framework.py -v: 20 passed, 1 warning.
python -m mypy benchmarks/analysis.py: passed, with the installed mypy warning that python_version = 3.8 is no longer supported by that mypy version.

Comments intentionally not addressed:

None.

Remaining risks or follow-up items:

None identified.

This was referenced Mar 3, 2026

PR 7/7: Pyomo documentation, examples, and CHANGELOG #12

Merged

Testing and CI integration for LyoPronto from SECQUOIA LyoHUB/LyoPRONTO#6

Closed

bernalde force-pushed the pr/benchmarks branch from 1026770 to 7f86692 Compare March 3, 2026 00:51

bernalde force-pushed the pr/pyomo-optimizers branch 2 times, most recently from 48a565d to bd79668 Compare March 3, 2026 01:09

bernalde force-pushed the pr/benchmarks branch 3 times, most recently from 76a65e5 to fa20d81 Compare March 3, 2026 18:07

bernalde force-pushed the pr/pyomo-optimizers branch from 4367219 to f5b43a7 Compare March 3, 2026 18:07

bernalde force-pushed the pr/pyomo-optimizers branch from f5b43a7 to ad70a69 Compare April 1, 2026 20:56

bernalde force-pushed the pr/benchmarks branch from fa20d81 to 3091124 Compare April 1, 2026 20:56

bernalde force-pushed the pr/pyomo-optimizers branch from ad70a69 to f2ae241 Compare April 1, 2026 21:36

bernalde force-pushed the pr/benchmarks branch 2 times, most recently from df516f5 to d2c8ede Compare April 2, 2026 22:19

bernalde force-pushed the pr/pyomo-optimizers branch from f2ae241 to c20f8aa Compare April 2, 2026 22:19

bernalde force-pushed the pr/pyomo-optimizers branch from c20f8aa to 37d3c04 Compare May 7, 2026 21:42

bernalde force-pushed the pr/benchmarks branch from d2c8ede to 811daa1 Compare May 7, 2026 21:43

bernalde force-pushed the pr/pyomo-optimizers branch from 37d3c04 to b2fd8fc Compare May 7, 2026 22:27

bernalde force-pushed the pr/benchmarks branch 2 times, most recently from 457cd46 to 885a54d Compare May 7, 2026 22:39

bernalde commented May 7, 2026

View reviewed changes

Comment thread benchmarks/adapters.py

bernalde commented May 8, 2026

View reviewed changes

Comment thread benchmarks/validate.py

bernalde commented May 8, 2026

View reviewed changes

Comment thread benchmarks/adapters.py Outdated

bernalde force-pushed the pr/pyomo-optimizers branch from e68f7be to d363ddd Compare May 8, 2026 04:10

bernalde force-pushed the pr/benchmarks branch from 885a54d to 66dd785 Compare May 8, 2026 04:10

bernalde commented May 8, 2026

View reviewed changes

Comment thread benchmarks/adapters.py Outdated

bernalde force-pushed the pr/benchmarks branch from 66dd785 to a4541ad Compare May 8, 2026 05:26

bernalde commented May 8, 2026

View reviewed changes

Comment thread tests/test_benchmarks.py Outdated

bernalde mentioned this pull request May 8, 2026

Port benchmark artifact cleanup from superseded PR #3 #18

Closed

bernalde force-pushed the pr/pyomo-optimizers branch from 33f6c28 to 1fe6d95 Compare May 30, 2026 22:10

bernalde added 13 commits May 30, 2026 18:10

Address benchmark review feedback

ae38c3b

Package benchmark tooling for CI

da1d255

Stabilize Pch benchmark run

c0d6ebe

Add IPOPT replay validation for SciPy trajectories

94a2eb6

Gate replay validation on SciPy trajectory residuals

10044cc

Validate SciPy trajectories at every recorded point

e3985f0

Fix benchmark artifact cleanup (#18)

144a54b

Add benchmark validation and timeout guards (#15, #16)

c15bb5c

Filter expected edge-case test warnings (#15, #16)

e499648

Address benchmark validation review feedback (#15, #16)

15437b6

Add single-case benchmark runner (#17)

28aaacc

Audit edge-case warning semantics (#23)

d302a19

bernalde force-pushed the pr/benchmarks branch from 9919cfc to d302a19 Compare May 30, 2026 22:10

Merge remote-tracking branch 'origin/pr/pyomo-optimizers' into pr/ben…

fa9683e

…chmarks # Conflicts: # lyopronto/pyomo_models/optimizers.py

bernalde commented May 31, 2026

View reviewed changes

Address benchmark review comments

d11a211

bernalde merged commit 5751964 into pr/pyomo-optimizers May 31, 2026
2 checks passed

		@@ -0,0 +1,27 @@
		{"timestamp":"2026-05-08T12:42:03.725801Z","environment":{"python":"3.13.5","platform":"Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39","numpy":"2.4.2","pyomo":"6.9.5"},"version":2,"task":"Tsh","scenario":"baseline","params":{"product.A1":16.0,"ht.KC":0.000275},"grid":{"param1":{"path":"product.A1","value":16.0},"param2":{"path":"ht.KC","value":0.000275}},"scipy":{"success":true,"wall_time_s":13.709681862965226,"objective_time_hr":12.193457634926368,"solver":{"status":"n/a","termination_condition":"n/a"},"metrics":{"n_points":1221,"final_percent_dried":100.0,"max_Tbot":-24.99999999999993,"monotonic_dried":true,"tsh_bounds_ok":true,"pch_positive":true,"flux_nonnegative":true,"dryness_target_met":true,"product_temp_ok":true,"product_critical_temp":-25.0}},"pyomo":null,"failed":false,"hash.inputs":"272cfbac930314f1","hash.record":"d24b9be03fc7c3f6"}

Conversation

bernalde commented Mar 3, 2026

Summary

Changes

Usage

PR Chain

Testing

Uh oh!

review-notebook-app Bot commented Mar 3, 2026

Uh oh!

bernalde left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bernalde left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bernalde commented May 8, 2026

Uh oh!

bernalde left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

bernalde commented May 8, 2026

Uh oh!

bernalde left a comment

Choose a reason for hiding this comment

Uh oh!

bernalde May 31, 2026

Choose a reason for hiding this comment

Uh oh!

bernalde May 31, 2026

Choose a reason for hiding this comment

Uh oh!

bernalde May 31, 2026

Choose a reason for hiding this comment

Uh oh!

bernalde May 31, 2026

Choose a reason for hiding this comment

Uh oh!

bernalde commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant