Skip to content

Compare paper OCP against LyoPRONTO baselines#35

Merged
bernalde merged 18 commits into
fix/issue-29-problem-2-ocpfrom
fix/issue-30-baseline-comparison
Jun 3, 2026
Merged

Compare paper OCP against LyoPRONTO baselines#35
bernalde merged 18 commits into
fix/issue-29-problem-2-ocpfrom
fix/issue-30-baseline-comparison

Conversation

@bernalde
Copy link
Copy Markdown
Member

@bernalde bernalde commented May 11, 2026

Summary

  • Add Issue Compare paper-reference OCP results against LyoPRONTO baselines #30 benchmark notes comparing the paper-reference direct-transcription OCPs against current LyoPRONTO quasi-steady baselines.
  • Include a clear comparison table covering drying time, active constraints, controls, and temperature behavior.
  • Make the recommendation explicit: keep the paper model validation-only and route LyoPRONTO-facing adapter work through Add experimental LyoPRONTO policy OCP adapter #31.
  • Add a regression test that validates the documented comparison values against the tracked benchmark summaries.
  • Include Problem 2 (Add Paper Problem 2 flux-constrained OCP benchmark #29) follow-ups now present on this branch: initialize_paper_problem_from_trajectory is the problem-neutral initializer, initialize_paper_problem1_from_trajectory remains a backward-compatible alias, exports/docs use the neutral name, and tests cover the Problem 2 interface-velocity constraint skip plus refined n_z=20 classification behavior.

Tests run

  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_benchmarks.py::test_objective_time_stats_reports_empty_method tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison -q - passed, 2 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_benchmarks.py -q - passed, 21 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/ -m "not notebook and not pyomo and not slow" -q - passed, 252 tests.
  • python -m ruff check tests/test_benchmarks.py - passed.
  • python -m ruff format --check tests/test_benchmarks.py - passed, 1 file already formatted.
  • python -m mypy lyopronto - passed; mypy emitted the existing warning that python_version = 3.8 is no longer supported by this mypy version.
  • git diff --check - passed.

Notes about tests not run

  • Full Pyomo, notebook, and slow tests were not rerun for the final benchmark-helper change. Earlier Problem 2 review follow-up validation ran the targeted fast Pyomo/model-doc tests and the refined n_z=20 slow Problem 2 validation.
  • A plain targeted pytest invocation without PYTHONPATH=. failed during collection in this shell with ModuleNotFoundError: No module named 'benchmarks'; the successful commands above set the repository root on PYTHONPATH.

Closes #30

Copy link
Copy Markdown
Member Author

@bernalde bernalde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking issues: none.

Nonblocking issues: none.

Questions: none.

Tests run and outcomes:

  • Verified documented benchmark ranges against benchmarks/results/baseline_Tsh_3x3_summary.jsonl and benchmarks/results/baseline_Pch_3x3_summary.jsonl with jq; the documented ranges match the committed data.
  • python -m ruff check tests/test_benchmarks.py passed.
  • python -m ruff format --check tests/test_benchmarks.py passed.
  • python -m pytest tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison -q passed, 1 test.
  • python -m pytest tests/test_benchmarks.py -q passed, 18 tests.
  • python -m pytest tests/ -m "not notebook and not pyomo and not slow" -q passed, 247 tests.
  • python -m pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models -q passed, 262 tests.

I did not run the full Pyomo or notebook suites locally because this PR changes benchmark notes and a lightweight documentation guard test; the CI-style non-Pyomo suite passed, and the stacked PR's visible GitHub checks (Deploy docs, doctests) are green.

Merge recommendation: merge as-is.

@bernalde bernalde force-pushed the fix/issue-29-problem-2-ocp branch from 1788e7f to d34032d Compare May 30, 2026 22:12
@bernalde bernalde force-pushed the fix/issue-30-baseline-comparison branch from 9e0629f to 8db5985 Compare May 30, 2026 22:13
@bernalde
Copy link
Copy Markdown
Member Author

bernalde commented Jun 3, 2026

Carrying forward the review of #34, which was merged into pr/benchmarks before the review could gate it. Since this PR (#35) builds on fix/issue-29-problem-2-ocp, the Problem 2 code those comments referenced is now in this PR's base. None of the findings were blocking, so the merge is not a correctness problem, but the follow-ups below remain valid and are tracked here so they aren't lost. (The original inline comments are on #34 at lines paper_ocp.py:1357, :1778, :1988.)

Tests at the time, on a clean checkout of the #34 head: test_paper_ocp.py -m "not slow" 20 passed; the slow Problem 2 solve test passed; full tests/test_pyomo_models/ 201 passed / 4 skipped / 1 xfailed; ruff check and format clean. No regressions.

Blocking: none.

Nonblocking:

  • initialize_paper_problem1_from_trajectory is now the shared initializer for both Problem 1 and Problem 2 but keeps a Problem-1-specific name. Consider renaming to a problem-agnostic name (e.g. initialize_paper_problem_from_trajectory) with a thin backward-compatible alias, since it is exported in __init__.

Questions:

  • The interface-velocity path constraint skips only the first collocation point (m.t.first()). Was single-point skipping chosen deliberately over skipping the first finite element, and is it expected to remain feasible under the deferred n_z=20 refinement? The coarse n_z=5 solve confirms post-initial max violation <= 5e-10, but a finer mesh may place the second collocation point inside the documented initial excursion.
  • Policy 1 vs Policy 3 disambiguation in classify_paper_policies relies on the solved interface velocity dropping below the limit (minus velocity_tolerance, default 2e-9) during the max-heat segment. Confirm this holds on refined meshes; otherwise a max-heat segment could be mislabeled Policy 3.

Copy link
Copy Markdown
Member Author

@bernalde bernalde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review as maintainer. Note: I am the PR author, so GitHub only permits a COMMENT here; the formal approval or change request must come from another maintainer. There are no blocking issues.

Scope verified: diff against merge-base 5f2c3f7 matches the PR totals (2 files, +61 / -0). Base branch is fix/issue-29-problem-2-ocp (stacked PR), head is fix/issue-30-baseline-comparison. The change is documentation-only (a Paper-vs-LyoPRONTO baseline comparison section in docs/PAPER_OCP_VALIDATION.md) plus one docs-regression test in tests/test_benchmarks.py.

Claims verified by running, not just reading. The doc quotes drying-time ranges attributed to the committed baseline summaries; I parsed those files (benchmarks/results/baseline_Tsh_3x3_summary.jsonl and baseline_Pch_3x3_summary.jsonl, both tracked, pre-existing in the base, not part of this diff) and every quoted figure matches exactly:

  • Tsh SciPy 12.19-14.47 h, mean 13.33 h - matches.
  • Tsh Pyomo FD 12.40-14.76 h, mean 13.58 h - matches.
  • Tsh Pyomo collocation 12.00-14.23 h, mean 13.11 h - matches.
  • Pch SciPy 18.12-23.67 h, FD 18.15-23.71 h, collocation 17.90-23.37 h - all match.
    The documented grid (product.A1=16,18,20; ht.KC=2.75e-4,3.3e-4,4.0e-4; 9 combos) matches the data, and the documented grid_cli.py generate invocation uses only flags that benchmarks/grid_cli.py actually supports (--task, --scenario, --vary, --methods, --n-elements, --n-collocation, --out). The -25 degC product limit, 98.9% dryness target, 243 K (Problem 1) and 240 K (Problem 2) limits, and the 273 K Problem 1 shelf max all match the data and config.

Issue intent (#30): the PR links Closes #30. This is an analysis/documentation issue and the link is appropriate - all three acceptance criteria are met: (1) benchmark notes include a clear comparison table; (2) the recommendation is explicit (keep the paper model validation-only and adapt policy constraints into LyoPRONTO); (3) adapter requirements are converted into an implementation issue (#31, already in progress as #36). The four work items (define comparable scenarios, run scipy/Pyomo where comparable, report drying time/constraints/controls/temperature, and separate formulation vs transcription differences) are all addressed. Closes #30 will correctly auto-close on merge.

Tests I ran on a clean checkout of the PR head:

  • PYTHONPATH=. pytest tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison: 1 passed.
  • PYTHONPATH=. pytest tests/test_benchmarks.py: 20 passed (the PR description's 18 predates tests added since).
  • ruff check and ruff format --check on tests/test_benchmarks.py: clean.
    The PYTHONPATH=. requirement noted in the PR description reproduces: without it, collection fails with ModuleNotFoundError: No module named 'benchmarks'. That is a pre-existing repo-layout quirk, not introduced here.
  1. Blocking issues: none.

  2. Nonblocking issues:

  • The new regression test asserts only that literal strings (the full table header and exact numeric ranges such as "12.19-14.47 h") are present in the doc (inline comment). It does not validate those numbers against the JSONL summaries, so if the baselines are regenerated and the figures shift, the doc and data can silently drift while the test still passes. Consider parsing the committed summaries and asserting the documented ranges against them, so the test guards correctness rather than mere presence.
  1. Questions: none.

  2. Tests run and outcomes: targeted docs test (1 passed), tests/test_benchmarks.py (20 passed), ruff clean. All green; no genuine failures.

  3. Merge-readiness: Ready on the merits. Documentation-only change with all quantitative claims independently verified against the committed data, acceptance criteria met, and a passing guard test. The single nonblocking item does not gate merge. It targets fix/issue-29-problem-2-ocp, so it should land there as that stack reaches main. A second maintainer must provide the formal approval since I authored this PR.

Comment thread tests/test_benchmarks.py Outdated
in notes
)
assert "benchmarks/results/baseline_Tsh_3x3_summary.jsonl" in notes
assert "12.19-14.47 h" in notes
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nonblocking: this test guards only that the literal strings (table header, exact ranges like "12.19-14.47 h") remain present in the doc; it does not check them against benchmarks/results/baseline_Tsh_3x3_summary.jsonl. I verified all quoted ranges match the data today, but if the baselines are regenerated the doc and data could drift while this test stays green. Consider parsing the committed summaries (objective_time_hr grouped by scipy / pyomo discretization.method) and asserting min/max/mean against the documented figures, so the test validates correctness rather than presence.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 7aabde6. The test now computes the documented ranges from the tracked benchmark summaries and checks the markdown values against those computed stats. Relevant test: tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison.

@bernalde
Copy link
Copy Markdown
Member Author

bernalde commented Jun 3, 2026

Commits pushed:

  • 7aabde6 Validate paper OCP notes against benchmark data

Main changes made:

  • Addressed the unresolved inline review thread on tests/test_benchmarks.py.
  • Updated test_paper_ocp_notes_cover_issue30_comparison so it parses the tracked baseline_Tsh_3x3_summary.jsonl and baseline_Pch_3x3_summary.jsonl files, computes min/max/mean objective times by method, and verifies the documented table/paragraph values against those computed summaries.

Tests run and results:

  • PYTHONPATH=. pytest tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison -q: passed, 1 test.
  • PYTHONPATH=. pytest tests/test_benchmarks.py -q: passed, 20 tests.
  • PYTHONPATH=. pytest tests/ -m "not notebook and not pyomo and not slow" -q: passed, 251 tests.
  • python -m ruff check tests/test_benchmarks.py: passed.
  • python -m ruff format --check tests/test_benchmarks.py: passed.
  • python -m mypy lyopronto: passed; mypy emitted the existing warning that python_version = 3.8 is no longer supported by this mypy version.
  • git diff --check: passed.

Comments intentionally not addressed:

Remaining risks or follow-up items:

  • Full Pyomo, notebook, and slow suites were not rerun for this docs/test-only update.
  • The carried-forward Problem 2 API/model follow-ups remain open for a future PR.

@bernalde
Copy link
Copy Markdown
Member Author

bernalde commented Jun 3, 2026

Commits pushed:

  • 36250bb Address Problem 2 review follow-ups

Main changes made:

  • Addressed the carried-forward Problem 2 comment from Add Paper Problem 2 OCP benchmark #34.
  • Added initialize_paper_problem_from_trajectory as the problem-neutral initializer and kept initialize_paper_problem1_from_trajectory as a backward-compatible alias.
  • Updated internal solve paths, package exports, README docs, and export tests for the new initializer name.
  • Added fast coverage that the Problem 2 interface-velocity constraint skips only the initial collocation point.
  • Added a slow n_z=20 Problem 2 validation that checks terminal feasibility, post-initial interface-velocity feasibility, and the expected Policy 3 -> Policy 1 -> Policy 2 classification.
  • Updated PAPER_OCP_VALIDATION.md to document the initial-point skip and refined n_z=20 validation.

Tests run and results:

  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_pyomo_models/test_init.py tests/test_pyomo_models/test_paper_ocp.py tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison -m "not slow" -q: passed, 27 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_pyomo_models/test_paper_ocp.py::test_problem2_nz20_solve_keeps_velocity_feasible_and_classifies_policy -q: passed, 1 test.
  • PYTHONPATH=. pytest tests/ -m "not notebook and not pyomo and not slow" -q: passed, 251 tests.
  • python -m ruff check lyopronto/pyomo_models/paper_ocp.py lyopronto/pyomo_models/__init__.py tests/test_pyomo_models/test_paper_ocp.py tests/test_pyomo_models/test_init.py: passed.
  • python -m ruff format --check lyopronto/pyomo_models/paper_ocp.py lyopronto/pyomo_models/__init__.py tests/test_pyomo_models/test_paper_ocp.py tests/test_pyomo_models/test_init.py: passed.
  • python -m mypy lyopronto: passed; mypy emitted the existing warning that python_version = 3.8 is no longer supported by this mypy version.
  • git diff --check: passed.

Comments intentionally not addressed:

  • None from the carried-forward comment.

Remaining risks or follow-up items:

  • Full notebook tests and the full slow Pyomo suite were not rerun; this update ran the new refined Problem 2 slow validation plus the targeted fast Pyomo/model-doc tests.
  • MATLAB/GEKKO upstream export for Problem 2 remains a documented future validation artifact.

Copy link
Copy Markdown
Member Author

@bernalde bernalde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review as maintainer. Note: I am the PR author, so GitHub only permits a COMMENT here; the formal approval must come from another maintainer. I found no blocking, nonblocking, or question-level issues introduced by this PR.

Scope verified: fetched fix/issue-29-problem-2-ocp and fix/issue-30-baseline-comparison; local diff against merge-base 5f2c3f7 matches GitHub PR totals exactly: 7 files changed, +253 / -20. Current PR head reviewed: 36250bb1b81a6d73fcf4dfd28a6aa544a2e5dc1b.

Issue intent (#30): the Closes #30 link is appropriate. The PR defines comparable scenarios, reports drying time / constraints / controls / temperature behavior, separates paper-vs-LyoPRONTO formulation differences from transcription effects, gives an explicit recommendation to keep the paper model validation-only while adapting policy constraints into LyoPRONTO, and points adapter work to #31.

  1. Blocking issues: none.

  2. Nonblocking issues: none.

  3. Questions: none.

  4. Tests run and outcomes:

  • PYTHONPATH=. pytest tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison -q: passed, 1 test.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_pyomo_models/test_init.py tests/test_pyomo_models/test_paper_ocp.py -m "not slow" -q: passed, 26 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_pyomo_models/test_paper_ocp.py::test_problem2_nz20_solve_keeps_velocity_feasible_and_classifies_policy -q: passed, 1 test.
  • PYTHONPATH=. pytest tests/ -n auto -v -m "not notebook and not pyomo" --ignore=tests/test_pyomo_models -q: passed, 266 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_pyomo_models -q: passed, 203 passed / 4 skipped / 1 xfailed.
  • python -m ruff check docs/PAPER_OCP_VALIDATION.md lyopronto/pyomo_models/README.md lyopronto/pyomo_models/__init__.py lyopronto/pyomo_models/paper_ocp.py tests/test_benchmarks.py tests/test_pyomo_models/test_init.py tests/test_pyomo_models/test_paper_ocp.py: passed.
  • python -m ruff format --check lyopronto/pyomo_models/__init__.py lyopronto/pyomo_models/paper_ocp.py tests/test_benchmarks.py tests/test_pyomo_models/test_init.py tests/test_pyomo_models/test_paper_ocp.py: passed.
  • python -m mypy lyopronto: passed; mypy emitted the existing warning that configured python_version = 3.8 is unsupported by this mypy version.
  1. Merge-readiness: ready on the merits. Because I authored the PR, this COMMENT is not a formal approval; another maintainer should provide the approving review.

Copy link
Copy Markdown
Member Author

@bernalde bernalde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review of the updated head 36250bb (the PR grew from 2 files / +61 to 7 files / +253 -20 since my prior review). Note: I am the PR author, so GitHub only permits a COMMENT here; the formal approval or change request must come from another maintainer. There are no blocking issues.

Scope verified: diff against merge-base 5f2c3f7 matches the PR totals (7 files, +253 / -20). Base is fix/issue-29-problem-2-ocp, head is fix/issue-30-baseline-comparison.

The two new commits address prior review feedback well:

  • "Validate paper OCP notes against benchmark data" rewrites test_paper_ocp_notes_cover_issue30_comparison to parse the committed JSONL summaries and assert the documented table cells/ranges against computed min/max/mean, instead of matching hardcoded literals. This resolves the doc/data drift nonblocking point from my previous review.
  • "Address Problem 2 review follow-ups" renames initialize_paper_problem1_from_trajectory to a problem-agnostic initialize_paper_problem_from_trajectory with a backward-compatible alias (both exported, both in test_init's required set), adds a clarifying comment on the velocity-constraint skip, and adds a refined n_z=20 slow Problem 2 test that asserts velocity stays below the active threshold during the Policy 1 max-heat segment. This resolves the initializer-naming nonblocking and the Policy 1 vs Policy 3 disambiguation question carried from the #34 review.

Verified by running, not just reading:

  • The data-driven benchmarks test passes, so the documented Tsh ranges (SciPy 12.19-14.47 mean 13.33; FD 12.40-14.76 mean 13.58; colloc 12.00-14.23 mean 13.11) and Pch ranges match the success-filtered objective_time_hr in the committed summaries.
  • The new slow test test_problem2_nz20_solve_keeps_velocity_feasible_and_classifies_policy actually solves (IPOPT via SolverFactory fallback; idaes absent), 0.72s call, and passes including the assertion that interface velocity < limit - 2e-9 across the Policy 1 segment. The PR description's "tests not run" note flags that slow/Pyomo tests were not run locally, so I ran them: all 5 paper_ocp slow tests pass.
  • The rename has no stale references elsewhere in the repo (grep across py/md/ipynb outside the module and tests is clean).
  • Docs updated consistently: the First-Pass Tolerances section now says "slow tests" and documents the n_z=20 confirmation; the skip rationale is clarified ("only at the initial collocation point, not the first finite element"); the n_z=20 item is removed from Known Limitations now that it is implemented.

Tests I ran on a clean checkout of the head:

  • PYTHONPATH=. pytest tests/test_benchmarks.py: 20 passed.
  • PYTHONPATH=. pytest tests/test_pyomo_models/test_paper_ocp.py tests/test_pyomo_models/test_init.py -m "not slow": 26 passed.
  • PYTHONPATH=. pytest tests/test_pyomo_models/test_paper_ocp.py -m slow: 5 passed (incl. the new n_z=20 solve).
  • ruff check and ruff format --check on the changed module and tests: clean.

Issue intent (#30): Closes #30 remains appropriate. The three acceptance criteria (comparison table, explicit recommendation, adapter requirements converted to #31) are met, and the comparison numbers are now machine-checked against the data.

  1. Blocking issues: none.

  2. Nonblocking issues:

  • Scope/description gap (inline comment on the rename). The PR description and Closes #30 describe only the #30 baseline-comparison docs, but the head also carries #29 Problem 2 code changes: a public function rename with a backward-compatible alias, a constraint comment, and two new tests including a slow n_z=20 solve. A public-API rename should be called out so the squash-merge message and any changelog are accurate. Fix: extend the PR description to list the Problem 2 follow-ups and the rename/alias, or split them into a #29 follow-up PR.
  • Test-helper robustness (inline comment in tests/test_benchmarks.py): _objective_time_stats computes min/max/mean over success-filtered runs; if a future baseline regeneration yields zero successful runs for a method, min([])/max([]) raise a bare ValueError rather than a clear assertion. A small guard would give a clearer failure.
  1. Questions: none. The disambiguation question from the prior review is now answered by the n_z=20 test.

  2. Tests run and outcomes: benchmarks (20 passed), paper_ocp + init fast (26 passed), paper_ocp slow (5 passed incl. new n_z=20), ruff clean. All green; no genuine failures.

  3. Merge-readiness: Ready on the merits. Prior feedback is addressed, new behavior is tested and verified, and the documented numbers are machine-checked against the committed data. The two nonblocking items do not gate merge. It targets fix/issue-29-problem-2-ocp, so it should land there as the stack reaches main. A second maintainer must provide the formal approval since I authored this PR.



def initialize_paper_problem1_from_trajectory(
def initialize_paper_problem_from_trajectory(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nonblocking (scope/description): this is a public-API rename (initialize_paper_problem1_from_trajectory -> initialize_paper_problem_from_trajectory, with a backward-compatible alias below, both exported in init). The rename and the accompanying Problem 2 test additions are not mentioned in the PR description, which describes only the #30 baseline-comparison docs. Please list these Problem 2 (#29) follow-ups in the description, or split them into a separate #29 follow-up PR, so the squash-merge commit and any changelog accurately reflect the public-API change.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed by updating the PR description to list the Problem 2 (#29) public API rename, compatibility alias, exports/docs updates, and Problem 2 test additions. The related code changes are already in 36250bb; this fix was PR metadata.

Comment thread tests/test_benchmarks.py

return {
method: {
"min": min(method_values),
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nonblocking (robustness): min(method_values) / max(method_values) raise a bare ValueError if a method ends up with zero successful runs after a future baseline regeneration. Consider asserting each method has at least one successful run first (e.g. assert method_values, f"no successful {method} runs in {summary_path}") so the failure is self-explanatory rather than an empty-sequence error.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 46aacd8. _objective_time_stats now asserts each method has at least one successful run before min/max/mean, and test_objective_time_stats_reports_empty_method covers the failure message.

@bernalde
Copy link
Copy Markdown
Member Author

bernalde commented Jun 3, 2026

Commits pushed:

  • 46aacd8 Clarify empty benchmark summary failures

Main changes made:

  • Addressed the benchmark robustness inline thread by adding test_objective_time_stats_reports_empty_method.
  • Updated _objective_time_stats so each method bucket must contain at least one successful run before min/max/mean are computed.
  • Updated the PR description to list the Problem 2 (Add Paper Problem 2 flux-constrained OCP benchmark #29) public API rename, compatibility alias, exports/docs updates, and Problem 2 test additions already present on this branch.

Tests run and results:

  • Pre-fix MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_benchmarks.py::test_objective_time_stats_reports_empty_method -q: failed as expected with ValueError: min() iterable argument is empty.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_benchmarks.py::test_objective_time_stats_reports_empty_method tests/test_benchmarks.py::test_paper_ocp_notes_cover_issue30_comparison -q: passed, 2 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/test_benchmarks.py -q: passed, 21 tests.
  • MPLCONFIGDIR=/tmp/matplotlib-cache PYTHONPATH=. pytest tests/ -m "not notebook and not pyomo and not slow" -q: passed, 252 tests.
  • python -m ruff check tests/test_benchmarks.py: passed.
  • python -m ruff format --check tests/test_benchmarks.py: passed.
  • python -m mypy lyopronto: passed; mypy emitted the existing warning that python_version = 3.8 is no longer supported by this mypy version.
  • git diff --check: passed.

Comments intentionally not addressed:

  • None. The older outdated benchmark-doc thread was already addressed and replied to in 7aabde6, so I did not duplicate that inline reply.

Remaining risks or follow-up items:

  • Full Pyomo, notebook, and slow suites were not rerun for the final benchmark-helper change.
  • gh pr edit failed with an unrelated GitHub GraphQL projectCards deprecation error; the PR body was updated successfully through gh api PATCH instead.

… fix/issue-30-baseline-comparison

# Conflicts:
#	docs/PAPER_OCP_VALIDATION.md
@bernalde bernalde merged commit 32b80f1 into fix/issue-29-problem-2-ocp Jun 3, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant