feat: add P2.8 ModelCompareAgent for multi-model comparison #4

Merged
Mola-maker merged 1 commit into main from feat/p2.8-model-compare on Apr 20, 2026

@Mola-maker
Owner

New pipeline stage between P2 and P2.5 that ranks candidate models (primary + alternatives + optional user-seeded list) along two dimensions:

  • cheap static metrics: equation count, variable count, constraint count, has_latex, has_solution_method
  • LLM-judged scoring across rigor / feasibility / fit / score_pot
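The static metrics above could be computed along these lines. This is a minimal sketch: the candidate layout (the keys `equations`, `variables`, `constraints`, `latex`, `solution_method`) and the function name `static_metrics` are assumptions for illustration, not the project's actual schema.

```python
from typing import Any


def static_metrics(candidate: dict[str, Any]) -> dict[str, Any]:
    """Compute the cheap, LLM-free metrics for one candidate model.

    The input keys are hypothetical; only the metric names come from
    the PR description.
    """
    return {
        "equation_count": len(candidate.get("equations", [])),
        "variable_count": len(candidate.get("variables", [])),
        "constraint_count": len(candidate.get("constraints", [])),
        "has_latex": bool(candidate.get("latex")),
        "has_solution_method": bool(candidate.get("solution_method")),
    }
```

Missing keys count as zero or `False`, so a sparsely filled candidate still gets a complete metrics record.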

Writes results to modeling.comparison_v2 and sets modeling.selected_model_id without touching modeling.primary_model, so downstream stages can opt in.
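A rough sketch of that write pattern, assuming the pipeline state is a nested dict. The helper name `record_comparison` and the shape of the `comparison_v2` payload are hypothetical; only the three `modeling.*` key names come from the description.

```python
from typing import Any, Optional


def record_comparison(
    state: dict[str, Any], ranking: list[str], method: str
) -> dict[str, Any]:
    """Store comparison results next to, not in place of, primary_model."""
    modeling = state.setdefault("modeling", {})
    # Payload shape is an assumption made for this sketch.
    modeling["comparison_v2"] = {"method": method, "ranking": ranking}
    selected: Optional[str] = ranking[0] if ranking else None
    modeling["selected_model_id"] = selected
    # modeling["primary_model"] is deliberately left untouched.
    return state
```

A downstream stage that wants to opt in would read `modeling["selected_model_id"]`; one that does not keeps using `modeling["primary_model"]` unchanged.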

Degrades gracefully:

  • 0 candidates -> method='skipped', pipeline continues
  • 1 candidate -> method='single', no call_model invocation
  • LLM failure -> method='metrics_only', deterministic fallback ranking by (eq+var+constraint + latex/solve bonuses)
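The three degradation paths can be sketched as follows. Several details are assumptions: the `judge` callable stands in for the LLM call, the success label `"llm"` is invented for the sketch, each bonus is weighted as 1, and ties are broken by candidate id to keep the ordering deterministic.

```python
from typing import Any, Callable

Candidate = dict[str, Any]


def fallback_score(metrics: dict[str, Any]) -> int:
    """eq + var + constraint, plus a bonus of 1 each (assumed weight)."""
    return (
        metrics["equation_count"]
        + metrics["variable_count"]
        + metrics["constraint_count"]
        + (1 if metrics["has_latex"] else 0)
        + (1 if metrics["has_solution_method"] else 0)
    )


def compare(
    candidates: list[Candidate],
    judge: Callable[[list[Candidate]], dict[str, float]],
) -> dict[str, Any]:
    """Rank candidates, degrading instead of raising."""
    if not candidates:
        return {"method": "skipped", "ranking": []}
    if len(candidates) == 1:
        # Nothing to compare, so the judge is never called.
        return {"method": "single", "ranking": [candidates[0]["id"]]}
    try:
        scores = judge(candidates)
        method = "llm"  # hypothetical label for the success path
    except Exception:
        scores = {c["id"]: fallback_score(c["metrics"]) for c in candidates}
        method = "metrics_only"
    # Highest score first; id as a tie-breaker keeps the result stable.
    ranking = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return {"method": method, "ranking": ranking}
```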

Registered as on_error='skip' so P2.8 can never block the pipeline; added to --start choices.
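To illustrate why on_error='skip' keeps the stage from blocking, here is a minimal runner sketch. The tuple-based stage registry, the `"raise"` value for other stages, and the function names are all hypothetical; only the 'skip' policy and the --start flag come from the description.

```python
import argparse
from typing import Any, Callable, Optional

# (name, run function, on_error policy); a made-up registry shape.
Stage = tuple[str, Callable[[dict[str, Any]], None], str]


def run_pipeline(
    stages: list[Stage], state: dict[str, Any], start: Optional[str] = None
) -> list[str]:
    """Run stages in order, swallowing errors from 'skip' stages."""
    names = [name for name, _, _ in stages]
    begin = names.index(start) if start else 0
    executed = []
    for name, run, on_error in stages[begin:]:
        try:
            run(state)
            executed.append(name)
        except Exception:
            if on_error != "skip":
                raise
            executed.append(f"{name} (skipped)")
    return executed


def build_parser(stage_names: list[str]) -> argparse.ArgumentParser:
    """Expose every registered stage name as a valid --start value."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--start", choices=stage_names, default=None)
    return parser
```

With this shape, a failure inside P2.8 is recorded and the loop moves on to P2.5, while a failure in a stage registered with any other policy still propagates.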

@Mola-maker Mola-maker merged commit ee56caa into main Apr 20, 2026
0 of 3 checks passed
@Mola-maker Mola-maker deleted the feat/p2.8-model-compare branch April 20, 2026 10:50