Adapt aca-model to the pylcm #361 API restructure #12

Open
hmgaudecker wants to merge 12 commits into main from refactor/pylcm-361-api-restructure

@hmgaudecker hmgaudecker commented May 22, 2026

Summary

Adapt aca-model to pylcm's #361 API restructure:

  • Adopt the public lcm/ / private _lcm/ package split and the API reorganisation.
  • Adopt the Regime two-class split, grid renames (Piece → PiecewiseGridSegment, *Process classes), the FlatParams rename, and the regime → regime_id / regime_name distinction.
  • Declare distributed=True on the assets grid for multi-GPU sharding.
  • Activate the beartype claw on the package; add a claw regression test.
  • Apply the boilerplate update: hatch-vcs versioning, refreshed pre-commit hooks, expanded .gitignore.

Test plan

  • pixi run -e tests-cpu tests
  • pixi run -e type-checking ty
  • prek run --all-files

🤖 Generated with Claude Code

@hmgaudecker hmgaudecker force-pushed the refactor/pylcm-361-api-restructure branch from c1c976f to ab8b67f Compare May 22, 2026 14:47
Adopt pylcm's public `lcm/` / private `_lcm/` package split and the
accompanying API reorganisation: the `Regime` two-class split, the
grid renames (`Piece` → `PiecewiseGridSegment`, `*Process` classes),
the `FlatParams` rename, and the `regime` → `regime_id` / `regime_name`
distinction. Declare `distributed=True` on the assets grid for
multi-GPU sharding, activate the beartype claw on the package, and
apply the boilerplate update (hatch-vcs versioning, refreshed
pre-commit hooks, expanded `.gitignore`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hmgaudecker hmgaudecker force-pushed the refactor/pylcm-361-api-restructure branch from ab8b67f to 39ac270 Compare May 22, 2026 14:48
hmgaudecker and others added 11 commits May 22, 2026 20:44
The terminal `dead` regime carries only a tiny `[pref_type, assets]`
value function. Inheriting the distributed `assets` grid made its
V-array topology claim a sharded assets axis, while the solver emits a
replicated array — a mismatch that surfaces as an opaque XLA sharding
error mid-solve on multi-GPU runs. Sharding a terminal 3x24 V-array
buys nothing; declare `assets` non-distributed for `dead`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`GridConfig.subjects_batch_size_by_log_level` is a mapping from
`log_level` string (`"off"`, `"warning"`, `"progress"`, `"debug"`) to
the per-device simulate chunk size. Empty by default — the lookup
helper `GridConfig.get_subjects_batch_size(log_level)` returns 0,
matching the existing no-chunking behaviour.

`create_model` (both baseline and aca variants) gains an optional
`subjects_batch_size` keyword, forwarded to the pylcm `Model`. Callers
in aca-estimation look up the value via `grid_config.get_subjects_batch_size(log_level)`
keyed on the same `log_level` they pass to `model.simulate(...)`, so
each task automatically gets the chunk size sized for its diagnostic
budget.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
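
A minimal sketch of the lookup described in the commit above, assuming the mapping is a plain dict-like field. Only `subjects_batch_size_by_log_level` and `get_subjects_batch_size` come from the commit message; the dataclass layout and the chunk sizes are illustrative assumptions, not the aca-model implementation.

```python
from collections.abc import Mapping
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GridConfig:
    """Hypothetical stand-in for the real GridConfig (layout is an assumption)."""

    # Maps log_level ("off", "warning", "progress", "debug") to the
    # per-device simulate chunk size. Empty by default.
    subjects_batch_size_by_log_level: Mapping[str, int] = field(default_factory=dict)

    def get_subjects_batch_size(self, log_level: str) -> int:
        # A missing key falls back to 0, i.e. the existing no-chunking behaviour.
        return self.subjects_batch_size_by_log_level.get(log_level, 0)


default_config = GridConfig()
# Illustrative values only: a smaller chunk for the more verbose level.
tuned_config = GridConfig(
    subjects_batch_size_by_log_level={"off": 4096, "debug": 256}
)
```

A caller would then pass `tuned_config.get_subjects_batch_size(log_level)` as `subjects_batch_size` to `create_model`, using the same `log_level` it later hands to `model.simulate(...)`.
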
aca-model now passes `subjects_batch_size` to `Model(...)` (see
ec29319), which is a new field introduced on the
`feat/distributed-V-arrays` branch (PR #364) on top of
`refactor/phase-2-api-reorganisation` (PR #361). The CI pin still
pointed at #361, so the `pip install` pulled a pylcm without
`subjects_batch_size`, and every Model-construction test raised
`TypeError: Model.__init__() got an unexpected keyword argument`.

Re-point the CI pin to the PR #364 branch. Will move to `main` once
#361 and #364 land.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Solve's per-period `max_Q_over_a` integrand spans the full state
grid by default; on A100 even with the assets axis distributed
across 4 GPUs the working set runs against the 80 GB device limit.
Splaying the pref-type axis with a Python loop (`batch_size=1`)
shrinks the per-kernel allocation by `n_pref_types`. Default stays
`0` (single fused kernel) — the production GridConfig overrides
opt into `1` when the unsplayed kernel doesn't fit.
The assets axis is hardcoded `distributed=True` in the regimes, and
pylcm's grid-init guard rejects `batch_size > 0` on a distributed grid.
The default has to satisfy that constraint, or every fresh
`GridConfig()` raises `GridInitializationError` at construction.
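
The constraint can be illustrated with a small stand-alone guard. This is a sketch under stated assumptions: `SketchGrid` and its fields are hypothetical, and `GridInitializationError` is redefined locally rather than imported from pylcm, whose actual check may differ.

```python
from dataclasses import dataclass


class GridInitializationError(Exception):
    """Local stand-in for the pylcm exception named in the commit message."""


@dataclass(frozen=True)
class SketchGrid:
    """Hypothetical grid carrying only the fields needed for the guard."""

    n_points: int
    distributed: bool = False
    batch_size: int = 0  # 0 means a single fused kernel

    def __post_init__(self) -> None:
        # Assumed rule: a distributed axis cannot also be batched.
        if self.distributed and self.batch_size > 0:
            msg = (
                f"batch_size={self.batch_size} is not allowed on a "
                "distributed grid; use batch_size=0."
            )
            raise GridInitializationError(msg)


# Constructs fine: distributed with the default batch_size of 0.
assets = SketchGrid(n_points=24, distributed=True)
```

With a guard of this shape, a config default of `batch_size=1` on a distributed axis would fail at construction time, which is the failure the commit avoids.
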
The pylcm sharding-consistency validator now requires every state to
carry the same `distributed` flag in every regime that declares it.
Restoring `assets` to the shared grid lets the dead regime's V-array
sharding match the alive regimes that transition into it.

Also drops the test that codified the workaround.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
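
The rule enforced by the sharding-consistency validator can be sketched as follows. The function name, the dict layout, and the regime name `working` are hypothetical; this is not pylcm's validator, only an illustration of "same `distributed` flag in every regime that declares the state".

```python
def find_inconsistent_distributed_flags(
    regimes: dict[str, dict[str, bool]],
) -> dict[str, dict[str, bool]]:
    """Return states whose `distributed` flag differs across declaring regimes.

    `regimes` maps regime name -> {state name -> distributed flag}.
    """
    flags_by_state: dict[str, dict[str, bool]] = {}
    for regime_name, states in regimes.items():
        for state_name, distributed in states.items():
            flags_by_state.setdefault(state_name, {})[regime_name] = distributed
    # A state is inconsistent if more than one distinct flag value was seen.
    return {
        state: flags
        for state, flags in flags_by_state.items()
        if len(set(flags.values())) > 1
    }


# The earlier workaround: `dead` declares `assets` as non-distributed.
workaround = {
    "working": {"assets": True, "pref_type": False},
    "dead": {"assets": False, "pref_type": False},
}
# After this commit: `dead` shares the distributed `assets` grid again.
restored = {
    "working": {"assets": True, "pref_type": False},
    "dead": {"assets": True, "pref_type": False},
}
```

Under this sketch the workaround configuration is flagged on `assets`, while the restored configuration passes.
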
… test.

`consumption_dollars_floor` is a DAG function output; the scalar key
in `params` is `consumption_equiv_floor`. Looking up the function name
raised `KeyError` before the constraint check ran.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…fixes.

Picks up the validated cross-grid (`health_trans_cross`) and same-grid
pre-65 (`health_trans_pre65`) transition matrices that now carry valid
probability rows at every source-active age (51-64).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
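
What "valid probability rows at every source-active age" means can be checked with a few lines of plain Python. The data layout (a dict from age to a list of rows), the function name, and the toy 2x2 matrices are assumptions for illustration; the real `health_trans_cross` and `health_trans_pre65` arrays may be stored differently.

```python
import math


def invalid_rows(matrices_by_age, ages, tol=1e-9):
    """Return (age, row_index) pairs whose row is not a probability distribution."""
    bad = []
    for age in ages:
        for i, row in enumerate(matrices_by_age[age]):
            non_negative = all(p >= 0.0 for p in row)
            sums_to_one = math.isclose(sum(row), 1.0, abs_tol=tol)
            if not (non_negative and sums_to_one):
                bad.append((age, i))
    return bad


source_active_ages = range(51, 65)  # ages 51 to 64 inclusive

# Toy matrices: every row sums to one at every age.
valid = {age: [[0.9, 0.1], [0.2, 0.8]] for age in source_active_ages}

# Same data with an all-zero row at age 60, the kind of gap a fix would close.
broken = dict(valid)
broken[60] = [[0.9, 0.1], [0.0, 0.0]]
```

Running `invalid_rows` over both dictionaries reports no problems for `valid` and pinpoints the zero row in `broken`.
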
…filter handles unreachable per-target.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>