V2.0 update #55

Open
kaylode wants to merge 3 commits into master from v2.0-update

kaylode commented Mar 24, 2026

Detailed Changes:

  • Logger Overhaul:

    • Fixed LoggerObserver singleton pattern for thread-safety and session persistence.
    • Added master rank check for distributed training in LoggerObserver.log to prevent duplicate logs across GPUs.
    • Improved LoggerObserver.text() to support multiple args and dictionary-to-JSON serialization.
    • Resolved log duplication by making loguru handlers (StdoutLogger/FileLogger) instance-specific using name filters.
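To illustrate the Logger Overhaul items above, here is a minimal sketch of a lock-guarded singleton with a master-rank check and a `text()` method that accepts several arguments and serializes dictionaries to JSON. `SketchLogger`, `is_master`, and the in-memory `lines` sink are hypothetical names, not the project's `LoggerObserver` API, and reading the rank from the `RANK` environment variable (as exported by `torchrun`) is an assumption.

```python
import json
import os
import threading


class SketchLogger:
    """Hypothetical stand-in for LoggerObserver, not the project's class."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: the lock only matters on first creation.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance.lines = []  # in-memory sink, for illustration only
                    cls._instance = instance
        return cls._instance

    @staticmethod
    def is_master() -> bool:
        # Assumption: the launcher exports RANK; an unset RANK means rank 0.
        return int(os.environ.get("RANK", "0")) == 0

    def text(self, *args) -> None:
        """Join all arguments into one line; dicts are serialized as JSON."""
        if not self.is_master():
            return  # non-master ranks stay silent to avoid duplicate logs
        parts = [
            json.dumps(arg, sort_keys=True, default=str)
            if isinstance(arg, dict)
            else str(arg)
            for arg in args
        ]
        self.lines.append(" ".join(parts))
```

With this sketch, `SketchLogger().text("epoch", 1, {"loss": 0.5})` records a single line on rank 0 and nothing on other ranks, and every call to `SketchLogger()` returns the same object.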
  • Subpackage Decoupling:

    • Moved heavy domain-specific dependencies (lightning, torchvision, transformers) to optional extras.
    • Converted theseus/ and theseus/base/ __init__ files to lazy-loadable structures (removed eager wildcards).
    • Implemented lazy imports in LoggerObserver to break global dependencies on PyTorch/Plotly.
  • ML Module Simplification:

    • Removed monolithic MLPipeline, MLTrainer, and custom ML-specific callbacks/metrics.
    • Replaced with streamlined tradml.py containing direct-fit functions and TradMLTuner with Optuna integration.
    • Fixed critical bugs in LabelEncode (lazy column initialization) and FillNaN (None-value handling).
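The description does not show the `FillNaN` fix itself. As a rough illustration of the two ideas named above (treating `None` as missing alongside NaN, and resolving columns lazily from the data), here is a plain-Python sketch over lists of dicts. `is_missing` and `fill_missing` are hypothetical helpers, not the project's preprocessors, and reading "lazy column initialization" as "columns resolved from the data when none are given" is an assumption.

```python
import math


def is_missing(value) -> bool:
    """Treat both None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_missing(rows, fill_value, columns=None):
    """Return new rows with missing entries in `columns` replaced by `fill_value`."""
    if columns is None:
        # Lazy column resolution: fall back to every key seen in the data.
        columns = sorted({key for row in rows for key in row})
    filled = []
    for row in rows:
        new_row = dict(row)  # copy, so the input is not mutated
        for column in columns:
            # A key absent from the row is also treated as missing.
            if is_missing(new_row.get(column)):
                new_row[column] = fill_value
        filled.append(new_row)
    return filled
```

Checking only `math.isnan` would raise a `TypeError` on `None`, which is why the sketch tests for `None` first.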
  • CI/CD & Infrastructure:

    • Updated pyproject.toml and uv.lock to a modernized dependency stack.
    • Rewrote tabular test suite in tests/tabular/ to use the new simplified functional API.
    • Preserved existing preprocessors, reduction, and visualization utilities.
  • Bump version to 2.0.0

  • Modernize all dependencies (Lightning 2.4+, wandb 0.17+, optuna 3.6+, etc.)

  • Add huggingface-hub and safetensors as core dependencies

  • Replace black/isort with ruff for linting and formatting

  • Modernize Dockerfile (CUDA 12.4, Ubuntu 22.04, uv)

  • Enhance Registry with generics, merge(), get_or_none(), __len__, __getitem__

  • Refactor BasePipeline with shared _PipelineBase, extracted helpers

  • Fix LightningModelWrapper autocast device detection, use optimizers() API

  • Refactor LoggerObserver with dispatch table (O(1) routing)

  • Make Metric an ABC with @abstractmethod decorators

  • Cache inspect.signature in getter.py via lru_cache

  • Add FSDP strategy support to Trainer

  • Add HuggingFaceHubMixin (save/load/push with safetensors)

  • Add HuggingFaceHubCallback for automatic model pushing

  • Modernize all GitHub workflows (actions v4/v5, Python 3.11, uv, caching)

  • Add lint.yml workflow (ruff check + format)

  • Add release.yml workflow (PyPI Trusted Publishing via OIDC)

  • Create AGENT.md development guide
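The Registry item in the list above names `merge()`, `get_or_none()`, `__len__`, and `__getitem__`, but not their signatures. A generic registry with those methods might be sketched as follows. `SketchRegistry`, the `register` method, and the conflict rule in `merge` are assumptions made for illustration, not the project's actual `Registry`.

```python
from typing import Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class SketchRegistry(Generic[T]):
    """Hypothetical typed registry; method signatures are assumptions."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items: Dict[str, T] = {}

    def register(self, key: str, item: T) -> T:
        if key in self._items:
            raise KeyError(f"{key!r} is already registered in {self.name!r}")
        self._items[key] = item
        return item

    def get_or_none(self, key: str) -> Optional[T]:
        # Unlike __getitem__, a missing key is not an error here.
        return self._items.get(key)

    def merge(self, other: "SketchRegistry[T]") -> None:
        # Assumption: entries from `other` win when keys collide.
        self._items.update(other._items)

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, key: str) -> T:
        return self._items[key]
```

Parameterizing the class, for example `SketchRegistry[type]`, lets a type checker verify what `registry["name"]` returns, which is the usual motivation for adding generics to a registry.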

kaylode added 3 commits March 24, 2026 18:25
- Bump version to 2.0.0
- Modernize all dependencies (Lightning 2.4+, wandb 0.17+, optuna 3.6+, etc.)
- Add huggingface-hub and safetensors as core dependencies
- Replace black/isort with ruff for linting and formatting
- Modernize Dockerfile (CUDA 12.4, Ubuntu 22.04, uv)
- Enhance Registry with generics, merge(), get_or_none(), __len__, __getitem__
- Refactor BasePipeline with shared _PipelineBase, extracted helpers
- Fix LightningModelWrapper autocast device detection, use optimizers() API
- Refactor LoggerObserver with dispatch table (O(1) routing)
- Make Metric an ABC with @abstractmethod decorators
- Cache inspect.signature in getter.py via lru_cache
- Add FSDP strategy support to Trainer
- Add HuggingFaceHubMixin (save/load/push with safetensors)
- Add HuggingFaceHubCallback for automatic model pushing
- Modernize all GitHub workflows (actions v4/v5, Python 3.11, uv, caching)
- Add lint.yml workflow (ruff check + format)
- Add release.yml workflow (PyPI Trusted Publishing via OIDC)
- Create AGENT.md development guide
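
The `inspect.signature` caching mentioned in this commit can be illustrated with `functools.lru_cache`. The sketch below is not the contents of `getter.py`; `cached_signature` and `filter_kwargs` are hypothetical names, and filtering config keys against a callable's parameters is an assumed use case.

```python
import inspect
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_signature(fn):
    """Compute inspect.signature once per callable (callables are hashable)."""
    return inspect.signature(fn)


def filter_kwargs(fn, kwargs):
    """Keep only the keyword arguments that `fn` can accept."""
    params = cached_signature(fn).parameters
    # A **kwargs parameter means the callable accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {key: value for key, value in kwargs.items() if key in params}
```

Repeated calls for the same callable then reuse the cached `Signature` object instead of re-inspecting it, which helps when many objects are built from config in a loop.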
…seus ML framework

Detailed Changes:

- Logger Overhaul:
    - Fixed `LoggerObserver` singleton pattern for thread-safety and session persistence.
    - Added master rank check for distributed training in `LoggerObserver.log` to prevent duplicate logs across GPUs.
    - Improved `LoggerObserver.text()` to support multiple args and dictionary-to-JSON serialization.
    - Resolved log duplication by making `loguru` handlers (StdoutLogger/FileLogger) instance-specific using name filters.

- Subpackage Decoupling:
    - Moved heavy domain-specific dependencies (lightning, torchvision, transformers) to optional extras.
    - Converted `theseus/` and `theseus/base/` __init__ files to lazy-loadable structures (removed eager wildcards).
    - Implemented lazy imports in `LoggerObserver` to break global dependencies on PyTorch/Plotly.

- ML Module Simplification:
    - Removed monolithic `MLPipeline`, `MLTrainer`, and custom ML-specific callbacks/metrics.
    - Replaced with streamlined `tradml.py` containing direct-fit functions and `TradMLTuner` with Optuna integration.
    - Fixed critical bugs in `LabelEncode` (lazy column initialization) and `FillNaN` (None-value handling).

- CI/CD & Infrastructure:
    - Updated pyproject.toml and uv.lock to a modernized dependency stack.
    - Rewrote tabular test suite in `tests/tabular/` to use the new simplified functional API.
    - Preserved existing preprocessors, reduction, and visualization utilities.