feat: robust i18n pipeline with CI integration and nested key support #5
0xPepeSilvia wants to merge 6 commits into tari-project:main from
Replaces legacy flat-file scripts with a proper Python package (aiteen/). Adds CLI (click), config loading (YAML + dotenv), deep-merge patch logic, placeholder-aware QA, and full nested-key audit support. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
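As a rough illustration of what "nested-key audit support" can mean, the sketch below flattens both locale trees to dotted keys before comparing them. The names `flatten` and `find_missing` appear in the PR description, but the bodies here are assumptions written for illustration and may differ from the code in aiteen/audit.py.

```python
from typing import Any


def flatten(tree: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn {'a': {'b': 1}} into {'a.b': 1} (hypothetical implementation)."""
    flat: dict[str, Any] = {}
    for key, value in tree.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty sections.
            flat.update(flatten(value, dotted))
        else:
            # Strings, numbers, and empty dicts are treated as leaves.
            flat[dotted] = value
    return flat


def find_missing(source: dict[str, Any], target: dict[str, Any]) -> list[str]:
    """Return dotted keys present in the source locale but absent in the target."""
    target_keys = flatten(target).keys()
    return [key for key in flatten(source) if key not in target_keys]


en = {"title": "Wallet", "settings": {"theme": {"dark": "Dark", "light": "Light"}}}
de = {"title": "Wallet", "settings": {"theme": {"dark": "Dunkel"}}}

missing = find_missing(en, de)
# A top-level comparison sees identical keys and reports nothing missing.
top_level_only = sorted(set(en) - set(de))
```

Comparing flattened keys is what lets the audit catch `settings.theme.light` even though both files share the same top-level sections.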
74 tests covering: nested-key detection (the case competitor PR tari-project#2 missed), placeholder preservation, deep-merge correctness, QA placeholder mismatch reporting, retry logic, and missing-API-key error handling. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
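The placeholder checks mentioned in this commit could look roughly like the sketch below. `PLACEHOLDER_RE` and the `PLACEHOLDER_MISMATCH` issue name come from the PR description; the regular expression, the `check_placeholders` helper, and the issue format are assumptions for illustration only.

```python
import re
from collections import Counter

# Assumed pattern: matches {name} tokens and simple HTML-like tags such as <b> or </b>.
PLACEHOLDER_RE = re.compile(r"\{\w+\}|</?\w+>")


def check_placeholders(key: str, source: str, translation: str) -> list[str]:
    """Report an issue when the translation's placeholders differ from the source's."""
    expected = Counter(PLACEHOLDER_RE.findall(source))
    actual = Counter(PLACEHOLDER_RE.findall(translation))
    if expected != actual:
        return [f"PLACEHOLDER_MISMATCH: {key}"]
    return []


source = "You have <b>{count}</b> new messages"
ok = check_placeholders("inbox.new", source, "Sie haben <b>{count}</b> neue Nachrichten")
bad = check_placeholders("inbox.new", source, "Sie haben <b>3</b> neue Nachrichten")
```

Using a `Counter` rather than a set means a placeholder that appears twice in the source must also appear twice in the translation.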
translate.yml: triggers on changes to locales/en/**/*.json, runs the full aiteen pipeline, and commits the results via git-auto-commit-action. test.yml: runs pytest on push/PR across Python 3.10, 3.11, 3.12. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
examples/universe.yaml — 12 target languages for tari-project/universe. examples/wxtm-bridge.yaml — 4 target languages for the WXTM Bridge frontend. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code Review
This pull request introduces Aiteen v2, a robust i18n translation pipeline for Tari projects, featuring modules for auditing missing translations, OpenAI-driven translation, locale file patching, and quality assurance. The project structure has been updated to use setuptools, and a comprehensive test suite is included. Feedback identifies a logic flaw in the patching module that could lead to data loss during the unflattening of nested keys, and points out a redundant return statement in the CLI's error handling.
# Convert existing to flat, override, then unflatten - preserves order
# of source keys while letting us write nested dotted keys cleanly.
flat_existing = flatten(existing)
for key, val in flat_translations.items():
    flat_existing[key] = val
merged = unflatten(flat_existing)
# Deep-merge against the original existing dict to keep any keys
# `flatten` may have collapsed (e.g. empty-dict leaves).
merged = deep_merge(existing, merged)
The current logic for merging translations is flawed and can lead to data loss. By flattening the existing data and the new translations into a single dictionary and then unflattening, you create conflicts when one key is a prefix of another (e.g., a and a.b). The unflatten function is not designed to handle this and will destructively overwrite values depending on key order, causing either existing data or new translations to be lost.
A safer and simpler approach is to unflatten only the new translations and then deep-merge the resulting structure into the existing data. This correctly handles adding nested keys without the risk of data loss from key conflicts during the unflattening process.
- # Convert existing to flat, override, then unflatten - preserves order
- # of source keys while letting us write nested dotted keys cleanly.
- flat_existing = flatten(existing)
- for key, val in flat_translations.items():
-     flat_existing[key] = val
- merged = unflatten(flat_existing)
- # Deep-merge against the original existing dict to keep any keys
- # `flatten` may have collapsed (e.g. empty-dict leaves).
- merged = deep_merge(existing, merged)
+ # Unflatten the new translations and deep-merge them into the existing data.
+ new_translations = unflatten(flat_translations)
+ merged = deep_merge(existing, new_translations)
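To make the suggested approach concrete, here is a minimal, self-contained sketch. The `unflatten` and `deep_merge` bodies are assumptions written for illustration; the real helpers in aiteen/patch.py are not shown in this review and may behave differently.

```python
from copy import deepcopy
from typing import Any


def unflatten(flat: dict[str, Any], sep: str = ".") -> dict[str, Any]:
    """Expand {'a.b': 1} into {'a': {'b': 1}} (hypothetical implementation)."""
    result: dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(sep)
        node = result
        for part in parents:
            child = node.get(part)
            if not isinstance(child, dict):
                child = {}
                node[part] = child
            node = child
        node[leaf] = value
    return result


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `base` with `override` merged in, recursing into dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


existing = {"common": {"ok": "OK"}, "footer": {}}
flat_translations = {"common.cancel": "Abbrechen", "nav.home": "Start"}

# Only the new translations are unflattened; existing data is never flattened.
merged = deep_merge(existing, unflatten(flat_translations))
```

Because `existing` never passes through `flatten`, sibling keys such as `common.ok` and empty sections such as `footer` survive without a second merge pass.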
    cfg = _make_config(ctx.obj)
except (click.UsageError, FileNotFoundError, ValueError, OSError) as e:
    _handle_error(e)
    return
- patch.py: unflatten only the new translations and deep-merge into existing instead of flattening both dicts together; the old approach caused data loss when one key was a prefix of another (e.g. 'a' and 'a.b') because unflatten would destructively overwrite values
- cli.py: remove unreachable `return` after `_handle_error(e)` in run_all(); _handle_error calls sys.exit(1) so the return never executes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
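The reason the `return` is unreachable can be sketched as follows. This is an illustrative stand-in, not the code from cli.py: `load_config` and the simplified `run_all` signature are hypothetical, and only the `sys.exit(1)` behaviour of `_handle_error` is taken from the commit message above.

```python
import sys
from typing import NoReturn


def _handle_error(exc: Exception) -> NoReturn:
    """Print a one-line error and exit with status 1 (no stack trace)."""
    print(f"Error: {exc}", file=sys.stderr)
    sys.exit(1)


def load_config(path: str) -> dict:
    """Hypothetical loader that always fails, to exercise the error path."""
    raise FileNotFoundError(f"Config file not found: {path}")


def run_all(path: str) -> dict:
    try:
        cfg = load_config(path)
    except (FileNotFoundError, ValueError, OSError) as e:
        # sys.exit raises SystemExit, so control never continues past this call
        # and a trailing `return` here would be dead code.
        _handle_error(e)
    return cfg


try:
    run_all("missing.yaml")
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
```

Annotating the handler as `NoReturn` also tells type checkers that `cfg` is always bound when the final `return cfg` is reached.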
Both Gemini issues addressed in ...
HIGH, data loss in patch.py: Removed the flatten-merge-unflatten pattern. Now we ...
MEDIUM, unreachable ...
Closes #1
Summary
- `aiteen` Python package (audit, translate, patch, qa, cli, config modules)
- `dict.update()` (another failure in PR #2, "fix: Replace with a supported robust solution")
- (`aiteen audit | translate | patch | qa | run-all`) with `--dry-run`, `--fail-on-missing`, `--fail-on-issue` flags
Acceptance Criteria
- `aiteen/audit.py::find_missing`, `tests/test_audit.py::test_detects_missing_top_level_keys`
- `aiteen/audit.py::flatten`, `tests/test_audit.py::test_detects_missing_nested_keys`
- ... `{count}`, `<b>`) in translations: `aiteen/translate.py::PLACEHOLDER_RE`, `tests/test_translate.py::test_placeholder_*`
- `aiteen/qa.py::qa_locale`, `tests/test_qa.py::test_catches_placeholder_stripping`
- `aiteen/patch.py::deep_merge`, `tests/test_patch.py::test_deep_merge_nested`
- `aiteen/cli.py::_handle_error`, adversarial testing
Differentiators vs Competitor PRs
PR #2 (ledgerpilot):
- `openai.Completion` API (removed in openai>=1.0): will crash immediately
- `dict.update()` (shallow merge): overwrites entire nested sections
PR #3 (Tolgee):
- `pip install`, one `OPENAI_API_KEY` env var
PR #4:
Test Results
All 74 tests pass across:
`test_audit.py` (18), `test_patch.py` (16), `test_qa.py` (15), `test_translate.py` (25).
CI Workflows
- `.github/workflows/translate.yml`: triggers on `**/locales/en/**/*.json` changes in any PR, runs `aiteen run-all`, auto-commits translations
- `.github/workflows/test.yml`: runs pytest on Python 3.10, 3.11, 3.12 on push and PR to main
Config Examples
tari-project/universe (`examples/universe.yaml`): 12 languages (ar, de, es, fr, id, ja, ko, pt, ru, tr, vi, zh)
WXTM Bridge (`examples/wxtm-bridge.yaml`): 4 languages (ar, de, es, fr)
Usage:
Adversarial Testing Passed
- `Error: Malformed JSON in de/common.json: line 1 col 2` (exit 1, no stack trace)
- `OPENAI_API_KEY` → `Error: OPENAI_API_KEY is not set. Provide it via environment variable...` (exit 1)
- `Error: Locales directory not found: /path/to/dir` (exit 1)
- `{count}` stripped from translation → `PLACEHOLDER_MISMATCH` issue reported
🤖 Generated with Claude Code