chore: parallelize python integration tests#435
Conversation
There was a problem hiding this comment.
Pull request overview
Parallelizes the Python integration test suite using pytest-xdist by sharing a single Docker-based Fluss cluster across workers, aligning the Python CI behavior with the existing multi-language testing approach.
Changes:
- Updated Python test fixtures to start/reuse a shared Docker cluster across xdist workers and clean it up in
pytest_unconfigure. - Added
pytest-xdistto Python dev dependencies. - Updated the Python CI workflow to run tests with
-n autoand expanded.gitignorefor macOS artifacts.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| bindings/python/test/conftest.py | Implements xdist-aware shared cluster startup/reuse and centralized cleanup; adds readiness/connection retry helpers. |
| bindings/python/pyproject.toml | Adds pytest-xdist to the dev optional dependency set. |
| .github/workflows/build_and_test_python.yml | Runs Python integration tests in parallel via pytest -n auto. |
| .gitignore | Ignores macOS build/debug artifacts (*.dylib, *.dSYM/). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Addressed comments |
luoyuxia
left a comment
There was a problem hiding this comment.
@fresh-borzoni Thanks for the pr.
charlesdong1991
left a comment
There was a problem hiding this comment.
Nice PR, small thoughts but non-blocking!
| zookeeper.start() | ||
| coordinator.start() | ||
| tablet_server.start() | ||
| except Exception as e: |
There was a problem hiding this comment.
this is a bit broad here, maybe consider narrowing to docker specific errors?
btw IIUC, we want to start fluss clsuter exactly once per test session
according to pytest-xdist: https://pytest-xdist.readthedocs.io/en/latest/how-to.html#making-session-scoped-fixtures-execute-only-once , they recommend to use file lock to coordinate startup
There was a problem hiding this comment.
Good call 👍
file lock approcach is cleaner, but it requires more restructuring, so I just used the same approach as in cpp for consistency.
Let's create follow up if someone wishes to improve this and play with XDIST.
I don't expect tests time to change though :)
| wait for the other worker's cluster to become ready. | ||
| """ | ||
| # Reuse cluster started by another parallel worker or previous run. | ||
| if _wait_for_port("localhost", PLAIN_CLIENT_PORT, timeout=1): |
There was a problem hiding this comment.
i am curious if 1s too short?
There was a problem hiding this comment.
it's a fast check, so we are just quickly checking if there is warm cluster running already, usual integration CI/CD would always have this fast path failed.
For local dev if i wish to quickly iterate on tests, i just comment out cleanup and the next run would reuse existing cluster and run superfast, as the cluster is already warm. So I left this hook if someone wishes to use it as well.
One improvement: we may use some env var and wire it in a such way that we can specify to retain cluster and skip cleanup. But it's followup, the main focus was integration tests.
|
|
||
| def _run_cmd(cmd): | ||
| """Run a command (list form), return exit code.""" | ||
| return subprocess.run(cmd, capture_output=True).returncode |
There was a problem hiding this comment.
if run fails, we will only see error code, but not the error message now, not sure if it will hinder debugging if needed?
There was a problem hiding this comment.
for now it's used for docker network call only, so not much to debug, but valid observation if used for more stuff later 👍
Summary
closes #372
Pallelize Python integration tests using pytest-xdist, matching the C++/Rust pattern. First worker starts the Docker cluster, others detect and reuse it via port check. Containers are cleaned up after all workers finish via pytest_unconfigure.