Skip to content

chore: parallelize python integration tests#435

Merged
luoyuxia merged 2 commits intoapache:mainfrom
fresh-borzoni:parallelize-python-tests
Mar 8, 2026
Merged

chore: parallelize python integration tests#435
luoyuxia merged 2 commits intoapache:mainfrom
fresh-borzoni:parallelize-python-tests

Conversation

@fresh-borzoni
Copy link
Contributor

Summary

closes #372

Pallelize Python integration tests using pytest-xdist, matching the C++/Rust pattern. First worker starts the Docker cluster, others detect and reuse it via port check. Containers are cleaned up after all workers finish via pytest_unconfigure.

@fresh-borzoni
Copy link
Contributor Author

@fresh-borzoni fresh-borzoni changed the title chore: parallelize python tests chore: parallelize python integration tests Mar 7, 2026
@luoyuxia luoyuxia requested a review from Copilot March 8, 2026 02:27
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Parallelizes the Python integration test suite using pytest-xdist by sharing a single Docker-based Fluss cluster across workers, aligning the Python CI behavior with the existing multi-language testing approach.

Changes:

  • Updated Python test fixtures to start/reuse a shared Docker cluster across xdist workers and clean it up in pytest_unconfigure.
  • Added pytest-xdist to Python dev dependencies.
  • Updated the Python CI workflow to run tests with -n auto and expanded .gitignore for macOS artifacts.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

File Description
bindings/python/test/conftest.py Implements xdist-aware shared cluster startup/reuse and centralized cleanup; adds readiness/connection retry helpers.
bindings/python/pyproject.toml Adds pytest-xdist to the dev optional dependency set.
.github/workflows/build_and_test_python.yml Runs Python integration tests in parallel via pytest -n auto.
.gitignore Ignores macOS build/debug artifacts (*.dylib, *.dSYM/).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@fresh-borzoni
Copy link
Contributor Author

Addressed comments

@luoyuxia luoyuxia merged commit bb6933a into apache:main Mar 8, 2026
8 checks passed
Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fresh-borzoni Thanks for the pr.

Copy link
Contributor

@charlesdong1991 charlesdong1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR, small thoughts but non-blocking!

zookeeper.start()
coordinator.start()
tablet_server.start()
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit broad here, maybe consider narrowing to docker specific errors?

btw IIUC, we want to start fluss clsuter exactly once per test session
according to pytest-xdist: https://pytest-xdist.readthedocs.io/en/latest/how-to.html#making-session-scoped-fixtures-execute-only-once , they recommend to use file lock to coordinate startup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call 👍
file lock approcach is cleaner, but it requires more restructuring, so I just used the same approach as in cpp for consistency.

Let's create follow up if someone wishes to improve this and play with XDIST.
I don't expect tests time to change though :)

wait for the other worker's cluster to become ready.
"""
# Reuse cluster started by another parallel worker or previous run.
if _wait_for_port("localhost", PLAIN_CLIENT_PORT, timeout=1):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am curious if 1s too short?

Copy link
Contributor Author

@fresh-borzoni fresh-borzoni Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a fast check, so we are just quickly checking if there is warm cluster running already, usual integration CI/CD would always have this fast path failed.

For local dev if i wish to quickly iterate on tests, i just comment out cleanup and the next run would reuse existing cluster and run superfast, as the cluster is already warm. So I left this hook if someone wishes to use it as well.

One improvement: we may use some env var and wire it in a such way that we can specify to retain cluster and skip cleanup. But it's followup, the main focus was integration tests.


def _run_cmd(cmd):
"""Run a command (list form), return exit code."""
return subprocess.run(cmd, capture_output=True).returncode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if run fails, we will only see error code, but not the error message now, not sure if it will hinder debugging if needed?

Copy link
Contributor Author

@fresh-borzoni fresh-borzoni Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now it's used for docker network call only, so not much to debug, but valid observation if used for more stuff later 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run python integration tests in parallel

4 participants