Feature Description
Currently, k6-operator starts the load test immediately after all runner pods are ready, without verifying that the target server is accessible from the runner pods. This causes issues when:
- Target servers have IP restrictions/allowlists that need to be configured before testing
- Network policies or firewall rules block initial connections
- Target services are temporarily unavailable during test startup
When IP restrictions are in place, the test runs but all requests fail continuously, leading to:
- Wasted test execution time with failed requests
- Difficulty distinguishing between actual load testing results and connectivity issues
- Need for manual coordination between infrastructure teams and testing teams
Use Case Example:
- Production servers with strict IP allowlists
- Staging environments with network security policies
- Cloud services requiring IP whitelisting
Suggested Solution
Add a configurable pre-test connectivity check to the TestRun CRD that polls until connectivity succeeds (instead of failing immediately):
Option 1: Built-in connectivity validation

```yaml
apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: k6-sample
spec:
  parallelism: 4
  script:
    configMap:
      name: 'test'
      file: 'test.js'
  preTestValidation:
    enabled: true
    connectivityCheck:
      url: "https://target-server.example.com/health"
      method: "GET"
      expectedStatusCodes: [200, 204]
      timeout: "30s"
    polling:
      interval: "15s"
      backoff:
        type: exponential
        factor: 1.5
        maxInterval: "2m"
      maxDuration: "30m"
      successThreshold: 1
```
The actual test only starts after successful connectivity validation.

Option 2: Custom validation script
```yaml
apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: k6-sample
spec:
  parallelism: 4
  script:
    configMap:
      name: 'test'
      file: 'test.js'
  preTestValidation:
    enabled: true
    validationScript:
      configMap:
        name: 'validation-script'
        file: 'validate.js'
    polling:
      interval: "20s"
      maxDuration: "45m"
```
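One way the operator could consume the validation script's result is to treat the process exit code as the readiness signal, as proposed here (conveniently, k6 itself exits non-zero when thresholds fail, so a plain k6 run of `validate.js` with a threshold would fit this contract). A minimal Go sketch — the `runValidation` helper and the local `sh -c` execution are illustrative stand-ins for executing the script inside a runner pod:

```go
package main

import (
	"fmt"
	"os/exec"
)

// runValidation executes the validation command once and reports whether it
// signalled success (exit code 0). A non-zero exit code means "not ready yet",
// in which case the operator would retry after the configured interval.
func runValidation(command string) (bool, error) {
	err := exec.Command("sh", "-c", command).Run()
	if err == nil {
		return true, nil // exit 0: validation succeeded
	}
	if _, ok := err.(*exec.ExitError); ok {
		return false, nil // non-zero exit: validation failed, retry later
	}
	return false, err // the script could not be run at all
}

func main() {
	ok, _ := runValidation("exit 0")
	fmt.Println("ready:", ok) // ready: true
	ok, _ = runValidation("exit 1")
	fmt.Println("ready:", ok) // ready: false
}
```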
(The script exits 0 on success, non-zero on failure.)

Implementation Details:
- Execute validation from one of the runner pods before starting the actual test
- If validation fails, retry at the configured interval with optional backoff until either:
  - Connectivity succeeds (then proceed to the test), or
  - The maximum duration is exceeded (mark the TestRun as ValidationFailed)
- Provide clear error messages and retry events in the TestRun status for troubleshooting
- Add appropriate status conditions (ValidationPending, ValidationRetrying, ValidationSucceeded, ValidationDeadlineExceeded)
Benefits:
- Prevents wasted test runs due to connectivity issues
- Provides clear feedback about network/access problems
- Enables better coordination between teams
- Improves test reliability and results accuracy
Already existing or connected issues / PRs
Related issues that might benefit from this feature:
- The k6 Operator is not working #102: network connectivity issues causing test failures
- k6-operator manager freeze and don't recover after reaching timeout in initialize phase #128: timeout issues during the initialization phase