
[BUG] Ensure benchmark job pod to be scheduled on right node #408

@XinyueZhang369

Description


What happened?

The benchmark job pod can be scheduled on a node that doesn't actually have the model, because the pod spec lacks the nodeAffinity of the inference service it targets.

What did you expect to happen?

The benchmark job pod spec should carry the same nodeAffinity as the inference service, so it lands on a node where the model is present and can run successfully.
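For illustration, the expected pod spec might include an affinity block mirroring the inference service's. This is a hypothetical sketch: the label key and value below are assumptions, not OME's actual node selectors.

```yaml
# Hypothetical: nodeAffinity copied from the inference service pods,
# restricting the benchmark pod to nodes that hold the model.
# The label key/value here are assumed for illustration only.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: models.ome.io/llama-4-scout-17b-16e-instruct
              operator: In
              values: ["Ready"]
```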

How can we reproduce it (as minimally and precisely as possible)?

Start an inference service:

apiVersion: ome.io/v1beta1
kind: InferenceService
metadata:
  name: config-test
  namespace: actions-runner-system
  annotations:
    sglang.deployed-by: "manual-test"
spec:
  engine:
    minReplicas: 1
    maxReplicas: 1
  model:
    name: llama-4-scout-17b-16e-instruct

Create a benchmark job:

apiVersion: ome.io/v1beta1
kind: BenchmarkJob
metadata:
  name: benchmark-llama-4-scout-17b-16e-instruct
  namespace: actions-runner-system
spec:
  podOverride:
    image: fra.ocir.io/idqj093njucb/xz-genai-bench:dev
  huggingFaceSecretReference:
    name: huggingface-secret
  endpoint:
    inferenceService:
      name: config-test
      namespace: actions-runner-system
  task: text-to-text
  maxTimePerIteration: 10
  maxRequestsPerIteration: 1000
  outputLocation:
    storageUri: "oci://n/idqj093njucb/b/ome-benchmark-results/o/official-sgl/test/llama-4-scout-17b-16e-instruct"
    parameters:
      auth: "instance_principal"
      region: "eu-frankfurt-1"

Anything else we need to know?

Environment

  • OME version: ord.ocir.io/idqj093njucb/ome-manager:v0.1.4-36-g0e8110c
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g., from /etc/os-release):
  • Runtime (SGLang, vLLM, etc.) and version:
  • Model being served (if applicable):
  • Install method (Helm, kubectl, etc.):
