## What happened?
The inference pod is stuck in Pending with the scheduling event `2 node(s) didn't match Pod's node affinity/selector`. The pod's node selector is `models.ome.io/clusterbasemodel.llama-3-2-1b=Ready`, and both GPU nodes carry the label `models.ome.io/clusterbasemodel.llama-3-2-1b=Ready`.
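The mismatch can be checked by comparing the pod's rendered nodeSelector with the node labels; a minimal sketch, using the namespace and label key from the manifests below:

```shell
# Pending pod's nodeSelector as rendered by OME
kubectl get pods -n llama-1b-demo-2 -o jsonpath='{.items[*].spec.nodeSelector}'

# GPU nodes carrying the model-ready label
kubectl get nodes -l models.ome.io/clusterbasemodel.llama-3-2-1b=Ready
```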
## What did you expect to happen?
I would expect the pod to be scheduled on the GPU nodes, since its node selector matches the node label.
## How can we reproduce it (as minimally and precisely as possible)?
Below is the ClusterBaseModel YAML:
```yaml
apiVersion: ome.io/v1beta1
kind: ClusterBaseModel
metadata:
  name: llama-3-2-1b
spec:
  vendor: meta
  version: "3.2"
  disabled: false
  modelType: llama
  modelArchitecture: LlamaForCausalLM
  modelParameterSize: "1B"
  maxTokens: 8192
  modelCapabilities:
    - text-to-text
  modelFormat:
    name: safetensors
    version: "1.0.0"
  modelFramework:
    name: transformers
    version: "4.43.0"
  storage:
    storageUri: "hf://meta-llama/Llama-3.2-1B-Instruct"
    path: "/models/llama-3.2-1b"
    key: "hf-token"
    parameters:
      secretKey: token
    nodeSelector:
      node.kubernetes.io/instance-type: g6.xlarge
```
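For completeness, the node labeling applied after the model download can be checked like this (the `clusterbasemodel` resource name is assumed from the kind above):

```shell
# Per-node value of the model-ready label (an empty column means unlabeled)
kubectl get nodes -L models.ome.io/clusterbasemodel.llama-3-2-1b

# Status of the ClusterBaseModel itself
kubectl get clusterbasemodel llama-3-2-1b -o yaml
```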
And below is the InferenceService YAML:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: llama-1b-demo-2
---
apiVersion: ome.io/v1beta1
kind: InferenceService
metadata:
  name: llama-3-2-1b
  namespace: llama-1b-demo-2
spec:
  predictor:
    model:
      baseModel: llama-3-2-1b
      protocolVersion: openAI
    minReplicas: 1
    maxReplicas: 1
```
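Applying both manifests and describing the predictor pod reproduces the message quoted above; the file names here are just placeholders:

```shell
kubectl apply -f clusterbasemodel.yaml
kubectl apply -f inferenceservice.yaml

# The predictor pod stays Pending; its events show the affinity/selector failure
kubectl get pods -n llama-1b-demo-2
kubectl describe pods -n llama-1b-demo-2
```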
## Anything else we need to know?
## Environment
- OME version: 0.1.3
- Kubernetes version (use `kubectl version`):
- Cloud provider or hardware configuration: NVIDIA L4 GPU
- OS (e.g., from `/etc/os-release`): Ubuntu 22.04
- Runtime (SGLang, vLLM, etc.) and version: SGLang
- Model being served (if applicable): llama 3.2 1B
- Install method (Helm, kubectl, etc.): Helm