Kafka rebalance executed is not the same as the one you approved #12199

@scholzj

Description

Originally posted by @dariocazas in https://github.com/orgs/strimzi/discussions/12193


I am running some experiments with the KafkaRebalance CR, and I noticed the following:

  • When I request a rebalance proposal, I get an optimisation result...
  • When I approve the KafkaRebalance CR (less than 1 minute after it reaches the ProposalReady state) and the CR moves into Rebalancing, the optimisation result has changed

For example (and this is not the only run where it happened):

KafkaRebalance before approval
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kafka.strimzi.io/v1beta2","kind":"KafkaRebalance","metadata":{"annotations":{},"labels":{"strimzi.io/cluster":"my-cluster"},"name":"rebalance-request","namespace":"my-namespace"},"spec":{}}
  creationTimestamp: "2025-12-01T17:17:37Z"
  generation: 1
  labels:
    strimzi.io/cluster: my-cluster
  name: rebalance-request
  namespace: my-namespace
  resourceVersion: "1411465137"
  uid: c3574075-a164-407f-ab5b-fa21fd6ebaf6
spec: {}
status:
  conditions:
  - lastTransitionTime: "2025-12-01T17:17:40.453269532Z"
    status: "True"
    type: ProposalReady
  observedGeneration: 1
  optimizationResult:
    afterBeforeLoadConfigMap: rebalance-request
    dataToMoveMB: 18033
    excludedBrokersForLeadership: []
    excludedBrokersForReplicaMove: []
    excludedTopics: []
    intraBrokerDataToMoveMB: 0
    monitoredPartitionsPercentage: 100
    numIntraBrokerReplicaMovements: 0
    numLeaderMovements: 999
    numReplicaMovements: 799
    onDemandBalancednessScoreAfter: 88.64420023211895
    onDemandBalancednessScoreBefore: 70.37164300879449
    provisionRecommendation: ""
    provisionStatus: RIGHT_SIZED
    recentWindows: 1
  sessionId: a66855b2-48fc-4e51-84a2-0affa8f03563

KafkaRebalance after approval
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kafka.strimzi.io/v1beta2","kind":"KafkaRebalance","metadata":{"annotations":{},"labels":{"strimzi.io/cluster":"my-cluster"},"name":"rebalance-request","namespace":"my-namespace"},"spec":{}}
  creationTimestamp: "2025-12-01T17:17:37Z"
  generation: 1
  labels:
    strimzi.io/cluster: my-cluster
  name: rebalance-request
  namespace: my-namespace
  resourceVersion: "1411466488"
  uid: c3574075-a164-407f-ab5b-fa21fd6ebaf6
spec: {}
status:
  conditions:
  - lastTransitionTime: "2025-12-01T17:18:19.565304272Z"
    status: "True"
    type: Rebalancing
  observedGeneration: 1
  optimizationResult:
    afterBeforeLoadConfigMap: rebalance-request
    dataToMoveMB: 7525
    excludedBrokersForLeadership: []
    excludedBrokersForReplicaMove: []
    excludedTopics: []
    intraBrokerDataToMoveMB: 0
    monitoredPartitionsPercentage: 100
    numIntraBrokerReplicaMovements: 0
    numLeaderMovements: 553
    numReplicaMovements: 706
    onDemandBalancednessScoreAfter: 88.64420023211895
    onDemandBalancednessScoreBefore: 70.37164300879449
    provisionRecommendation: ""
    provisionStatus: RIGHT_SIZED
    recentWindows: 1
  sessionId: 97949d1d-f6f6-47e3-bb4f-68f37c12c190

As you can see in the CRs:

  • The time between ProposalReady and Rebalancing is 39s
  • The proposed dataToMoveMB is about 18 GB (18033 MB), but the rebalance being executed reports about 7.5 GB (7525 MB)
  • The proposed numLeaderMovements is 999, but the rebalance being executed reports 553
  • ...
  • Curious part: the balancedness scores before and after are the same in both results
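To make the comparison concrete, the following is a minimal Python sketch that diffs the relevant status fields of the two CR snapshots above and computes the time between the two conditions. All values are copied from the CRs in this issue; the helper names `changed_fields` and `parse_timestamp` are hypothetical and are not part of Strimzi or Cruise Control.

```python
from datetime import datetime, timezone

# Status fields copied from the "before approval" CR shown above.
before = {
    "dataToMoveMB": 18033,
    "numLeaderMovements": 999,
    "numReplicaMovements": 799,
    "onDemandBalancednessScoreAfter": 88.64420023211895,
    "onDemandBalancednessScoreBefore": 70.37164300879449,
    "sessionId": "a66855b2-48fc-4e51-84a2-0affa8f03563",
}

# Status fields copied from the "after approval" CR shown above.
after = {
    "dataToMoveMB": 7525,
    "numLeaderMovements": 553,
    "numReplicaMovements": 706,
    "onDemandBalancednessScoreAfter": 88.64420023211895,
    "onDemandBalancednessScoreBefore": 70.37164300879449,
    "sessionId": "97949d1d-f6f6-47e3-bb4f-68f37c12c190",
}


def changed_fields(old: dict, new: dict) -> dict:
    """Return {field: (old_value, new_value)} for every field that differs."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }


def parse_timestamp(ts: str) -> datetime:
    """Parse a UTC timestamp with nanosecond precision, truncated to microseconds."""
    base, frac = ts.rstrip("Z").split(".")
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=int(frac[:6].ljust(6, "0")), tzinfo=timezone.utc)


changed = changed_fields(before, after)
elapsed_seconds = (
    parse_timestamp("2025-12-01T17:18:19.565304272Z")
    - parse_timestamp("2025-12-01T17:17:40.453269532Z")
).total_seconds()

for field, (old_value, new_value) in changed.items():
    print(f"{field}: {old_value} -> {new_value}")
print(f"elapsed between conditions: {elapsed_seconds:.1f}s")
```

In this data the movement counts, dataToMoveMB and sessionId differ between the two snapshots, while both balancedness scores are identical.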

During my tests, I tried:

  • The shared execution (without goals, so that the precomputed proposal is used)
  • Executing with specific goals (different from the configured defaults), with a similar effect
  • Changing proposal.expiration.ms from the default 15m to 24h (just in case...)

So... my questions are:

  • Is this expected?
  • And if the answer is yes... does the approval flow make sense? Because in the end, you are approving a proposal blindly...

I asked the question in Slack first: https://cloud-native.slack.com/archives/CMH3Q3SNP/p1764611155301549
