msft-preview: 3.31.0 rebase #448

Merged
Redent0r merged 7 commits into up/3.31.0 from saul/msft-preview2
Jun 1, 2026
@Redent0r Redent0r commented May 20, 2026

Test Methodology

CI: https://dev.azure.com/mariner-org/mariner/_build/results?buildId=1127027&view=logs&j=1d103282-d184-539c-6f02-9cecc7887239&t=7581b7d8-55b2-5d9a-91e4-4f9c30e5e89d:

For comparison, this is what our current diff (msft-preview) looks like: #451

Noteworthy changes:

@Redent0r Redent0r force-pushed the saul/msft-preview2 branch 2 times, most recently from eb49730 to ac1ca3d Compare May 21, 2026 18:45
@Redent0r

Author

todo: without ac1ca3d, the uvm build fails with https://dev.azure.com/mariner-org/mariner/_build/results?buildId=1122834&view=logs&j=bbe135f3-541e-58e6-8b02-61d4669cdf5f&t=5095dea9-1b86-57b3-726c-ffe250f063d1&l=1963 . Figure out what changed and whether we need to upstream this.

@Redent0r Redent0r force-pushed the saul/msft-preview2 branch 3 times, most recently from e84953f to f7bb7a0 Compare May 22, 2026 17:00
Comment thread AGENT.md Outdated
@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from f7bb7a0 to 7d0903d Compare May 22, 2026 17:49
@Redent0r

Author

todo: without ac1ca3d, the uvm build fails with https://dev.azure.com/mariner-org/mariner/_build/results?buildId=1122834&view=logs&j=bbe135f3-541e-58e6-8b02-61d4669cdf5f&t=5095dea9-1b86-57b3-726c-ffe250f063d1&l=1963 . Figure out what changed and whether we need to upstream this.

Mystery solved. The shellcheck fix in kata-containers@6471894#diff-9cff33aa4403bdc6b3a82a5fb4abd4aa4c6038938f8d0f63bbf9c5e9456f7695 is the cause. In particular, updating

local agent_version=$(cat ${agentdir}/VERSION 2> /dev/null)
[ -z "$agent_version" ] && agent_version="unknown"

to

local agent_version
agent_version=$(cat "${agentdir}/VERSION" 2> /dev/null)
[[ -z "${agent_version}" ]] && agent_version="unknown"

makes us fail because we don't copy VERSION during package_tools_install.sh. So we'll copy that now.

Demo showing change in behaviour.

#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

missing_file="/definitely-missing-file"

demo_local_declaration_assignment() {
	echo "demo 1: local declaration with command substitution"
	local value=$(cat "${missing_file}" 2>/dev/null)
	[[ -z "${value}" ]] && value="unknown"
	echo "demo 1 result: ${value}"
}

demo_split_declaration_assignment() {
	echo "demo 2: split declaration and assignment"
	local value
	value=$(cat "${missing_file}" 2>/dev/null)
	[[ -z "${value}" ]] && value="unknown"
	echo "demo 2 result: ${value}"
}

main() {
	echo "Running with set -euo pipefail"
	echo "Missing file: ${missing_file}"
	echo

	echo "Expect demo 1 to survive and print unknown."
	demo_local_declaration_assignment
	echo "demo 1 completed"
	echo

	echo "Expect demo 2 to exit before printing a result."
	demo_split_declaration_assignment
	echo "demo 2 completed"
}

main "$@"
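
The difference comes from which exit status errexit sees. In `local value=$(...)` the status of the line is that of `local`, which succeeds, so the failed `cat` is masked. After the split, the plain assignment returns the status of `cat`, and `set -o errexit` aborts the script. This PR fixes the build by copying VERSION. A different option, shown here only as a sketch, is to make the fallback explicit. `read_version_or_unknown` is a hypothetical helper, not a function from the kata-containers scripts.

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper (not part of the kata-containers scripts).
# Prints the content of a version file, or "unknown" if it cannot be read.
read_version_or_unknown() {
	local version_file="$1"
	local value
	# The `||` list makes the failing `cat` a tested command, so errexit
	# does not fire here even though declaration and assignment are split.
	value=$(cat "${version_file}" 2>/dev/null) || value="unknown"
	# Also treat an empty file as unknown.
	if [[ -z "${value}" ]]; then
		value="unknown"
	fi
	echo "${value}"
}

read_version_or_unknown "/definitely-missing-file"   # prints: unknown
```

This keeps the shellcheck-friendly split declaration while preserving the old "unknown" fallback.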

@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from 7d0903d to d87a305 Compare May 22, 2026 22:13
@Redent0r Redent0r marked this pull request as ready for review May 26, 2026 14:50
@Redent0r Redent0r marked this pull request as draft May 26, 2026 14:51
@Redent0r Redent0r force-pushed the saul/msft-preview2 branch 2 times, most recently from a543c5f to 7977900 Compare May 26, 2026 23:09
@Redent0r Redent0r marked this pull request as ready for review May 26, 2026 23:09
@Redent0r

Author

DO NOT MERGE: we'll force push to msft-preview instead once approved

@Redent0r Redent0r requested a review from sprt May 26, 2026 23:11
@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from 7977900 to de5785f Compare May 27, 2026 00:27
@sprt sprt removed the do-not-merge label May 27, 2026

@sprt sprt left a comment

This almost LGTM! A few asks:

  • Can you change the first commit message to this:
ci: switch default branch to msft-preview

* Update the default branch to msft-preview in different places for
  the CI to work with our fork.
* Add the MSFT-required SECURITY.md and corresponding dictionary entries.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
  • Can you link to passing BB/conformance/performance test runs in the PR description?

Re: the actual merge, I actually think it's better to lean even more into this up/3.31.0 branch to have a paper trail. So I removed the WIP label and when this PR is fully ready, let us:

  1. Ensure the gatekeeper is green.
  2. Get approvals.
  3. Merge this PR into up/3.31.0 as is (WITH the last commit with the temporary up/3.31.0 CI branch changes).
  4. Then we can just do the following to force push msft-preview:
git checkout msft-preview
git reset --hard up/3.31.0 # This clones up/3.31.0 into msft-preview
git reset --hard HEAD^ # This removes the last commit with the temporary up/3.31.0 CI branch changes
git push -f # Push the new msft-preview
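
The effect of the two resets can be tried in a throwaway repository before touching the real branch. The sketch below is only an illustration: the commit subjects are made up, the push is left out, and it assumes the temporary CI commit is the tip of up/3.31.0 (with a merge commit at the tip, `HEAD^` would select the first parent instead).

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Throwaway repository; all history below is made up for illustration.
repo=$(mktemp -d)
cd "${repo}"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

commit() { git commit -q --allow-empty -m "$1"; }

# Stand-in for the old msft-preview history.
git checkout -q -b msft-preview
commit "old fork commit"

# Stand-in for up/3.31.0 after the merge: rebased work plus the temporary CI commit.
git checkout -q --orphan up/3.31.0
commit "rebased fork commit"
commit "temporary up/3.31.0 CI branch changes"

# The sequence from the comment above, without the final push.
git checkout -q msft-preview
git reset -q --hard up/3.31.0   # msft-preview now points at the up/3.31.0 tip
git reset -q --hard HEAD^       # drop the temporary CI commit

git log --format=%s msft-preview   # prints: rebased fork commit
```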

@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from de5785f to f5bb7ae Compare May 27, 2026 20:28
@Redent0r

Author

addressed and trying to re-require a few more tests (will squash if it works)

@Redent0r Redent0r force-pushed the saul/msft-preview2 branch 4 times, most recently from ade9dad to c49acbe Compare May 27, 2026 23:48
@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from c49acbe to 8c8ccbf Compare May 28, 2026 16:32
* Update the default branch to msft-preview in different places for
  the CI to work with our fork.
* Add the MSFT-required SECURITY.md and corresponding dictionary entries.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from 8c8ccbf to 89a7d3f Compare May 28, 2026 16:38

@sprt sprt left a comment

Just one nit as we wait for the test results!

Comment thread tools/osbuilder/node-builder/azure-linux/common.sh Outdated
Manuel Huber and others added 6 commits May 28, 2026 13:26
For runtime-go and runtime-rs. See below for details

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

tools: Add initial igvm-builder and node-builder/azure-linux scripting

This branch starts introducing additional scripting to build, deploy
and evaluate the components used in AKS' Pod Sandboxing and
Confidential Containers preview features. This includes the capability
to build the IGVM file and its reference measurement file for remote
attestation.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Improve igvm-builder and node-builder/azure-linux scripting

- Support for Mariner 3 builds using OS_VERSION variable
- Improvements to IGVM build process and flow as described in README
- Adoption of using only cloud-hypervisor-cvm on CBL-Mariner

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Add package-tools-install functionality

- Add script to install kata-containers(-cc)-tools bits
- Minor improvements in README.md
- Minor fix in package_install
- Remove echo outputs in package_build

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Enable setting IGVM SVN

- Allow setting SVN parameter for IGVM build scripting

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: introduce BUILD_TYPE variable

This lets developers build and deploy Kata in debug mode without having to make
manual edits to the build scripts.

With BUILD_TYPE=debug (default is release):

 * The agent is built in debug mode.
 * The agent is built with a permissive policy (using allow-all.rego).
 * The shim debug config file is used, ie. we create the symlink
   configuration-clh-snp-debug.toml <- configuration-clh-snp.toml.

For example, building and deploying Kata-CC in debug mode is now as simple as:

   make BUILD_TYPE=debug all-confpods deploy-confpods

Also do note that make still lets you override the other variables even after
setting BUILD_TYPE. For example, you can use the production shim config with
BUILD_TYPE=debug:

   make BUILD_TYPE=debug SHIM_USE_DEBUG_CONFIG=no all-confpods deploy-confpods

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
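
The override behaviour described in this commit message can be sketched in shell. This is not the node-builder Makefile, only an illustration of "BUILD_TYPE picks defaults, explicit variables still win". `resolve_build_settings` is a hypothetical helper; `BUILD_TYPE` and `SHIM_USE_DEBUG_CONFIG` are the names from the commit message.

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: prints the effective settings for the current environment.
resolve_build_settings() {
	local build_type="${BUILD_TYPE:-release}"
	local debug_default="no"
	if [[ "${build_type}" == "debug" ]]; then
		debug_default="yes"
	fi
	# An explicitly set variable wins over the BUILD_TYPE-derived default.
	local shim_debug="${SHIM_USE_DEBUG_CONFIG:-${debug_default}}"
	echo "BUILD_TYPE=${build_type} SHIM_USE_DEBUG_CONFIG=${shim_debug}"
}

# Debug build that still uses the production shim config.
BUILD_TYPE=debug SHIM_USE_DEBUG_CONFIG=no resolve_build_settings
# prints: BUILD_TYPE=debug SHIM_USE_DEBUG_CONFIG=no
```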

node-builder: introduce SHIM_REDEPLOY_CONFIG

See README: when SHIM_REDEPLOY_CONFIG=no, the shim configuration is NOT
redeployed, so that potential config changes made directly on the host
during development aren't lost.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
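
A minimal sketch of the idea, assuming the variable defaults to "yes" (the commit message does not state the default). `deploy_shim_config` and the file names are hypothetical and do not come from the node-builder scripts.

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: copies a shim config unless SHIM_REDEPLOY_CONFIG=no.
deploy_shim_config() {
	local src="$1"
	local dst="$2"
	if [[ "${SHIM_REDEPLOY_CONFIG:-yes}" == "no" ]]; then
		# Keep whatever is on the host, including manual edits.
		echo "keeping existing ${dst}"
		return 0
	fi
	cp "${src}" "${dst}"
	echo "deployed ${dst}"
}

workdir=$(mktemp -d)
echo "packaged config" > "${workdir}/src.toml"
echo "edited on host" > "${workdir}/dst.toml"

SHIM_REDEPLOY_CONFIG=no deploy_shim_config "${workdir}/src.toml" "${workdir}/dst.toml"
cat "${workdir}/dst.toml"   # prints: edited on host
```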

node-builder: Use img for Pod Sandboxing

Switch from UVM initrd to image format

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Adapt README instructions

- Sanitize containerd config snippet
- Set podOverhead for Kata runtime class

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Adapt AGENT_POLICY_FILE path

- Adapt path in uvm_build.sh script to comply
  with the upstream changes we pulled in

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Use Azure Linux 3 as default path

- update recipe and node-builder scripting
- change default value on rootfs-builder

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Deploy-only for AzL3 VMs

- split deployment sections in node-builder README.md
- install jq, curl dependencies within IGVM script
- add path parameter to UVM install script

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Minor updates to README.md

- no longer install make package, is part of meta package
- remove superfluous popd
- add note on permissive policy for ConfPods UVM builds

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Updates to README.md

- with the latest 3.2.0.azl4 package on PMC, can remove OS_VERSION parameter
  and use the make deploy calls instead of copying files by hand for variant
  I (now aligned with Variant II)
- with the latest changes on msft-main, set the podOverhead to 600Mi

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Fix SHIM_USE_DEBUG_CONFIG behavior

Using a symlink would create a cycle after calling this script again when
copying the final configuration at line 74 so we just use cp instead.

Also, I moved this block to the end of the file to properly override the final
config file.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: Build and install debug configuration for pod sandboxing

For ease of debugging, install a configuration-clh-debug.toml for pod
sandboxing as we do in Conf pods.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>

runtime: remove clh-snp config file usage in makefile

Not needed to build vanilla kata

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

package_tools_install.sh: include nsdax.gpl.c

Include nsdax.gpl.c

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

node-builder: fix typo in string comparison

This also fixes a shellcheck error and lets us require the
shellcheck-required job:

In ./tools/osbuilder/node-builder/azure-linux/uvm_build.sh line 34:
        if [ -z "${UVM_KERNEL_HEADER_DIR}}" ]; then
                                         ^-- SC2157 (error): Argument to -z is always false due to literal strings.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
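
Why SC2157 matters here: the stray `}` is a literal character, so the tested string is never empty and the `-z` branch can never run. A small sketch, with hypothetical function names, that shows both variants side by side:

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Variant with the typo: the extra "}" is appended to the expansion.
check_with_typo() {
	if [ -z "${1}}" ]; then echo "empty"; else echo "non-empty"; fi
}

# Corrected variant.
check_fixed() {
	if [ -z "${1}" ]; then echo "empty"; else echo "non-empty"; fi
}

check_with_typo ""   # prints: non-empty
check_fixed ""       # prints: empty
```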

docs: node-builder: fix static check error

This fixes the below static check error to follow up on the infra fix from
kata-containers#11646:

2025-07-31T19:32:45.0031829Z time="2025-07-31T19:32:44.990004665Z" level=fatal msg="found 2 parse errors:\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Set up environment\" (heading: {Name:Set up environment MDName:Set up environment LinkName:set-up-environment Level:2})\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Install build dependencies\" (heading: {Name:Install build dependencies MDName:Install build dependencies LinkName:install-build-dependencies Level:2})" commit=1d17f56b1aa7a880468b8e25d14467c92dca8eeb name=kata-check-markdown pid=9075 source=check-markdown version=0.0.1

Note: that is likely flagged because having two headings with the same
name, even under different sections, makes it impossible to create a
canonical heading link in Markdown.

This should eventually be squashed into the node-builder commit.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

docs: node-builder: Remove references to moby-containerd-cc

As we adopted containerd2, we remove references to our prior
forked containerd version.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: 2Mb aligned guest image size

Build the mariner guest image using IMAGE_SIZE_ALIGNMENT_MB=2.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
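
As an illustration of what a 2 MB alignment means for the image size, the sketch below rounds a size up to the next multiple of the alignment. `align_up_mb` is a hypothetical helper; how osbuilder applies IMAGE_SIZE_ALIGNMENT_MB internally is not shown in this PR and may differ.

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: round size_mb up to a multiple of alignment_mb.
align_up_mb() {
	local size_mb="$1"
	local alignment_mb="$2"
	echo $(( (size_mb + alignment_mb - 1) / alignment_mb * alignment_mb ))
}

align_up_mb 257 2   # prints: 258
align_up_mb 256 2   # prints: 256
```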

to-squash: node-builder: add reference to README.md

This is needed to avoid the following static-checks error:

2025-08-05T21:27:20.0028337Z [static-checks.sh:808] ERROR: Document tools/osbuilder/node-builder/azure-linux/README.md is not referenced

This commit is to be squashed into the node-builder commit.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: build and install runtime-rs
Build and install both runtime-rs and runtime-go configs and binaries side by side:
  - runtime-go:
      /usr/local/bin/containerd-shim-kata-v2-go
      /usr/local/share/defaults/kata-containers/configuration-clh.toml
      /usr/local/share/defaults/kata-containers/configuration-clh-debug.toml

  - runtime-rs:
      /usr/local/bin/containerd-shim-kata-v2-rs
      /usr/local/share/defaults/kata-containers/configuration-cloud-hypervisor.toml
      /usr/local/share/defaults/kata-containers/configuration-cloud-hypervisor-debug.toml

Also add USE_RUNTIME_RS variable and default to "yes". This controls which runtime binary and configuration will be installed
to /usr/local/bin/containerd-shim-kata-v2 and /usr/local/share/defaults/kata-containers/configuration.toml respectively.

Also install kata-ctl (runtime-rs equivalent of kata-runtime) so we can exec into the UVM when using runtime-rs

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
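
The selection controlled by USE_RUNTIME_RS can be sketched as follows. The paths are the ones listed in the commit message; `select_runtime` is a hypothetical helper that only prints its choice and installs nothing, so it is not the node-builder implementation.

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: prints the shim binary and config that would back
# containerd-shim-kata-v2 and configuration.toml.
select_runtime() {
	local bin_dir="/usr/local/bin"
	local cfg_dir="/usr/local/share/defaults/kata-containers"
	if [[ "${USE_RUNTIME_RS:-yes}" == "yes" ]]; then
		echo "${bin_dir}/containerd-shim-kata-v2-rs ${cfg_dir}/configuration-cloud-hypervisor.toml"
	else
		echo "${bin_dir}/containerd-shim-kata-v2-go ${cfg_dir}/configuration-clh.toml"
	fi
}

USE_RUNTIME_RS=no select_runtime
# prints: /usr/local/bin/containerd-shim-kata-v2-go /usr/local/share/defaults/kata-containers/configuration-clh.toml
```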
- if no limits are specified, assign a default static amount of memory (512Mi)
and vcpu (1) to the UVM
- if limits are specified, use those limit values for the UVM resources (don't add any extra)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
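
The sizing rule in this commit message can be written down as a small sketch. The defaults (512 Mi, 1 vCPU) come from the message; the function names are hypothetical, and a limit of 0 is assumed to mean "no limit specified". The actual logic lives in the runtime, not in shell.

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helpers; 0 is assumed to mean "no limit specified".
uvm_memory_mib() {
	local limit_mib="$1"
	if (( limit_mib == 0 )); then
		echo 512              # default static memory
	else
		echo "${limit_mib}"   # use the limit as is, nothing extra
	fi
}

uvm_vcpus() {
	local limit_vcpus="$1"
	if (( limit_vcpus == 0 )); then
		echo 1                # default static vCPU count
	else
		echo "${limit_vcpus}"
	fi
}

echo "no limits: $(uvm_memory_mib 0) Mi, $(uvm_vcpus 0) vCPU"      # prints: no limits: 512 Mi, 1 vCPU
echo "with limits: $(uvm_memory_mib 2048) Mi, $(uvm_vcpus 4) vCPU" # prints: with limits: 2048 Mi, 4 vCPU
```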

runtime: Resolve high UVM memory footprint

Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/43668151

Rationale: This is a temporary solution for optimizing memory usage for
the current mechanism of requesting resources through pod Limit
annotations:
- if no Limits are specified and hence WorkloadMemMB is 0, set a default
  value 'StaticWorkloadDefaultMem' to allocate a default amount of
  memory for use for containers in the sandbox in addition to the base
  memory
- if Limits are specified, the base memory and the sum of Limits are
  allocated. The end user needs to be aware of the minimum memory
  requirements for their pods, otherwise the pod will be stuck in the
  ContainerCreating state

Testing: Manual testing, creating pods with Limits and without limits,
and with two containers where each container has a limit, tested with
integration in a SPEC file where the config variables were set via
environment variables via the make command

Adapted by @mfrw from 3.1.0 to apply to 3.2.0

Signed-off-by: Muhammad Falak R Wani <mwani@microsoft.com>
Signed-off-by: Manuel Huber <mahuber@microsoft.com>

runtime: Remove unused VMM options for mem alloc

- We only ever tested these fork changes with CLH+MSHV
- Remove these options as we don't use QEMU/FC

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

runtime: improved memory overhead management

After these changes:

1. The value of the K8s runtime class memory overhead:
   - Covers the memory usage from all the Host-side components (mainly
     the Kata Shim and the VMM).
   - Doesn't include the memory usage from any Guest-side components.

2. The value of a pod memory limit specified by the user:
   - Is equal to the memory size of the Pod VM.
   - Includes the memory usage from all the Guest-side components
     (mainly user's workload, the Guest kernel, and the Kata Agent)
   - Doesn't include the memory usage from any Host-side components.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>

runtime: fix `make test`

This addresses the following errors from `make test` to allow us to require
that upstream CI:

https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142422035?pr=392#step:13:53

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

runtime: Allocate default workload vcpus

- similar to the static_sandbox_default_workload_mem option,
  assign a default number of vcpus to the VM when no limits
  are given, 1 vcpu in this case
- similar to commit c7b8ee9, do not allocate additional vcpus
  when limits are provided

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
- if no limits are specified, assign a default static amount of memory (512Mi) and vcpu (1) to the UVM
- if limits are specified, use those limit values for the UVM resources (don't add any extra)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

runtime-rs: Resolve high UVM memory footprint
This is a port from b03db3e into runtime-rs

Rationale: This is a temporary solution for optimizing memory usage for
the current mechanism of requesting resources through pod Limit
annotations:
- if no Limits are specified and hence WorkloadMemMB is 0, set a default
  value 'StaticWorkloadDefaultMem' to allocate a default amount of
  memory for use for containers in the sandbox in addition to the base
  memory
- if Limits are specified, the base memory and the sum of Limits are
  allocated. The end user needs to be aware of the minimum memory
  requirements for their pods, otherwise the pod will be stuck in the
  ContainerCreating state

Testing: Manual testing, creating pods with Limits and without limits,
and with two containers where each container has a limit, tested with
integration in a SPEC file where the config variables were set via
environment variables via the make command

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

runtime-rs: improved memory overhead management

This is a port from 7ddec33 into runtime-rs

After these changes:

1. The value of the K8s runtime class memory overhead:
   - Covers the memory usage from all the Host-side components (mainly
     the Kata Shim and the VMM).
   - Doesn't include the memory usage from any Guest-side components.

2. The value of a pod memory limit specified by the user:
   - Is equal to the memory size of the Pod VM.
   - Includes the memory usage from all the Guest-side components
     (mainly user's workload, the Guest kernel, and the Kata Agent)
   - Doesn't include the memory usage from any Host-side components.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

runtime-rs: Allocate default workload vcpus

This is a port from 9af9844

Plus ports an existing behaviour from runtime-go to also add the vcpus. See
https://github.com/fidencio/kata-containers/blob/e2476f587c472d5d217df9c75cdb80193dd85994/src/runtime/pkg/oci/utils.go#L1232

- similar to the static_sandbox_default_workload_mem option,
  assign a default number of vcpus to the VM when no limits
  are given, 1 vcpu in this case
- similar to commit c7b8ee9, do not allocate additional vcpus when limits are provided

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

runtime-rs: add test coverage for static resource management

If using static management and initial size manager uses 0 for CPU or memory,
we add default static values to the hv config

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
- tests that deploy pods with too small of a memory limit
- tests try to set a minimum memory limit for some containerd tests
- tests that use runners we don't have
- tests that depend on pushing to GHCR
- disable Kata Containers CI / kata-containers-ci-on-push / run-kata-deploy-tests / run-kata-deploy-tests (qemu, k3s)

Also disable these for runtime-rs that fail due to resource management patches:
- run-nerdctl-tests (dragonball)
- run-nydus (active, dragonball)
- run-nydus (lts, dragonball)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
If memory limit is set and less than minimum, set it to minimum.

This is to account for kata-containers@0ec3403

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
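
The rule in this commit message is a simple clamp. The sketch below only illustrates it: `clamp_memory_limit_mib` is a hypothetical helper, the minimum of 256 is a made-up value (the real minimum is not stated here), and 0 is assumed to mean "no limit set".

```shell
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: raise a set-but-too-small limit to the minimum.
clamp_memory_limit_mib() {
	local limit_mib="$1"
	local minimum_mib="$2"
	if (( limit_mib > 0 && limit_mib < minimum_mib )); then
		echo "${minimum_mib}"
	else
		# Unset (0) or already large enough: leave untouched.
		echo "${limit_mib}"
	fi
}

clamp_memory_limit_mib 64 256    # prints: 256
clamp_memory_limit_mib 1024 256  # prints: 1024
clamp_memory_limit_mib 0 256     # prints: 0
```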
Temporary patch to test rebase

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
@Redent0r Redent0r force-pushed the saul/msft-preview2 branch from 89a7d3f to 288cd70 Compare May 28, 2026 20:26
@Redent0r Redent0r requested a review from sprt May 29, 2026 16:25

@sprt sprt left a comment

LGTM, understanding that we have one pending regression that will be prioritized right after this.

@Redent0r

Author

LGTM, understanding that we have one pending regression that will be prioritized right after this.

https://dev.azure.com/mariner-org/container-runtime/_workitems/edit/20444

@danmihai1


I would prefer to have commit SHA values next to description like

runtime: Remove unused VMM options for mem alloc

  • We only ever tested these fork changes with CLH+MSHV
  • Remove these options as we don't use QEMU/FC

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

But, missing that information is not a big deal, if it's relatively difficult to include.

@danmihai1

danmihai1 commented Jun 1, 2026


I would prefer to see in the commit description of the "webhook: enforce minimum memory limit" change a more detailed explanation of the goals for this change, what fails if we don't make the change, that the change is specific to testing, etc.

@Redent0r

Redent0r commented Jun 1, 2026

Author

@danmihai1 I'll address these in the next rebase

@Redent0r Redent0r merged commit 7673219 into up/3.31.0 Jun 1, 2026
347 of 366 checks passed
@Redent0r Redent0r deleted the saul/msft-preview2 branch June 1, 2026 18:30
4 participants