runtime: Add VM Templating support for CLH #450

Open
harshitgupta1337 wants to merge 51 commits into msft-preview from guptaharshit/runtime-go-clh-templating
@harshitgupta1337

This PR provides the implementation of the VM Templating mechanism for Cloud-Hypervisor sandboxes in the Go Runtime. The code changes are of three types, in the order of commits:

  • Add the implementation of key VM management functions for Cloud Hypervisor - Pause, Resume, Restore and Snapshot.
  • Add VM Templating logic for Cloud Hypervisor and update general templating flow to fix some bugs.
  • Add an E2E test for VM Templating (only compatible with a Kubeadm K8s cluster on an Ubuntu node).

Constraints:

  • The VM Templating workflow is incompatible with static_sandbox_resource_mgmt. Therefore, static_sandbox_resource_mgmt = false must be set in the Kata configuration.
  • Testing has only been carried out using the EROFS snapshotter as a replacement for OverlayFS.

Testing performed

  • End to End testing done on Ubuntu:
    • Ubuntu 24.04.4 LTS
    • Kubeadm K8s cluster
    • KVM hypervisor
    • Kernel version: 6.17.0-1013-azure
    • containerd v2.2.3
    • cloud-hypervisor main branch from upstream: commit 2b61bda35
    • mkfs.erofs (erofs-utils) 1.8.10
  • Micro benchmarking with containerd (no K8s) done on AzLinux 3.0:
    • AzL version 3.0.20260401
    • Kernel version: 6.6.121.mshv2
    • containerd v2.1.6
    • cloud-hypervisor msft-main-3.0 branch from LSG fork: commit b760c1b83
    • mkfs.erofs (erofs-utils) 1.8.5

sprt and others added 30 commits March 18, 2026 13:32
Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/43668151

Rationale: This is a temporary solution for optimizing memory usage for
the current mechanism of requesting resources through pod Limit
annotations:
- if no Limits are specified and hence WorkloadMemMB is 0, set a default
  value 'StaticWorkloadDefaultMem' to allocate a default amount of
  memory for use for containers in the sandbox in addition to the base
  memory
- if Limits are specified, the base memory and the sum of Limits are
  allocated. The end user needs to be aware of the minimum memory
  requirements for their pods, otherwise the pod will be stuck in the
  ContainerCreating state
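
The sizing rule described in the two bullets above can be sketched as follows. This is a minimal illustration, not the fork's actual code: the function name, parameter names, and the sample values in `main` are hypothetical; only the concepts WorkloadMemMB and StaticWorkloadDefaultMem come from the commit message.

```go
package main

import "fmt"

// sandboxMemoryMB returns the VM memory size in MB under the rule described
// in the commit message. The function and its parameter names are
// hypothetical and chosen only for illustration.
func sandboxMemoryMB(baseMemMB, workloadMemMB, staticWorkloadDefaultMemMB uint32) uint32 {
	if workloadMemMB == 0 {
		// No Limits were specified: fall back to the configured default
		// workload memory on top of the base memory.
		return baseMemMB + staticWorkloadDefaultMemMB
	}
	// Limits were specified: base memory plus the sum of the Limits.
	return baseMemMB + workloadMemMB
}

func main() {
	// Sample values are assumptions, not defaults taken from the fork.
	fmt.Println(sandboxMemoryMB(256, 0, 512))    // no Limits
	fmt.Println(sandboxMemoryMB(256, 1024, 512)) // Limits sum to 1024 MB
}
```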

Testing: Manual testing, creating pods with Limits and without limits,
and with two containers where each container has a limit, tested with
integration in a SPEC file where the config variables were set via
environment variables via the make command

Adapted by @mfrw from 3.1.0 to apply to 3.2.0

Signed-off-by: Muhammad Falak R Wani <mwani@microsoft.com>
Signed-off-by: Manuel Huber <mahuber@microsoft.com>

runtime: Remove unused VMM options for mem alloc

- We only ever tested these fork changes with CLH+MSHV
- Remove these options as we don't use QEMU/FC

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
This branch starts introducing additional scripting to build, deploy
and evaluate the components used in AKS' Pod Sandboxing and
Confidential Containers preview features. This includes the capability
to build the IGVM file and its reference measurement file for remote
attestation.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Improve igvm-builder and node-builder/azure-linux scripting

- Support for Mariner 3 builds using OS_VERSION variable
- Improvements to IGVM build process and flow as described in README
- Adoption of using only cloud-hypervisor-cvm on CBL-Mariner

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Add package-tools-install functionality

- Add script to install kata-containers(-cc)-tools bits
- Minor improvements in README.md
- Minor fix in package_install
- Remove echo outputs in package_build

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Enable setting IGVM SVN

- Allow setting SVN parameter for IGVM build scripting

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: introduce BUILD_TYPE variable

This lets developers build and deploy Kata in debug mode without having to make
manual edits to the build scripts.

With BUILD_TYPE=debug (default is release):

 * The agent is built in debug mode.
 * The agent is built with a permissive policy (using allow-all.rego).
 * The shim debug config file is used, i.e. we create the symlink
   configuration-clh-snp-debug.toml <- configuration-clh-snp.toml.

For example, building and deploying Kata-CC in debug mode is now as simple as:

   make BUILD_TYPE=debug all-confpods deploy-confpods

Also do note that make still lets you override the other variables even after
setting BUILD_TYPE. For example, you can use the production shim config with
BUILD_TYPE=debug:

   make BUILD_TYPE=debug SHIM_USE_DEBUG_CONFIG=no all-confpods deploy-confpods

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: introduce SHIM_REDEPLOY_CONFIG

See README: when SHIM_REDEPLOY_CONFIG=no, the shim configuration is NOT
redeployed, so that potential config changes made directly on the host
during development aren't lost.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: Use img for Pod Sandboxing

Switch from UVM initrd to image format

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Adapt README instructions

- Sanitize containerd config snippet
- Set podOverhead for Kata runtime class

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Adapt AGENT_POLICY_FILE path

- Adapt path in uvm_build.sh script to comply
  with the upstream changes we pulled in

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Use Azure Linux 3 as default path

- update recipe and node-builder scripting
- change default value on rootfs-builder

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Deploy-only for AzL3 VMs

- split deployment sections in node-builder README.md
- install jq, curl dependencies within IGVM script
- add path parameter to UVM install script

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Minor updates to README.md

- no longer install make package, is part of meta package
- remove superfluous popd
- add note on permissive policy for ConfPods UVM builds

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Updates to README.md

- with the latest 3.2.0.azl4 package on PMC, can remove OS_VERSION parameter
  and use the make deploy calls instead of copying files by hand for variant
  I (now aligned with Variant II)
- with the latest changes on msft-main, set the podOverhead to 600Mi

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Fix SHIM_USE_DEBUG_CONFIG behavior

Using a symlink would create a cycle after calling this script again when
copying the final configuration at line 74 so we just use cp instead.

Also, I moved this block to the end of the file to properly override the final
config file.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: Build and install debug configuration for pod sandboxing

For ease of debugging, install a configuration-clh-debug.toml for pod
sandboxing as we do in Conf pods.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>

runtime: remove clh-snp config file usage in makefile

Not needed to build vanilla kata

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

package_tools_install.sh: include nsdax.gpl.c

Include nsdax.gpl.c

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

node-builder: fix typo in string comparison

This also fixes a shellcheck error and lets us require the
shellcheck-required job:

In ./tools/osbuilder/node-builder/azure-linux/uvm_build.sh line 34:
        if [ -z "${UVM_KERNEL_HEADER_DIR}}" ]; then
                                         ^-- SC2157 (error): Argument to -z is always false due to literal strings.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

docs: node-builder: fix static check error

This fixes the below static check error to follow up on the infra fix from
kata-containers#11646:

2025-07-31T19:32:45.0031829Z time="2025-07-31T19:32:44.990004665Z" level=fatal msg="found 2 parse errors:\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Set up environment\" (heading: {Name:Set up environment MDName:Set up environment LinkName:set-up-environment Level:2})\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Install build dependencies\" (heading: {Name:Install build dependencies MDName:Install build dependencies LinkName:install-build-dependencies Level:2})" commit=1d17f56b1aa7a880468b8e25d14467c92dca8eeb name=kata-check-markdown pid=9075 source=check-markdown version=0.0.1

Note: that is likely flagged because having two headings with the same
name, even under different sections, makes it impossible to create a
canonical heading link in Markdown.

This should eventually be squashed into the node-builder commit.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

docs: node-builder: Remove references to moby-containerd-cc

As we adopted containerd2, we remove references to our prior
forked containerd version.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: 2MB-aligned guest image size

Build the mariner guest image using IMAGE_SIZE_ALIGNMENT_MB=2.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>

to-squash: node-builder: add reference to README.md

This is needed to avoid the following static-checks error:

2025-08-05T21:27:20.0028337Z [static-checks.sh:808] ERROR: Document tools/osbuilder/node-builder/azure-linux/README.md is not referenced

This commit is to be squashed into the node-builder commit.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
After these changes:

1. The value of the K8s runtime class memory overhead:
   - Covers the memory usage from all the Host-side components (mainly
     the Kata Shim and the VMM).
   - Doesn't include the memory usage from any Guest-side components.

2. The value of a pod memory limit specified by the user:
   - Is equal to the memory size of the Pod VM.
   - Includes the memory usage from all the Guest-side components
     (mainly user's workload, the Guest kernel, and the Kata Agent)
   - Doesn't include the memory usage from any Host-side components.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>

runtime: fix `make test`

This addresses the following errors from `make test` to allow us to require
that upstream CI:

https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142422035?pr=392#step:13:53

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
- similar to the static_sandbox_default_workload_mem option,
  assign a default number of vcpus to the VM when no limits
  are given, 1 vcpu in this case
- similar to commit c7b8ee9, do not allocate additional vcpus
  when limits are provided

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
Point to msft-preview

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
For our Kata UVM, we know we need at least 128MB of memory to prevent instability in the guest.

Enforce this constraint with a descriptive error to prevent users from destabilizing the UVM with faulty k8s configurations.
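
A check of this kind might look like the sketch below. It is an assumption-laden illustration rather than the commit's code: the constant name, function name, and error wording are hypothetical; only the 128 MB minimum comes from the commit message.

```go
package main

import "fmt"

// minSandboxMemoryMB is the minimum stated in the commit message.
// The constant name is hypothetical.
const minSandboxMemoryMB = 128

// validateSandboxMemory is a hypothetical helper that rejects VM memory
// sizes below the minimum with a descriptive error.
func validateSandboxMemory(memMB uint32) error {
	if memMB < minSandboxMemoryMB {
		return fmt.Errorf("sandbox memory %d MB is below the required minimum of %d MB",
			memMB, minSandboxMemoryMB)
	}
	return nil
}

func main() {
	for _, m := range []uint32{64, 128, 512} {
		if err := validateSandboxMemory(m); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", m)
	}
}
```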

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
If memory limit is set and less than minimum, set it to minimum.

This is to account for kata-containers@0ec3403

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Add Microsoft mandatory file SECURITY.md

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
- Change Makefile to point to fork
- Change versions.yaml to point to proper version on fork

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This change mirrors host networking into the guest as before, but now also
includes the default gateway neighbor entry for each interface.

Pods using overlay/synthetic gateways (e.g., 169.254.1.1) can hit a
first-connect race while the guest performs the initial ARP. Preseeding the
gateway neighbor removes that latency and makes early connections (e.g.,
to the API Service) deterministic.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This is a temporary, fork-only measure to unblock required CI tests in our fork,
while we find a way to remove the hard-coded 'main' references from upstream.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Background:

 * `pull_request` runs on the PR branch code and has access to secrets
   ONLY if the PR is from microsoft/kata-containers (i.e. NOT from an external
   contributor who forked the repo).
 * `pull_request_target` runs on the trusted main branch code by default
   and has access to secrets for any PR.

Reference: https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows#pull_request

Upstream uses `pull_request_target` (and manually checks out the PR code)
to have access to secrets for PRs from external contributors, however we
don't expect external PRs, hence we can use `pull_request`.

Furthermore, since `pull_request_target` only runs from the default branch,
we need to use `pull_request` anyway as we have multiple leading branches
(i.e., msft-main, msft-preview, and release branches).

https://github.blog/changelog/2025-11-07-actions-pull_request_target-and-environment-branch-protections-changes/

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
set default to msft-preview

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
use upstream cloud-hypervisor.

This is to unblock the CI and let CLH build

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
update target branch to msft-preview

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This fixes a CI static check failure

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
- tests that deploy pods with too small of a memory limit
 - try to set a minimum memory limit for some containerd tests
- tests that use runners we don't have
- tests that depend on pushing to GHCR

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Enable VFIO device pass-through at VM creation time on Cloud Hypervisor,
in addition to the existing hot-plug path.

Signed-off-by: Roaa Sakr <romoh@microsoft.com>
Regenerate CH client against v51.1

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
The recently-added nested property is true by default, but is not
supported yet on MSHV.

See cloud-hypervisor/cloud-hypervisor#7408 for additional information.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
This cloud-hypervisor is a directory, so it needs "rm -rf" instead of
"rm -f".

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
disable Kata Containers CI / kata-containers-ci-on-push / run-kata-deploy-tests / run-kata-deploy-tests (qemu, k3s)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
`cargo check` was introduced in 3f1533a to check that Cargo.lock is in sync
with Cargo.toml. However, if there are uncommitted changes in the working
tree, the current invocation will immediately fail because of the `git
diff` call, which is frustrating for local development.

As it turns out, `cargo clippy` is a superset of `cargo check`, so we can
simply pass `--locked` to `cargo clippy` to detect Cargo.lock issues.

This is tested with the following change:

diff --git a/src/agent/Cargo.lock b/src/agent/Cargo.lock
index 96b6c67..e1963af 100644
--- a/src/agent/Cargo.lock
+++ b/src/agent/Cargo.lock
@@ -4305,6 +4305,7 @@ checksum = "8f50febec83f5ee1df3015341d8bd429f2d1cc62bcba7ea2076759d315084683"
 name = "test-utils"
 version = "0.1.0"
 dependencies = [
- "libc",
  "nix 0.26.4",
 ]

which results in the following output:

$ make -C src/agent check
make: Entering directory '/kata-containers/src/agent'
standard rust check...
cargo fmt -- --check
cargo clippy --all-targets --all-features --release --locked \
        -- \
        -D warnings
error: the lock file /kata-containers/src/agent/Cargo.lock needs to be updated but --locked was passed to prevent this
If you want to try to generate the lock file without accessing the network, remove the --locked flag and use --offline instead.
make: *** [../../utils.mk:184: standard_rust_check] Error 101
make: Leaving directory '/kata-containers/src/agent'

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Build and install both runtime-rs and runtime-go configs and binaries side by side:
  - runtime-go:
      /usr/local/bin/containerd-shim-kata-v2-go
      /usr/local/share/defaults/kata-containers/configuration-clh.toml
      /usr/local/share/defaults/kata-containers/configuration-clh-debug.toml

  - runtime-rs:
      /usr/local/bin/containerd-shim-kata-v2-rs
      /usr/local/share/defaults/kata-containers/configuration-cloud-hypervisor.toml
      /usr/local/share/defaults/kata-containers/configuration-cloud-hypervisor-debug.toml

Also add USE_RUNTIME_RS variable and default to "yes". This controls which runtime binary and configuration will be installed
to /usr/local/bin/containerd-shim-kata-v2 and /usr/local/share/defaults/kata-containers/configuration.toml respectively.

Also install kata-ctl (runtime-rs equivalent of kata-runtime) so we can exec into the UVM when using runtime-rs

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This is a port from b03db3e into runtime-rs

Rationale: This is a temporary solution for optimizing memory usage for
the current mechanism of requesting resources through pod Limit
annotations:
- if no Limits are specified and hence WorkloadMemMB is 0, set a default
  value 'StaticWorkloadDefaultMem' to allocate a default amount of
  memory for use for containers in the sandbox in addition to the base
  memory
- if Limits are specified, the base memory and the sum of Limits are
  allocated. The end user needs to be aware of the minimum memory
  requirements for their pods, otherwise the pod will be stuck in the
  ContainerCreating state

Testing: Manual testing, creating pods with Limits and without limits,
and with two containers where each container has a limit, tested with
integration in a SPEC file where the config variables were set via
environment variables via the make command

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This is a port from 7ddec33 into runtime-rs

After these changes:

1. The value of the K8s runtime class memory overhead:
   - Covers the memory usage from all the Host-side components (mainly
     the Kata Shim and the VMM).
   - Doesn't include the memory usage from any Guest-side components.

2. The value of a pod memory limit specified by the user:
   - Is equal to the memory size of the Pod VM.
   - Includes the memory usage from all the Guest-side components
     (mainly user's workload, the Guest kernel, and the Kata Agent)
   - Doesn't include the memory usage from any Host-side components.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This is a port from 9af9844

It also ports an existing behaviour from runtime-go to add the vcpus as well. See
https://github.com/fidencio/kata-containers/blob/e2476f587c472d5d217df9c75cdb80193dd85994/src/runtime/pkg/oci/utils.go#L1232

- similar to the static_sandbox_default_workload_mem option,
  assign a default number of vcpus to the VM when no limits
  are given, 1 vcpu in this case
- similar to commit c7b8ee9, do not allocate additional vcpus when limits are provided

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This is a runtime-rs port for kata-containers@7973e4e

The recently-added nested property is true by default, but is not
supported yet on MSHV.

See cloud-hypervisor/cloud-hypervisor#7408 for additional information.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
upstream cherry-pick of 9404104

Enable the following optimizations when building runtime-rs in release mode:
- lto = true
- codegen-units = 1

Setting these reduces the binary size and improves performance at the cost of longer build times.

Without these flags:
- build time: 4m 55s
- binary size: 51 MB

With these flags:
- build time: 7m 21s
- binary size: 38 MB

Per kata-containers#1125 and local experiments,
a smaller binary size leads to a smaller shim memory footprint.

- https://nnethercote.github.io/perf-book/build-configuration.html#codegen-units
- https://nnethercote.github.io/perf-book/build-configuration.html#link-time-optimization

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
fidencio and others added 19 commits May 18, 2026 15:45
InitialSizeManager::setup_config() is responsible for applying the
sandbox workload sizing (computed from containerd/CRI-O sandbox
annotations) to the hypervisor configuration before VM creation.

Previously, the workload vCPU count was only logged but never actually
added to default_vcpus, so the VM was always created with only the base
vCPUs from the configuration/annotations. This caused the
k8s-sandbox-vcpus-allocation test to fail with qemu-snp-runtime-rs:
a pod with default_vcpus=0.75 and a container CPU limit of 1.2 should
see ceil(0.75 + 1.2) = 2 vCPUs, but only got 1.

Additionally, the workload memory was being added to default_memory
unconditionally, diverging from the Go runtime which only applies both
CPU and memory additions when static_sandbox_resource_mgmt is enabled.
In the non-static path, adding workload resources here would cause
double-counting: once from setup_config() at sandbox creation, and
again from update_cpu_resources()/update_mem_resources() when
individual containers are added.

Guard both additions behind static_sandbox_resource_mgmt, matching the
Go runtime's behavior in src/runtime/pkg/oci/utils.go:

    if sandboxConfig.StaticResourceMgmt {
        sandboxConfig.HypervisorConfig.NumVCPUsF += sandboxConfig.SandboxResources.WorkloadCPUs
        sandboxConfig.HypervisorConfig.MemorySize += sandboxConfig.SandboxResources.WorkloadMemMB
    }
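
The arithmetic from the failing test and the effect of the guard can be worked through in a small sketch. `effectiveVCPUs` is a hypothetical helper, not a function from either runtime; it only models the initial sizing step, and in the non-static path the workload vCPUs would be added later when containers are created.

```go
package main

import (
	"fmt"
	"math"
)

// effectiveVCPUs is a hypothetical helper modelling the initial VM sizing:
// workload vCPUs are added to the base only when static resource management
// is enabled, and the fractional total is rounded up to whole vCPUs.
func effectiveVCPUs(defaultVCPUs, workloadVCPUs float64, staticResourceMgmt bool) uint32 {
	total := defaultVCPUs
	if staticResourceMgmt {
		total += workloadVCPUs
	}
	return uint32(math.Ceil(total))
}

func main() {
	// Values from the commit message: default_vcpus=0.75, CPU limit 1.2.
	fmt.Println(effectiveVCPUs(0.75, 1.2, true))  // ceil(0.75 + 1.2) = 2
	fmt.Println(effectiveVCPUs(0.75, 1.2, false)) // ceil(0.75) = 1
}
```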

Fixes: k8s-sandbox-vcpus-allocation test failure on qemu-snp-runtime-rs

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
If using static management and initial size manager uses 0 for CPU or memory,
we add default static values to the hv config

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
based on current runtime-go behaviour introduced in kata-containers#9195

When using static resources, always set maxvcpus value equal to the vcpus value.
This is because the static resources case does not support dynamic CPU hotplugging,
and therefore the maximum number of vCPUs should be limited to the number of vCPUs.
Booting with a high number of max vCPUs is a bit slower compared to a lower number.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Due to importing resource management patches to runtime-rs, these tests:
- run-nerdctl-tests (dragonball)
- run-nydus (active, dragonball)
- run-nydus (lts, dragonball)

Are failing with: vmm action error: MachineConfig(InvalidVcpuCount(0))

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
The CLH implementation of the Hypervisor interface was missing the Save and
Pause functions, which are crucial for creating a template.
This commit implements those functions, and the snapshot is now saved
when template initialization finishes.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
The official Kata docs require an initrd image for VM templating.
However, even with a root disk image, the overlay upper layer is created in memory
and is therefore captured in the VM snapshot. The initrd image constraint
thus does not apply to VM templating.

Update the test TestCheckFactoryConfig so that it does not expect an error when
templating is enabled and a RootFS image is specified.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Create the VM Template's memory file as an empty file with a size equal to
the VM's memory size.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
When a VM is booted as a template or booted from a template, its memory must be backed
by a file. This commit updates the memory config to use a file in both cases.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
The template VM is created with the default value for VMStorePath,
causing the runtime to be unable to reach the CLH VM's API socket.
This commit sets the VMStorePath to be equal to the VM's statePath,
which is set to the `factory.template_path` config parameter.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Make deviceStatePath calculation in VM template workflow configurable
based on hypervisor, instead of hardcoding it to `state`.
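
A hypervisor-dependent lookup could be sketched as below. The helper names are hypothetical, and the "clh" identifier and the directory in `main` are assumptions; the file names `state` and `state.json` are the ones mentioned in this PR.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// deviceStateFileName is a hypothetical helper that picks the device state
// file name per hypervisor: "state.json" for Cloud Hypervisor, and the
// previously hardcoded "state" for everything else.
func deviceStateFileName(hypervisorType string) string {
	switch hypervisorType {
	case "clh": // assumed identifier for Cloud Hypervisor
		return "state.json"
	default:
		return "state"
	}
}

// deviceStatePath joins the template directory with the per-hypervisor name.
func deviceStatePath(templateDir, hypervisorType string) string {
	return filepath.Join(templateDir, deviceStateFileName(hypervisorType))
}

func main() {
	// The template directory below is only an example value.
	fmt.Println(deviceStatePath("/run/vc/vm/template", "clh"))
	fmt.Println(deviceStatePath("/run/vc/vm/template", "qemu"))
}
```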

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Expand the scope of the resetHypervisorConfig function to include resetting
sandbox name, namespace and the default max vcpus.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Update config.json for the CLH VM template to set memory shared=false.
This forces the VMs created from the template to trigger CoW.
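
A rewrite of that kind could be sketched with generic JSON handling. The helper name is hypothetical, and the layout of config.json (a top-level "memory" object with a "shared" field) is an assumption based on the commit message rather than something shown in this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// setMemoryPrivate is a hypothetical helper that sets memory.shared to false
// in a snapshot config document. It assumes a top-level "memory" object.
func setMemoryPrivate(configJSON []byte) ([]byte, error) {
	var cfg map[string]any
	dec := json.NewDecoder(bytes.NewReader(configJSON))
	// UseNumber keeps numeric fields (such as sizes) as literals instead of
	// converting them to float64, so large values round-trip unchanged.
	dec.UseNumber()
	if err := dec.Decode(&cfg); err != nil {
		return nil, err
	}
	mem, ok := cfg["memory"].(map[string]any)
	if !ok {
		return nil, fmt.Errorf("config has no \"memory\" object")
	}
	mem["shared"] = false
	return json.Marshal(cfg)
}

func main() {
	in := []byte(`{"memory":{"size":536870912,"shared":true}}`)
	out, err := setMemoryPrivate(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Using a generic map rather than a typed struct keeps every other field in the document untouched, at the cost of key order being normalised on output.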

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Add logic to restore the VM from a snapshot instead of starting a new VM when
VM templating is enabled. Copy config.json and state.json to the new
VM's VMStorePath, and update the config to create the VSOCK device for
the Kata agent in the VMStorePath.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Use different storage paths for different VMs created from the same template.
Add comments to the test.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
Implement the first draft of the VM Templating integration K8s tests.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
@harshitgupta1337 harshitgupta1337 marked this pull request as ready for review May 22, 2026 20:49
Copilot AI review requested due to automatic review settings May 22, 2026 20:49

Copilot AI left a comment

Pull request overview

This PR adds VM templating (factory) support for Cloud Hypervisor sandboxes in the Go runtime, including the Cloud Hypervisor pause/resume/snapshot/restore plumbing, template factory adjustments, and an end-to-end Kubernetes integration test to validate the workflow.

Changes:

  • Extend Cloud Hypervisor runtime support with pause/resume and snapshot/restore operations, and add a restore-from-template flow during VM start.
  • Update the template factory to handle CLH state file naming (state.json) and to size the template memory backing file to the VM’s configured memory.
  • Add a Kubernetes integration BATS test for VM templating and loosen a config validation restriction around initrd when templating is enabled.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| tests/integration/kubernetes/run_kubernetes_tests.sh | Adds the new VM templating BATS test to the Kubernetes integration test suite. |
| tests/integration/kubernetes/k8s-vm-templating-test.bats | New end-to-end Kubernetes test that enables templating on nodes, initializes templates, and validates pod creation. |
| src/runtime/virtcontainers/factory/template/template_test.go | Expands unit tests to cover per-VM storage paths and CLH/QEMU state-file differences. |
| src/runtime/virtcontainers/factory/template/template_linux.go | Adds CLH-aware device state file naming and truncates the template memory file to the VM memory size. |
| src/runtime/virtcontainers/factory/factory_linux.go | Resets additional hypervisor config fields during factory config comparison to avoid false mismatches. |
| src/runtime/virtcontainers/clh.go | Implements CLH pause/resume/snapshot/restore APIs and adds restore-from-template logic at VM start. |
| src/runtime/virtcontainers/clh_test.go | Adds unit tests for CLH snapshot and restore flows and extends the client mock accordingly. |
| src/runtime/pkg/katautils/config.go | Removes the “templating requires initrd” factory config restriction. |
| src/runtime/pkg/katautils/config_test.go | Updates tests to match the relaxed factory config validation behavior. |

Comment thread src/runtime/virtcontainers/clh.go Outdated
Comment on lines +770 to +780
// shouldRestoreFromTemplate checks if template snapshot files exist and we should restore instead of creating new VM
func (clh *cloudHypervisor) shouldRestoreFromTemplate() bool {
// For template restore, we need the snapshot directory to contain the necessary files
// The snapshotDir is derived from the MemoryPath directory
snapshotDir := filepath.Dir(clh.config.MemoryPath)

// Check for required template files (config.json and memory file)
configFile := filepath.Join(snapshotDir, "config.json")
memoryFile := clh.config.MemoryPath

if _, err := os.Stat(configFile); os.IsNotExist(err) {
assert.Contains(err.Error(), filepath.Join(clhConfig.VMStorePath, "state.json"))

// Now create the VM snapshot files and call restoreVM again.
os.MkdirAll(clhConfig.VMStorePath, os.ModePerm)
Comment on lines +46 to +49
exec_host "$n" "sudo sed -i -e 's|^#\\?enable_template[[:space:]]*=.*$|enable_template = true|g' -e 's|^#\\?template_path[[:space:]]*=.*$|template_path = \"/run/vc/vm/template\"|g' -e 's|^#\\?shared_fs[[:space:]]*=.*$|shared_fs = \"none\"|g' '${config_file}'" || die "Failed to update kata config on node $n"
exec_host "$n" "sudo grep -q '^enable_template[[:space:]]*=' '${config_file}' || echo 'enable_template = true' | sudo tee -a '${config_file}' >/dev/null" || die "Failed to set enable_template on node $n"
exec_host "$n" "sudo grep -q '^template_path[[:space:]]*=' '${config_file}' || echo 'template_path = \"/run/vc/vm/template\"' | sudo tee -a '${config_file}' >/dev/null" || die "Failed to set template_path on node $n"
exec_host "$n" "sudo grep -q '^shared_fs[[:space:]]*=' '${config_file}' || echo 'shared_fs = \"none\"' | sudo tee -a '${config_file}' >/dev/null" || die "Failed to set shared_fs on node $n"
Comment on lines +72 to +81
@test "Pod can be created with templated VM" {
pod_name="test-templated-pod"
ctr_name="test-container"

pod_config=$(mktemp --tmpdir pod_config.XXXXXX.yaml)
cp "$pod_config_dir/busybox-template.yaml" "$pod_config"

sed -i "s/POD_NAME/$pod_name/" "$pod_config"
sed -i "s/CTR_NAME/$ctr_name/" "$pod_config"

Copilot AI review requested due to automatic review settings June 10, 2026 21:52
@Camelron Camelron force-pushed the guptaharshit/runtime-go-clh-templating branch from 7c3ad7a to d0c37a1 Compare June 10, 2026 22:00

Copilot AI left a comment

Pull request overview

Copilot reviewed 104 out of 105 changed files in this pull request and generated 19 comments.

Comment on lines +79 to +83
minMemoryLimit, foundMinMemoryLimit := os.LookupEnv("MIN_MEMORY_LIMIT")

if foundMinMemoryLimit {
minMemoryLimitVal := resource.MustParse(minMemoryLimit)
for i := range pod.Spec.Containers {
Comment on lines +83 to +88
for i := range pod.Spec.Containers {
if pod.Spec.Containers[i].Resources.Limits == nil {
continue
} else {
currentMemoryLimit := pod.Spec.Containers[i].Resources.Limits.Memory().Value()
if currentMemoryLimit < minMemoryLimitVal.Value() {
Comment thread tools/osbuilder/Makefile
Comment on lines 96 to 99
 .PHONY: all
-all: image initrd
+all: image initrd igvm

 rootfs-%: $(ROOTFS_BUILD_DEST)/.%$(ROOTFS_MARKER_SUFFIX)
Comment on lines +13 to +27
SCRIPT_DIR="$(dirname $(readlink -f $0))"

# distro-specific config file
typeset -r CONFIG_SH="config.sh"

# Name of an optional distro-specific file which, if it exists, must implement the
# install_igvm_tool, build_igvm_files, and uninstall_igvm_tool functions.
typeset -r LIB_SH="igvm_lib.sh"

load_config_distro()
{
distro_config_dir="${SCRIPT_DIR}/${DISTRO}"

[ -d "${distro_config_dir}" ] || die "Could not find configuration directory '${distro_config_dir}'"

Comment on lines +38 to +52
echo "Reading Kata image dm_verity root hash information from root_hash file"
ROOT_HASH_FILE="${SCRIPT_DIR}/../root_hash.txt"

if [ ! -f "${ROOT_HASH_FILE}" ]; then
echo "Could not find image root hash file '${ROOT_HASH_FILE}', aborting"
exit 1
fi

IMAGE_ROOT_HASH=$(sed -e 's/Root hash:\s*//g;t;d' "${ROOT_HASH_FILE}")
IMAGE_SALT=$(sed -e 's/Salt:\s*//g;t;d' "${ROOT_HASH_FILE}")
IMAGE_DATA_BLOCKS=$(sed -e 's/Data blocks:\s*//g;t;d' "${ROOT_HASH_FILE}")
IMAGE_DATA_BLOCK_SIZE=$(sed -e 's/Data block size:\s*//g;t;d' "${ROOT_HASH_FILE}")
IMAGE_DATA_SECTORS_PER_BLOCK=$((IMAGE_DATA_BLOCK_SIZE / 512))
IMAGE_DATA_SECTORS=$((IMAGE_DATA_BLOCKS * IMAGE_DATA_SECTORS_PER_BLOCK))
IMAGE_HASH_BLOCK_SIZE=$(sed -e 's/Hash block size:\s*//g;t;d' "${ROOT_HASH_FILE}")
Comment on lines +27 to +33
DEFAULT_HYPERVISOR=cloud-hypervisor \
DEFMEMSZ=0 \
DEFSTATICSANDBOXWORKLOADMEM=512 \
DEFVCPUS=0 \
DEFSTATICSANDBOXWORKLOADVCPUS=1 \
DEFVIRTIOFSDAEMON=${VIRTIOFSD_BINARY_LOCATION} \
PREFIX=${INSTALL_PATH_PREFIX}"
Comment on lines +42 to +47
DEFVIRTIOFSDAEMON=${VIRTIOFSD_BINARY_LOCATION} \
PREFIX=${INSTALL_PATH_PREFIX} \
DEFMEMSZ=0 \
DEFSTATICSANDBOXWORKLOADMEM=512 \
DEFVCPUS=0 \
DEFSTATICSANDBOXWORKLOADVCPUS=1"
Comment on lines 111 to 114
env:
GOPATH: ${{ github.workspace }}
target_branch: msft-preview
permissions:
```shell
go get github.com/stretchr/testify/assert
go get golang.org/x/oauth2
go get golang.org/x/net/context
```
Comment on lines 22 to 24
 - name: pod-annotate-webhook
-image: quay.io/kata-containers/kata-webhook-example:latest
+image: marineraks.azurecr.io/kata-containers/kata-webhook:min_memory_limit
 imagePullPolicy: Always
9 participants