wip: runtime-rs: Implement PoC for OpenVMM #438
Pull request overview
Implements an initial (PoC/WIP) OpenVMM hypervisor backend for runtime-rs, wiring it into the virt_container runtime and Kata config system, and updating build/test tooling to support the new feature set.
Changes:
- Add an openvmm hypervisor implementation in runtime-rs/crates/hypervisor, plus feature flags and runtime registration.
- Introduce an OpenVMM runtime configuration template and Makefile logic to enable/disable OpenVMM builds and tests.
- Update Rust toolchain/version pins and adjust lint/test invocations to accommodate dependency constraints.
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| versions.yaml | Bumps Rust version metadata to 1.94. |
| utils.mk | Adjusts standard Rust checks (adds --locked, removes cargo check + diff guard). |
| src/runtime-rs/crates/runtimes/virt_container/src/sandbox.rs | Adds OpenVMM to sandbox persist/restore hypervisor dispatch. |
| src/runtime-rs/crates/runtimes/virt_container/src/lib.rs | Registers OpenVMM config plugin and enables OpenVMM hypervisor creation behind feature flag. |
| src/runtime-rs/crates/runtimes/virt_container/Cargo.toml | Adds openvmm feature wiring to hypervisor crate. |
| src/runtime-rs/crates/runtimes/Cargo.toml | Plumbs openvmm feature up to runtimes workspace feature. |
| src/runtime-rs/crates/hypervisor/src/openvmm/vmm_instance.rs | New: wraps OpenVMM in-process worker lifecycle + RPC control. |
| src/runtime-rs/crates/hypervisor/src/openvmm/mod.rs | New: OpenVMM Hypervisor + Persist implementation and API surface. |
| src/runtime-rs/crates/hypervisor/src/openvmm/inner_hypervisor.rs | New: OpenVMM VM prepare/start/stop and device wiring into OpenVMM config. |
| src/runtime-rs/crates/hypervisor/src/openvmm/inner_device.rs | New: OpenVMM device management stubs + pending-device queueing. |
| src/runtime-rs/crates/hypervisor/src/openvmm/inner.rs | New: OpenVMM inner state + save/restore state implementation. |
| src/runtime-rs/crates/hypervisor/src/lib.rs | Exposes OpenVMM module + re-exports OpenVMM hypervisor name behind feature. |
| src/runtime-rs/crates/hypervisor/src/ch/inner_hypervisor.rs | Minor refactor to avoid unwrap() in netns logging. |
| src/runtime-rs/crates/hypervisor/Cargo.toml | Adds OpenVMM optional dependencies + openvmm feature list. |
| src/runtime-rs/crates/agent/src/sock/hybrid_vsock.rs | Changes HybridVsock handshake command casing (connect → CONNECT). |
| src/runtime-rs/config/configuration-openvmm.toml.in | New: OpenVMM config template. |
| src/runtime-rs/config/configuration-cloud-hypervisor.toml.in | Changes CLH defaults to enable_debug = true in multiple sections. |
| src/runtime-rs/Makefile | Adds OpenVMM build toggles, test excludes, feature handling, and modifies build flags + checks. |
| src/runtime-rs/Cargo.toml | Adds top-level openvmm feature. |
| src/libs/kata-types/src/config/mod.rs | Exports OpenVMM config types and hypervisor name constant. |
| src/libs/kata-types/src/config/hypervisor/openvmm.rs | New: OpenVMM config plugin (adjust/validate). |
| src/libs/kata-types/src/config/hypervisor/mod.rs | Registers OpenVMM config module/exports. |
| src/libs/kata-sys-util/src/protection.rs | Removes unsafe wrappers around __cpuid calls. |
| src/libs/kata-sys-util/src/mount.rs | Uses NixPath::is_empty() for path emptiness checks. |
| src/dragonball/dbs_arch/src/x86_64/cpuid/common.rs | Removes unsafe wrappers around __get_cpuid_max/__cpuid_count. |
| src/dragonball/dbs_arch/src/x86_64/cpuid/brand_string.rs | Removes unsafe wrappers around host_cpuid calls. |
| src/dragonball/dbs_allocator/src/interval_tree.rs | Refactors option handling to avoid unwrap() patterns. |
| rust-toolchain.toml | Bumps pinned Rust toolchain channel to 1.94. |
| Cargo.toml | Adds OpenVMM path-based workspace dependencies and a global bitvec crates.io patch override. |
| let subscriber = tracing_subscriber::fmt() | ||
| .with_writer(std::sync::Mutex::new(file)) | ||
| .with_ansi(false) | ||
| .finish(); | ||
| // Use set_default (thread-local) not set_global_default | ||
| let _guard = tracing::subscriber::set_default(subscriber); | ||
| } |
set_default() returns a guard that must be kept alive for as long as you want the subscriber to be active. Right now _guard is dropped at the end of the if let Some(dir) block, so tracing is effectively disabled for the rest of the worker-host thread. Keep the guard in a variable that lives for the whole thread (or until shutdown), or use a scoped block that covers the entire thread body.
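The scoping issue can be reproduced without tracing itself. Below is a minimal std-only sketch in which a hypothetical `ScopedFlag` type stands in for the guard returned by `set_default()`; the type, the `ACTIVE` flag, and both function names are illustrative assumptions, not code from this PR.

```rust
use std::cell::Cell;

thread_local! {
    // Stand-in for "a thread-local subscriber is currently installed".
    static ACTIVE: Cell<bool> = Cell::new(false);
}

/// Hypothetical RAII guard: the flag is set while the guard is alive
/// and cleared as soon as the guard is dropped.
struct ScopedFlag;

impl ScopedFlag {
    fn install() -> Self {
        ACTIVE.with(|a| a.set(true));
        ScopedFlag
    }
}

impl Drop for ScopedFlag {
    fn drop(&mut self) {
        ACTIVE.with(|a| a.set(false));
    }
}

fn is_active() -> bool {
    ACTIVE.with(|a| a.get())
}

/// Buggy shape: the guard is bound inside the `if let` block,
/// so it is dropped when that block ends.
fn guard_dropped_early(log_dir: Option<&str>) -> bool {
    if let Some(_dir) = log_dir {
        let _guard = ScopedFlag::install();
    }
    // The guard is already gone here.
    is_active()
}

/// Fixed shape: the guard is bound in the enclosing scope,
/// so it lives until the end of the function body.
fn guard_kept_alive(log_dir: Option<&str>) -> bool {
    let _guard = log_dir.map(|_dir| ScopedFlag::install());
    is_active()
}
```

Binding the guard as an `Option` in the outer scope keeps the "no log directory" case working while extending the guard's lifetime to the whole body.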
| let _ = result_tx.send(result.context("failed to launch VM worker")); | ||
|
|
||
| // Keep the pool alive for the VM's lifetime. | ||
| std::future::pending::<()>().await; | ||
| }); |
The worker-host thread is forced to live forever via std::future::pending().await, but stop() only stops/joins the WorkerHandle and never signals this thread to exit or gets joined itself. This will leak a thread (and the pal_async pool) per VM lifecycle and can accumulate over time. Consider adding an explicit shutdown mechanism for the pool/runner (e.g., cancellation channel), store the thread JoinHandle, and join it during stop().
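One possible shape for the suggested shutdown path, as a minimal std-only sketch. It uses a blocking `std::sync::mpsc` channel in place of the async runtime in the PR, and `WorkerHost`, `spawn`, and `stop` are hypothetical names introduced only for illustration.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical owner of the worker-host thread.
struct WorkerHost {
    shutdown_tx: Option<Sender<()>>,
    thread: Option<JoinHandle<()>>,
}

impl WorkerHost {
    fn spawn() -> Self {
        let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
        let thread = thread::spawn(move || {
            // ... launch the worker and keep the pool alive here ...
            // Block until stop() sends a message or drops the sender,
            // instead of waiting on a future that never completes.
            let _ = shutdown_rx.recv();
        });
        WorkerHost {
            shutdown_tx: Some(shutdown_tx),
            thread: Some(thread),
        }
    }

    /// Signals the thread to exit and joins it.
    /// Returns true if a thread was present and exited without panicking.
    fn stop(&mut self) -> bool {
        // Dropping the sender makes recv() on the thread return an error,
        // which unblocks it.
        drop(self.shutdown_tx.take());
        match self.thread.take() {
            Some(handle) => handle.join().is_ok(),
            None => false,
        }
    }
}
```

Storing both the sender and the `JoinHandle` in `Option`s makes `stop()` safe to call twice: the second call finds nothing left to join.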
| # | ||
| # Default false | ||
| enable_debug = false | ||
| enable_debug = true |
This flips Cloud Hypervisor's default enable_debug to true (comment above still says "Default false"). Since other hypervisor templates default this to false, this is a behavior/logging change that can increase log volume and impact performance/operational noise. If this is only needed for the OpenVMM PoC, consider keeping CLH defaults unchanged or gate it behind a build-time variable.
| enable_debug = true | |
| enable_debug = false |
| # If enabled, make the agent display debug-level messages. | ||
| # (default: disabled) | ||
| enable_debug = false | ||
| enable_debug = true |
This flips Cloud Hypervisor agent enable_debug to true by default (other templates default to false). If this isn't intentional for CLH generally, revert to false or make it configurable via a template variable so enabling debug remains an explicit choice.
| enable_debug = true | |
| enable_debug = false |
| # system log | ||
| # (default: disabled) | ||
| enable_debug = false | ||
| enable_debug = true |
This flips Cloud Hypervisor runtime enable_debug to true by default (other templates default to false). Unless CLH specifically needs debug always on, consider reverting or gating to avoid increased log volume and potential performance impact in production deployments.
| enable_debug = true | |
| enable_debug = false |
|
|
||
| const KATA_PATH: &str = "/run/kata"; | ||
|
|
OpenVMM introduces a local KATA_PATH constant hardcoded to /run/kata. Other hypervisors use kata_types::config::KATA_PATH, which centralizes this value and avoids divergence. Consider reusing the shared constant instead of duplicating it here.
| const KATA_PATH: &str = "/run/kata"; | |
| use kata_types::config::KATA_PATH; |
| @@ -1,3 +1,3 @@ | |||
| [toolchain] | |||
| # Keep in sync with versions.yaml | |||
This bumps the pinned toolchain to 1.94, but the workspace Cargo.toml still declares rust-version = "1.88" (MSRV). If 1.94 is now the project minimum, update rust-version accordingly; if 1.88 remains the MSRV, consider clarifying the intent (toolchain pin vs MSRV) to avoid confusion and CI inconsistencies.
| # Keep in sync with versions.yaml | |
| # Keep in sync with versions.yaml. | |
| # Note: This pinned CI toolchain is not the MSRV; the MSRV is defined | |
| # in the workspace Cargo.toml (currently `rust-version = "1.88"`). |
| virt_mshv = { path = "../openvmm-repo/vmm_core/virt_mshv" } | ||
| openvmm_core = { path = "../openvmm-repo/openvmm/openvmm_core" } | ||
| openvmm_defs = { path = "../openvmm-repo/openvmm/openvmm_defs" } | ||
| openvmm_resources = { path = "../openvmm-repo/openvmm/openvmm_resources", features = ["virt_mshv", "net_tap"] } | ||
| openvmm_helpers = { path = "../openvmm-repo/openvmm/openvmm_helpers" } | ||
| vm_manifest_builder = { path = "../openvmm-repo/vmm_core/vm_manifest_builder" } | ||
| ovmm_mesh = { path = "../openvmm-repo/support/mesh", package = "mesh" } | ||
| ovmm_mesh_worker = { path = "../openvmm-repo/support/mesh/mesh_worker", package = "mesh_worker" } | ||
| ovmm_pal_async = { path = "../openvmm-repo/support/pal/pal_async", package = "pal_async" } | ||
| ovmm_unix_socket = { path = "../openvmm-repo/support/unix_socket", package = "unix_socket", features = ["mesh"] } | ||
| storvsp_resources = { path = "../openvmm-repo/vm/devices/storage/storvsp_resources" } | ||
| scsidisk_resources = { path = "../openvmm-repo/vm/devices/storage/scsidisk_resources" } | ||
| netvsp_resources = { path = "../openvmm-repo/vm/devices/net/netvsp_resources" } | ||
| net_backend_resources = { path = "../openvmm-repo/vm/devices/net/net_backend_resources" } | ||
| virtio_resources = { path = "../openvmm-repo/vm/devices/virtio/virtio_resources" } | ||
| vm_resource = { path = "../openvmm-repo/vm/vmcore/vm_resource" } | ||
| disk_backend_resources = { path = "../openvmm-repo/vm/devices/storage/disk_backend_resources" } | ||
| ovmm_vmm_core_defs = { path = "../openvmm-repo/vmm_core/vmm_core_defs", package = "vmm_core_defs" } | ||
| ovmm_guid = { path = "../openvmm-repo/support/guid", package = "guid" } | ||
| ovmm_memory_range = { path = "../openvmm-repo/vm/vmcore/memory_range", package = "memory_range" } |
These OpenVMM workspace dependencies point to ../openvmm-repo/... outside this repository. That makes builds non-reproducible and will fail in CI/developer setups unless that exact sibling checkout exists. Prefer using a git dependency (with a pinned rev), vendoring, or adding the external repo as a submodule/path inside this repo (and gating resolution behind a feature if possible).
| virt_mshv = { path = "../openvmm-repo/vmm_core/virt_mshv" } | |
| openvmm_core = { path = "../openvmm-repo/openvmm/openvmm_core" } | |
| openvmm_defs = { path = "../openvmm-repo/openvmm/openvmm_defs" } | |
| openvmm_resources = { path = "../openvmm-repo/openvmm/openvmm_resources", features = ["virt_mshv", "net_tap"] } | |
| openvmm_helpers = { path = "../openvmm-repo/openvmm/openvmm_helpers" } | |
| vm_manifest_builder = { path = "../openvmm-repo/vmm_core/vm_manifest_builder" } | |
| ovmm_mesh = { path = "../openvmm-repo/support/mesh", package = "mesh" } | |
| ovmm_mesh_worker = { path = "../openvmm-repo/support/mesh/mesh_worker", package = "mesh_worker" } | |
| ovmm_pal_async = { path = "../openvmm-repo/support/pal/pal_async", package = "pal_async" } | |
| ovmm_unix_socket = { path = "../openvmm-repo/support/unix_socket", package = "unix_socket", features = ["mesh"] } | |
| storvsp_resources = { path = "../openvmm-repo/vm/devices/storage/storvsp_resources" } | |
| scsidisk_resources = { path = "../openvmm-repo/vm/devices/storage/scsidisk_resources" } | |
| netvsp_resources = { path = "../openvmm-repo/vm/devices/net/netvsp_resources" } | |
| net_backend_resources = { path = "../openvmm-repo/vm/devices/net/net_backend_resources" } | |
| virtio_resources = { path = "../openvmm-repo/vm/devices/virtio/virtio_resources" } | |
| vm_resource = { path = "../openvmm-repo/vm/vmcore/vm_resource" } | |
| disk_backend_resources = { path = "../openvmm-repo/vm/devices/storage/disk_backend_resources" } | |
| ovmm_vmm_core_defs = { path = "../openvmm-repo/vmm_core/vmm_core_defs", package = "vmm_core_defs" } | |
| ovmm_guid = { path = "../openvmm-repo/support/guid", package = "guid" } | |
| ovmm_memory_range = { path = "../openvmm-repo/vm/vmcore/memory_range", package = "memory_range" } | |
| virt_mshv = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "virt_mshv" } | |
| openvmm_core = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "openvmm_core" } | |
| openvmm_defs = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "openvmm_defs" } | |
| openvmm_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "openvmm_resources", features = ["virt_mshv", "net_tap"] } | |
| openvmm_helpers = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "openvmm_helpers" } | |
| vm_manifest_builder = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "vm_manifest_builder" } | |
| ovmm_mesh = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "mesh" } | |
| ovmm_mesh_worker = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "mesh_worker" } | |
| ovmm_pal_async = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "pal_async" } | |
| ovmm_unix_socket = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "unix_socket", features = ["mesh"] } | |
| storvsp_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "storvsp_resources" } | |
| scsidisk_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "scsidisk_resources" } | |
| netvsp_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "netvsp_resources" } | |
| net_backend_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "net_backend_resources" } | |
| virtio_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "virtio_resources" } | |
| vm_resource = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "vm_resource" } | |
| disk_backend_resources = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "disk_backend_resources" } | |
| ovmm_vmm_core_defs = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "vmm_core_defs" } | |
| ovmm_guid = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "guid" } | |
| ovmm_memory_range = { git = "https://github.com/microsoft/openvmm", rev = "PUT_REAL_OPENVMM_COMMIT_HASH_HERE", package = "memory_range" } |
| # Pending <https://github.com/ferrilab/bitvec/pull/273> | ||
| bitvec = { git = "https://github.com/smalis-msft/bitvec", branch = "set-aliased-previous-val" } |
[patch.crates-io] globally overrides bitvec for the entire workspace to a fork/branch. This can unexpectedly affect non-OpenVMM builds and complicates supply-chain/auditability. If this patch is truly required, consider pinning to an immutable commit SHA, documenting the rationale, and/or isolating the override (e.g., via a forked crate name or upstreaming) rather than globally patching crates.io.
| # Pending <https://github.com/ferrilab/bitvec/pull/273> | |
| bitvec = { git = "https://github.com/smalis-msft/bitvec", branch = "set-aliased-previous-val" } | |
| # Pending <https://github.com/ferrilab/bitvec/pull/273>; pinned to a specific commit for reproducibility/auditability | |
| bitvec = { git = "https://github.com/smalis-msft/bitvec", rev = "0123456789abcdef0123456789abcdef01234567" } |
|
|
||
| $(TARGET_PATH): $(SOURCES) | show-summary | ||
| @RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES) | ||
| @RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings --allow deprecated" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES) |
Adding --allow deprecated while also using --deny warnings effectively turns off the deprecation warnings across the whole runtime build, which can hide real maintenance issues unrelated to OpenVMM. Consider scoping this relaxation to just the OpenVMM feature/build path (or addressing the deprecations in dependencies) so other configurations keep strict warning hygiene.
| @RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings --allow deprecated" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES) | |
| @RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES) |
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 13 comments.
| # Pending <https://github.com/ferrilab/bitvec/pull/273> | ||
| bitvec = { git = "https://github.com/smalis-msft/bitvec", branch = "set-aliased-previous-val" } |
The [patch.crates-io] override pulls bitvec from a personal GitHub branch. This is a supply-chain/reproducibility risk and can break offline or audited builds. Prefer a crates.io release (once available) or a pinned git revision with a long-lived upstream fork, and document why the patch is required and when it can be removed.
| # Pending <https://github.com/ferrilab/bitvec/pull/273> | |
| bitvec = { git = "https://github.com/smalis-msft/bitvec", branch = "set-aliased-previous-val" } | |
| # Temporary workaround for OpenVMM until the upstream fix from | |
| # <https://github.com/ferrilab/bitvec/pull/273> is released on crates.io. | |
| # Keep this pinned to the exact tested commit to preserve reproducible/auditable | |
| # builds, and remove this patch once a crates.io release includes the fix. | |
| bitvec = { git = "https://github.com/smalis-msft/bitvec", rev = "REPLACE_WITH_TESTED_BITVEC_COMMIT" } |
| description: "Rust language" | ||
| notes: "'version' is the default minimum version used by this project." | ||
| # Keep in sync with rust-toolchain.toml | ||
| version: "1.91" | ||
| version: "1.94" | ||
| meta: | ||
| description: | | ||
| 'newest-version' is the latest version known to work when | ||
| building Kata | ||
| newest-version: "1.91" | ||
| newest-version: "1.94" |
versions.yaml and rust-toolchain.toml are bumped to Rust 1.94, but the workspace Cargo.toml still declares rust-version = "1.88". If 1.94 is now the project minimum, update rust-version as well; otherwise, consider keeping versions/toolchain aligned with the declared MSRV to avoid confusion and CI drift.
| cargo clippy --all-targets --all-features --release --locked \ | ||
| -- \ | ||
| -D warnings |
standard_rust_check runs cargo clippy --all-features at the repo root. With the new openvmm feature (and its note about dragonball incompatibility / external path deps), --all-features is likely to select incompatible feature combinations and/or require unavailable local checkouts, breaking the standard check target. Consider switching to an explicit feature matrix (separate clippy runs per supported hypervisor feature) or defining mutually-exclusive features so --all-features remains valid.
| cargo clippy --all-targets --all-features --release --locked \ | |
| -- \ | |
| -D warnings | |
| cargo clippy --all-targets --release --locked \ | |
| -- \ | |
| -D warnings | |
| cargo clippy --all-targets --release --locked --no-default-features --features dragonball \ | |
| -- \ | |
| -D warnings | |
| cargo clippy --all-targets --release --locked --no-default-features --features qemu \ | |
| -- \ | |
| -D warnings | |
| cargo clippy --all-targets --release --locked --no-default-features --features cloud-hypervisor \ | |
| -- \ | |
| -D warnings | |
| cargo clippy --all-targets --release --locked --no-default-features --features firecracker \ | |
| -- \ | |
| -D warnings |
| # This option changes the default hypervisor and kernel parameters | ||
| # to enable debug output where available. | ||
| # | ||
| # Default false | ||
| enable_debug = false | ||
| enable_debug = true | ||
|
|
This change flips Cloud Hypervisor’s config template defaults to enable_debug = true for hypervisor/agent/runtime. That’s a noisy/behavioral change unrelated to the OpenVMM PoC and may not be appropriate as a new default. Consider reverting these defaults (or gating them behind a separate debug profile) so existing users aren’t surprised by increased logging.
| // Build chipset via VmManifestBuilder | ||
| let chipset = vm_manifest_builder::VmManifestBuilder::new( | ||
| vm_manifest_builder::BaseChipsetType::HyperVGen2LinuxDirect, | ||
| vm_manifest_builder::MachineArch::X86_64, | ||
| ) | ||
| .with_serial(serial_ports) | ||
| .build() | ||
| .context("failed to build VM chipset manifest")?; | ||
|
|
||
| // Memory config | ||
| let mem_size_bytes = (self.config.memory_info.default_memory as u64) | ||
| .checked_mul(1024 * 1024) | ||
| .context("memory size overflow")?; | ||
|
|
||
| // PCIe root complex: ECAM range must match bus count. | ||
| // 128MB ECAM = 128 buses (0..127), each bus has 256 devfns * 4KB config = 1MB. | ||
| let pcie_root_complexes = vec![PcieRootComplexConfig { | ||
| index: 0, | ||
| name: "rc0".to_string(), | ||
| segment: 0, | ||
| start_bus: 0, | ||
| end_bus: 127, | ||
| ecam_range: ovmm_memory_range::MemoryRange::new(0xe800_0000..0xf000_0000), | ||
| low_mmio: ovmm_memory_range::MemoryRange::new(0xc000_0000..0xd400_0000), | ||
| high_mmio: ovmm_memory_range::MemoryRange::new(0x0020_3d30_0000..0x200f_3d30_0000), | ||
| ports: { |
ARCH_SUPPORT_OPENVMM includes aarch64, but the OpenVMM implementation currently hard-codes x86_64 specifics (e.g. MachineArch::X86_64, DEFAULT_MMIO_GAPS_X86, and x86 MMIO/ECAM ranges). This will break (or misconfigure) OpenVMM on aarch64. Either restrict OpenVMM support to x86_64 in the build logic, or add proper aarch64 chipset/memory/PCIe configuration paths.
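If the backend stays x86_64-only for now, the Rust side could also reject other architectures explicitly rather than silently applying x86 values. A minimal sketch under that assumption; `ChipsetArch` and `chipset_arch_for` are hypothetical names and do not mirror the real `vm_manifest_builder` types.

```rust
/// Hypothetical stand-in for the chipset architecture choice.
#[derive(Debug, PartialEq)]
enum ChipsetArch {
    X86_64,
}

/// Maps a target architecture name to a supported chipset,
/// or returns a message explaining that no configuration exists.
fn chipset_arch_for(arch: &str) -> Result<ChipsetArch, String> {
    match arch {
        "x86_64" => Ok(ChipsetArch::X86_64),
        other => Err(format!(
            "OpenVMM backend has no chipset/MMIO configuration for {other} yet"
        )),
    }
}
```

A caller could pass `std::env::consts::ARCH` and propagate the error from the VM preparation step, so an unsupported host fails early with a clear message.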
| let port = self.reserve_block_hotplug_port(&block.device_id)?; | ||
| let hotplug_result = async { | ||
| let metadata = std::fs::metadata(&block.config.path_on_host).with_context(|| { | ||
| format!( | ||
| "failed to stat block device path {}", | ||
| block.config.path_on_host | ||
| ) | ||
| })?; | ||
|
|
||
| let disk = if metadata.file_type().is_block_device() { | ||
| disk_blockdevice::OpenBlockDeviceConfig { | ||
| file: disk_blockdevice::open_file_for_block( | ||
| std::path::Path::new(&block.config.path_on_host), | ||
| block.config.is_readonly, | ||
| ) | ||
| .with_context(|| { | ||
| format!( | ||
| "failed to open host block device {}", | ||
| block.config.path_on_host | ||
| ) | ||
| })?, | ||
| } | ||
| .into_resource() | ||
| } else { | ||
| let mut options = std::fs::OpenOptions::new(); | ||
| options.read(true); | ||
| if !block.config.is_readonly { | ||
| options.write(true); | ||
| } | ||
|
|
||
| let file = options | ||
| .open(&block.config.path_on_host) | ||
| .with_context(|| { | ||
| format!( | ||
| "failed to open block device path {}", | ||
| block.config.path_on_host | ||
| ) | ||
| })?; | ||
|
|
||
| disk_backend_resources::FileDiskHandle(file).into_resource() | ||
| }; |
This hotplug path opens a file/blk device in the runtime thread and sends the resulting resource over the mesh RPC. However, the OpenVMM launch path explicitly avoids passing FDs over mesh channels (“FD loss through mesh channel serialization”) by opening the disk in the worker thread. If the same FD-serialization limitation applies here, block hotplug will be unreliable. Consider passing only the path over RPC and opening the FD inside the worker thread (or confirm and document that RPC supports FD transfer safely).
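The "send the path, open in the worker" alternative could look roughly like the std-only sketch below. It uses `std::sync::mpsc` as a stand-in for the mesh RPC; `AddDiskRequest`, `worker_loop`, and `hotplug_by_path` are hypothetical names, and replying with the file size is only a placeholder for "device attached".

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::PathBuf;
use std::sync::mpsc;

/// Hypothetical hotplug request: carries plain data only, no file descriptor.
struct AddDiskRequest {
    path: PathBuf,
    read_only: bool,
    reply: mpsc::Sender<io::Result<u64>>,
}

/// Worker side: opens the file itself, so the descriptor never
/// crosses the channel. Runs until every request sender is dropped.
fn worker_loop(requests: mpsc::Receiver<AddDiskRequest>) {
    for req in requests {
        let opened: io::Result<File> = OpenOptions::new()
            .read(true)
            .write(!req.read_only)
            .open(&req.path);
        // Reply with the file size as a placeholder for "device attached".
        let result = opened.and_then(|f| f.metadata()).map(|m| m.len());
        let _ = req.reply.send(result);
    }
}

/// Runtime side: sends the path and waits for the worker's answer.
fn hotplug_by_path(
    tx: &mpsc::Sender<AddDiskRequest>,
    path: PathBuf,
    read_only: bool,
) -> io::Result<u64> {
    let (reply, reply_rx) = mpsc::channel();
    tx.send(AddDiskRequest { path, read_only, reply })
        .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "worker gone"))?;
    reply_rx
        .recv()
        .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "worker gone"))?
}
```

Open errors surface on the runtime side through the reply channel, so the caller keeps the same error context it has today without ever holding the descriptor.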
| let Some(port) = self.release_block_hotplug_port(&block.device_id) else { | ||
| warn!( | ||
| sl!(), | ||
| "openvmm: no hotplug mapping found for block device {}", | ||
| block.device_id | ||
| ); | ||
| return Ok(()); | ||
| }; | ||
|
|
||
| self.vmm_instance | ||
| .remove_pcie_device(port.name.clone()) | ||
| .await | ||
| .with_context(|| { | ||
| format!( | ||
| "failed to hot-remove block device {} from {}", | ||
| block.device_id, port.name | ||
| ) | ||
| })?; |
remove_device() releases the reserved hotplug port mapping before attempting the actual hot-remove RPC. If the RPC fails, the mapping/port reservation is lost and the runtime can get out of sync with the VM state. Consider only releasing the port after a successful remove_pcie_device, or restoring the mapping on failure.
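A minimal sketch of the "release only after success" ordering, assuming the port bookkeeping is a simple map from device id to port name. `HotplugPorts` and the `remove_rpc` closure are hypothetical stand-ins for the real state and for `remove_pcie_device`.

```rust
use std::collections::HashMap;

/// Hypothetical port bookkeeping: device id -> reserved port name.
struct HotplugPorts {
    by_device: HashMap<String, String>,
}

impl HotplugPorts {
    /// Removes the device via `remove_rpc`; the mapping is dropped
    /// only if that call succeeds.
    fn remove_device<F>(&mut self, device_id: &str, remove_rpc: F) -> Result<(), String>
    where
        F: FnOnce(&str) -> Result<(), String>,
    {
        // Look up without removing, so a failed RPC leaves the state untouched.
        let Some(port) = self.by_device.get(device_id).cloned() else {
            return Ok(()); // nothing mapped, nothing to do
        };
        remove_rpc(&port)?;
        self.by_device.remove(device_id);
        Ok(())
    }
}
```

With this ordering a failed hot-remove can simply be retried, because the reservation still reflects what the VM actually has attached.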
| let hypervisor = config.hypervisor.get(HYPERVISOR_NAME).unwrap(); | ||
| assert!((hypervisor.cpu_info.default_vcpus - (0.75 + 1.2)).abs() < f32::EPSILON); | ||
| assert_eq!(hypervisor.memory_info.default_memory, 2048 + 512); |
This new test compares floating-point vCPU values using f32::EPSILON, which is often too strict for non-trivial arithmetic like 0.75 + 1.2 and can lead to flaky failures due to rounding. Use a larger tolerance (e.g. 1e-6/1e-4) or an approx/assert helper to make the test robust.
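A small helper along these lines would do; `approx_eq` is a hypothetical name and the tolerance is a judgment call for the test author, not a value taken from this PR.

```rust
/// Returns true when `a` and `b` differ by at most `tol` (absolute tolerance).
fn approx_eq(a: f32, b: f32, tol: f32) -> bool {
    (a - b).abs() <= tol
}
```

The assertion would then read `assert!(approx_eq(hypervisor.cpu_info.default_vcpus, 0.75 + 1.2, 1e-4))`, which tolerates small rounding differences regardless of how the value was accumulated.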
| ARCH_SUPPORT_OPENVMM := x86_64 aarch64 | ||
| ifneq ($(filter $(ARCH),$(ARCH_SUPPORT_OPENVMM)),) | ||
| USE_OPENVMM := true | ||
| else | ||
| USE_OPENVMM := false | ||
| $(info OpenVMM does not support ARCH $(ARCH), disabled. \ | ||
| Specify "USE_OPENVMM=true" to force enable.) | ||
| endif |
The build system advertises OpenVMM support for aarch64, but the current OpenVMM implementation is x86_64-specific (chipset arch, MMIO gaps/ranges, etc.). To avoid enabling a broken configuration, either remove aarch64 from ARCH_SUPPORT_OPENVMM for now or add the required aarch64-specific OpenVMM configuration paths.
| $(TARGET_PATH): $(SOURCES) | show-summary | ||
| @RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES) | ||
| @RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings --allow deprecated" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES) | ||
|
|
The build now adds --allow deprecated while still using --deny warnings, which globally suppresses deprecation warnings across the whole runtime build. This can mask important upgrade work and make it harder to notice new deprecations. Prefer fixing the deprecations (or scoping allow(deprecated) to the specific module/uses that require it) and/or only applying this flag when the openvmm feature is enabled.
This avoids issues like the one below, which are now errors in Rust 1.94:
error: a method with this name may be added to the standard library in the future
--> src/libs/kata-sys-util/src/mount.rs:265:12
|
265 | if src.is_empty() {
| ^^^^^^^^
|
= warning: once this associated item is added to the standard library, the ambiguity may cause an error or change in behavior!
= note: for more information, see issue #48919 <rust-lang/rust#48919>
= help: call with fully qualified syntax `nix::NixPath::is_empty(...)` to keep using the current method
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This avoids issues like the one below, which are now errors in Rust 1.94:
error: unnecessary `unsafe` block
--> src/libs/kata-sys-util/src/protection.rs:129:19
|
129 | let fn0 = unsafe { x86_64::__cpuid(0) };
| ^^^^^^ unnecessary `unsafe` block
|
= note: `-D unused-unsafe` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(unused_unsafe)]`
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This is the version used by OpenVMM.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Not functional yet but able to compile OpenVMM in proc.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Add configuration-openvmm.toml.in template for the OpenVMM hypervisor backend, modeled after the Dragonball configuration. Key settings:
- Uses inline-virtio-fs (built-in, no external virtiofsd)
- Uses virtio-blk-pci for rootfs block device
- Uses tcfilter networking model
- kernel_params include cgroup_no_v1=all for cgroup v2
Wire the template into the Makefile gated by USE_OPENVMM, following the same pattern as USE_BUILDIN_DB for Dragonball. The generated config file is placed at config/configuration-openvmm.toml.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Add the full set of OpenVMM crate dependencies needed for in-process VM creation: openvmm_core, openvmm_defs, openvmm_resources, openvmm_helpers, vm_manifest_builder, mesh/mesh_worker, pal_async, and device resource crates (storvsp, scsidisk, netvsp, virtio, etc.). All deps are optional and gated behind the 'openvmm' feature flag.
Note: dragonball and openvmm features cannot be enabled simultaneously due to kvm-ioctls version conflicts (dragonball uses 0.12, openvmm pulls in 0.14 transitively). When USE_OPENVMM=true, dragonball is excluded from the feature set.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Add vmm_instance.rs wrapping OpenVMM's mesh_worker WorkerHandle for in-process VM lifecycle control (launch, resume, pause, stop). Update inner.rs to use VmmInstance instead of LinuxMshv directly. Fix inner_hypervisor.rs check() to use simple /dev/mshv existence check.
Override standard_rust_check in runtime-rs Makefile to use explicit features instead of --all-features (dragonball and openvmm have incompatible kvm-ioctls versions). Exclude dragonball crates from test target when openvmm is enabled.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Implement the full VM boot sequence in inner_hypervisor.rs:
- Build kernel cmdline with rootfs, console, and cgroup params
- Open kernel file and construct LoadMode::Linux (ACPI boot mode)
- Build HyperVGen2LinuxDirect chipset via VmManifestBuilder
- Configure PCIe root complex with one port for virtio-blk-pci rootfs
- Set up memory, processor topology, and hvsocket for agent comm
- Launch VmWorker via mesh_worker and resume (boot) the VM
Also implement stop_vm (via VmmInstance::stop), pause_vm, resume_vm, and proper get_agent_socket returning hvsock:// URI.
Add memory_range workspace dependency for PCIe MMIO range config.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Fix ECAM range size assertion by setting end_bus=127 to match the 128MB ECAM range (0xe8000000..0xf0000000). The assertion in pcie/root.rs requires ecam_size_from_bus_numbers == ecam_range.len().
Add 'extern crate openvmm_resources as _' to force the linker to include the VmWorker registration via linkme::distributed_slice. Without this, the worker factory returns 'unsupported worker VmWorker'.
Allow deprecated warnings in RUSTFLAGS for upstream openvmm code (storvsp try_next deprecation).
The VM now boots successfully in-process via mesh_worker. Pending devices (vsock, network, virtiofs) are not yet wired into the Config.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
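The size relationship behind the end_bus=127 fix can be checked with a few lines of arithmetic. The sketch below is an illustrative reimplementation, not the code in pcie/root.rs; the function name `ecam_size` and the constants are assumptions based on the "256 devfns * 4KB config = 1MB" comment in the diff.

```rust
/// Bytes of configuration space per PCIe function (4 KiB).
const CONFIG_BYTES_PER_FUNCTION: u64 = 4096;
/// Device/function slots per bus (32 devices x 8 functions).
const FUNCTIONS_PER_BUS: u64 = 256;

/// ECAM bytes needed for the inclusive bus range `start_bus..=end_bus`.
/// Assumes `start_bus <= end_bus`.
fn ecam_size(start_bus: u8, end_bus: u8) -> u64 {
    let buses = u64::from(end_bus) - u64::from(start_bus) + 1;
    buses * FUNCTIONS_PER_BUS * CONFIG_BYTES_PER_FUNCTION
}
```

For buses 0 through 127 this gives 128 x 1 MiB, which equals the length of 0xe8000000..0xf0000000; a bus range of 0 through 255 would need a window twice that size.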
Process pending devices in start_vm() to include them in the Config:
- Vsock: configured via VmbusConfig.vsock_path
- Network: create NetvspHandle with TapHandle backend
- ShareFs: create VirtioFsHandle on PCIe port rp1
- Block: skip duplicate (already added as rootfs)
Bind Unix listener for hvsock at the sandbox path before launching the VM worker. Enable net_tap feature on openvmm_resources. Remove --locked from clippy to allow Cargo.lock updates.
Current status: VM boots with all devices, hvsock socket exists, but agent handshake fails due to protocol mismatch between kata's hybrid vsock protocol and OpenVMM's hvsocket relay.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Add initcall_blacklist=virtio_vsock_init to kernel params to prevent the virtio-vsock driver from claiming the vsock transport before hv_sock (needed for Hyper-V socket agent communication).
Actually bind the hvsock UnixListener before launching the VM worker so the hvsocket relay can accept connections.
Enable net_tap feature on openvmm_resources for TAP network backend. Remove --locked from clippy to allow Cargo.lock updates.
Current status: VM boots, all devices wired (rootfs, network, virtio-fs, hvsock), agent client connects to hvsock socket, but hvsocket relay handshake returns empty response (guest VMBus or agent not ready yet).
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Restore --locked to cargo clippy in standard_rust_check. Update Cargo.lock to reflect current dependency changes. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Increase reconnect_timeout_ms from 3000 to 45000 in the openvmm config template to give the guest kernel more time to boot and the kata-agent to start. The guest takes ~5-6 seconds to boot to agent. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Add tracing-subscriber dependency for capturing openvmm VmWorker debug output. Write openvmm worker logs to a file in the sandbox run directory (openvmm-worker.log). Pass log_dir to VmmInstance::launch() for tracing setup.
Fix launch call site to pass run_dir as log directory.
Current status: VM boots, no errors, but hvsock relay doesn't respond to CONNECT requests from in-process integration despite working with CLI. Investigating pal_async event loop integration.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Bind COM1 serial output to a Unix socket at <run_dir>/console.sock. This allows debugging guest kernel output by connecting to the socket:
    socat -u UNIX-CONNECT:/run/kata/<id>/console.sock -
Add serial_socket and tracing-subscriber dependencies. Use VmManifestBuilder::with_serial() to bind the serial port.
Note: Serial output is not yet confirmed working - the serial socket resolver may need additional investigation.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
- Open disk file inside VmWorker thread to avoid FD loss through mesh channel serialization (same pattern as hvsock listener)
- Use virtio-net-pci instead of VMBus NetvspHandle since guest kernel has CONFIG_VIRTIO_NET=y but CONFIG_HYPERV_NET is not set
- Add rootflags=data=ordered,errors=remount-ro ro to kernel cmdline matching CLH behavior for read-only ext4 rootfs
- Add HYPERVISOR_NAME_OPENVMM to save/restore match arms in sandbox.rs to fix 'Unsupported hypervisor openvmm' in persist
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
VmWorkerParameters.hypervisor changed from Option<Resource<HypervisorKind>> to Resource<HypervisorKind>. Use MshvHandle.into_resource() to explicitly select the MSHV backend. Added hypervisor_resources as a new dependency. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Replace the VMBus hvsock relay with a virtio-vsock PCIe device (VirtioVsockHandle on rp3) now that OpenVMM supports virtio-vsock.
- Remove initcall_blacklist=virtio_vsock_init from kernel params
- Remove VMBus vsock_path/vsock_listener config
- Bind virtio-vsock UnixListener inside worker thread (FD pattern)
- Add VirtioVsockHandle as PCIe device with guest CID 3
- Update get_agent_socket() to use vsock.sock path
- The hvsock:// scheme still works: OpenVMM's virtio-vsock uses the same CONNECT/OK handshake on its host listener socket
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
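The CONNECT/OK handshake mentioned above can be sketched as two small helpers. This is only an illustration: the exact wire format (`CONNECT <port>\n` answered by `OK <number>\n`) is assumed from the Firecracker-style hybrid vsock convention, and `connect_request`, `parse_ok`, and `handshake` are hypothetical names, not functions from Kata or OpenVMM.

```rust
use std::io::{self, Read, Write};

/// Builds the request line a host-side client would send on the Unix socket.
/// The "CONNECT <port>\n" format is an assumption, see the lead-in.
fn connect_request(port: u32) -> String {
    format!("CONNECT {port}\n")
}

/// Parses a reply of the form "OK <number>\n". Returns None for anything
/// else, including the empty reply seen when the guest is not ready.
fn parse_ok(line: &str) -> Option<u32> {
    line.trim_end().strip_prefix("OK ")?.parse().ok()
}

/// Runs the handshake over any byte stream (a UnixStream in practice).
/// The reply is read one byte at a time so that no payload bytes following
/// the newline are consumed by accident.
fn handshake<S: Read + Write>(stream: &mut S, port: u32) -> io::Result<u32> {
    stream.write_all(connect_request(port).as_bytes())?;
    let mut reply = Vec::new();
    let mut byte = [0u8; 1];
    while stream.read(&mut byte)? == 1 && byte[0] != b'\n' {
        reply.push(byte[0]);
    }
    let line = String::from_utf8_lossy(&reply);
    parse_ok(&line).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, format!("unexpected reply: {line:?}"))
    })
}
```

Treating an empty reply as an error rather than as success makes a "guest not ready yet" condition visible to the caller, which can then retry within the reconnect timeout.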
Use virtio-console (hvc0) instead of serial (ttyS0) for guest kernel and agent log output. This improves boot performance.
- Set enable_serial: false in LoadMode, disable COM1
- Add VirtioConsoleHandle as PCIe device on rp4
- Change kernel console param from console=ttyS0 to console=hvc0
- Same socket-pair-to-journalctl logging pattern as before
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Restore openvmm kernel cmdline assembly to use Kata's shared runtime-rs defaults and rootfs parameter handling so the guest boots the kata-containers.target path instead of diverging from the other VMMs. Also enter the sandbox network namespace in the VM worker before resolving the named tap device so tcfilter networking is set up from the correct namespace during launch. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Teach the OpenVMM runtime-rs integration to reserve PCIe hotplug ports for block devices, defer non-rootfs block devices until after boot, and use the block-device-backed disk handle for host block devices. This captures the Kata-side changes needed for the Kubernetes block-volume test fix without including the local Makefile configuration change.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Queue boot-time virtio-net setup from the outer OpenVMM config path into the worker thread so the TAP device is opened only after the worker enters the sandbox network namespace. OpenVMM now expects a pre-opened TAP fd, so opening the TAP during outer config assembly bypasses the earlier worker-netns fix and can hit Device or resource busy failures in focused Kubernetes tests.
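The ordering requirement described above (enter the network namespace first, open the TAP device second, both on the worker thread) can be sketched with a queue of deferred steps. Everything here is hypothetical and simplified: `DeferredStep`, `run_worker`, and `open_tap_step` are illustrative names, and the namespace switch is represented by a log entry rather than a real setns(2) call.

```rust
use std::sync::mpsc;
use std::thread;

/// A setup step that must run on the worker thread. It receives a log so the
/// example can show ordering; real code would open the TAP device here.
type DeferredStep = Box<dyn FnOnce(&mut Vec<String>) + Send>;

/// Hypothetical worker: first "enters" the network namespace (a placeholder
/// for a real setns(2) call), then runs every step queued by the caller.
fn run_worker(steps: Vec<DeferredStep>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<DeferredStep>();
    for step in steps {
        tx.send(step).expect("receiver is alive");
    }
    drop(tx); // close the queue so the worker's loop terminates

    let worker = thread::spawn(move || {
        let mut log = vec!["enter_netns".to_string()];
        for step in rx {
            step(&mut log);
        }
        log
    });
    worker.join().expect("worker thread panicked")
}

/// Builds a queued step that records opening the named TAP device.
fn open_tap_step(name: &str) -> DeferredStep {
    let name = name.to_string();
    Box::new(move |log: &mut Vec<String>| log.push(format!("open_tap:{name}")))
}
```

Because the outer config path only enqueues the step, the TAP open cannot happen before the worker has switched namespaces, regardless of when the configuration is assembled.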
The OpenVMM integration in this tree uses path dependencies into the local ../openvmm-repo checkout. The current OpenVMM branch resolves a slightly different dependency graph, so building runtime-rs refreshes Cargo.lock. Commit the updated lockfile so the tested build stays reproducible without local lockfile regeneration.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This reverts commit a87ddf9.
The runtime-rs shim can be built with the openvmm feature to link the
OpenVMM Rust hypervisor in-process. To deploy this shim end-to-end (as a
RuntimeClass named kata-openvmm with its own configuration-openvmm.toml
and per-arch enablement in the helm chart), kata-deploy needs to
recognize 'openvmm' as a valid shim name.
Four places need the addition:
1. tools/packaging/kata-deploy/binary/src/utils/system.rs
Add 'openvmm' to RUST_SHIMS so kata-deploy resolves the shim binary
under /opt/kata/runtime-rs/bin/containerd-shim-kata-v2 (where the
runtime-rs build drops it) and the per-shim config under
/opt/kata/share/defaults/kata-containers/runtime-rs/.
2. tools/packaging/kata-deploy/binary/src/artifacts/install.rs
- Add 'openvmm' to ALL_SHIMS so install.rs accepts it on the
SHIMS_<ARCH> / DEFAULT_SHIM_<ARCH> environment variables and on
the helm 'shims.openvmm' selector.
- Add 'openvmm => openvmm' to get_hypervisor_name so the per-shim
containerd config is rendered against the [hypervisor.openvmm]
section of configuration-openvmm.toml.
- Gate configure_mariner on the shims_for_arch list containing
'clh'. configure_mariner unconditionally opens
configuration-clh.toml and rewrites
runtime.hypervisor_name = 'clh' plus
hypervisor.clh.{path,valid_hypervisor_paths,...}.
That is the right thing to do when clh is the active shim, but
on an openvmm-only deployment it would clobber
runtime.hypervisor_name = 'openvmm' set by the
configuration-openvmm.toml template. The mariner-specific path
mutations are still applied whenever clh is in the shim list.
3. tools/packaging/kata-deploy/helm-chart/kata-deploy/values.yaml
Add an 'openvmm:' block under shims: so the helm chart can render
the SHIMS_<ARCH> env var and the kata-openvmm RuntimeClass when
the user opts in with --set shims.openvmm.enabled=true.
4. tools/packaging/kata-deploy/helm-chart/kata-deploy/templates/runtimeclasses.yaml
Add 'openvmm' to the $runtimeClassConfigs dict (130Mi/250m,
matching the other Rust-shim VMMs like clh and cloud-hypervisor).
The range loop that emits each RuntimeClass guards every iteration
with '{{ if $config }}', so any shim missing from the dict silently
produces no RuntimeClass. Without this entry, an openvmm-only
install ships kata-deploy with no 'kata-openvmm' (or default 'kata')
RuntimeClass, so kubectl has no runtime to dispatch to.
The configuration-openvmm.toml.in template under src/runtime-rs/config/
already exists and is rendered by the runtime-rs Makefile when
USE_OPENVMM=true. With these four changes, 'openvmm' becomes a
fully-supported shim and no downstream consumer (helm install,
kata-deploy install, nerdctl, kubectl with runtimeClassName=kata-openvmm)
needs to do any post-deploy TOML fixups.
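The install.rs changes in item 2 can be sketched as follows. This is a simplified illustration under stated assumptions: the shortened `ALL_SHIMS` list, the signature of `get_hypervisor_name`, and the helper `should_configure_mariner` are hypothetical and do not reproduce the real kata-deploy code.

```rust
/// Shim names accepted by this sketch; the real ALL_SHIMS list in install.rs
/// is longer and is not reproduced here.
const ALL_SHIMS: &[&str] = &["clh", "qemu", "openvmm"];

/// Hypothetical, simplified mapping from shim name to the name of the
/// [hypervisor.<name>] section its configuration file uses.
fn get_hypervisor_name(shim: &str) -> Result<&'static str, String> {
    match shim {
        "clh" => Ok("clh"),
        "qemu" => Ok("qemu"),
        "openvmm" => Ok("openvmm"),
        other => Err(format!("unsupported shim: {other}")),
    }
}

/// Gate for the mariner-specific rewrite: only touch configuration-clh.toml
/// when 'clh' is actually among the shims enabled for this architecture.
fn should_configure_mariner(shims_for_arch: &[&str]) -> bool {
    shims_for_arch.contains(&"clh")
}
```

On an openvmm-only deployment the gate returns false, so runtime.hypervisor_name = 'openvmm' from the template is left untouched. With both shims enabled, the clh path mutations still run.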
Cold-plug VFIO devices (GPUs, NVSwitches, InfiniBand VFs) are now wired into the OpenVMM Config before launch:
- Reserve 16 PCIe root ports named vfio0..vfio15 with hotplug=false in the root complex.
- In start_vm, process pending DeviceType::Vfio by opening /dev/vfio/<group>, building a VfioDeviceHandle { pci_id, group } per HostDevice in the IOMMU group, and pushing a PcieDeviceConfig on a pre-reserved port.
- Reject post-launch VFIO add_device with an explicit cold-plug-only error.
- Add vfio_assigned_device_resources git dep at the existing openvmm rev.
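The port reservation and the cold-plug-only rule above can be sketched as a small allocator. This is an assumption-laden illustration: `VfioPortPool`, `add_vfio_device`, and the error strings are hypothetical, and only the port naming (vfio0..vfio15) and the count of 16 come from the commit message.

```rust
/// Number of root ports reserved for cold-plugged VFIO devices.
const VFIO_PORT_COUNT: usize = 16;

/// Hypothetical allocator over the pre-reserved ports vfio0..vfio15.
struct VfioPortPool {
    next: usize,
}

impl VfioPortPool {
    fn new() -> Self {
        Self { next: 0 }
    }

    /// Names of all reserved ports, in allocation order.
    fn port_names() -> Vec<String> {
        (0..VFIO_PORT_COUNT).map(|i| format!("vfio{i}")).collect()
    }

    /// Hands out the next free port, or fails once all are taken.
    fn allocate(&mut self) -> Result<String, String> {
        if self.next >= VFIO_PORT_COUNT {
            return Err(format!("all {} VFIO root ports are in use", VFIO_PORT_COUNT));
        }
        let name = format!("vfio{}", self.next);
        self.next += 1;
        Ok(name)
    }
}

/// Rejects VFIO devices once the VM is running, since the ports are
/// created with hotplug disabled; otherwise assigns a reserved port.
fn add_vfio_device(vm_started: bool, pool: &mut VfioPortPool) -> Result<String, String> {
    if vm_started {
        return Err("VFIO devices are cold-plug only; add them before VM start".to_string());
    }
    pool.allocate()
}
```

Returning an explicit error for post-launch requests and for port exhaustion keeps both failure modes distinguishable in logs.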
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
`tests/hypervisor_helpers.sh::ALL_HYPERVISORS` lists every hypervisor that the kata test harness accepts via `KATA_HYPERVISOR=...`. It was introduced in upstream commit 2f3fec9 ("tests: Add new hypervisor helper script") and made authoritative by commit ddc3606 ("gha: k8s: reject unsupported KATA_HYPERVISOR values"), which causes `tests/integration/kubernetes/gha-run.sh` to `die` on any value not in the list.
The earlier kata-deploy work that registered openvmm as a first-class shim (sprt/openvmm commit 8258d0b "kata-deploy: add 'openvmm' as a first-class shim", which updates `kata-deploy/binary/src/artifacts/install.rs`, `utils/system.rs`, the helm chart, etc.) missed this file. As a result, running
    KATA_HYPERVISOR=openvmm ./tests/integration/kubernetes/gha-run.sh \
        run-tests
dies with:
    Unsupported KATA_HYPERVISOR=openvmm. Supported values: clh clh-runtime-rs dragonball qemu qemu-runtime-rs qemu-nvidia-gpu ...
Add `openvmm` to `ALL_HYPERVISORS` so it passes the `is_supported_hypervisor` gate. No new helper category is needed: openvmm is neither TEE, GPU-TEE, SE, CCA, nor firecracker, so the existing `is_*` predicates below correctly return false for it (the gha-run flow already only takes special action for those categories and for `qemu`/`clh-runtime-rs`/`dragonball` explicitly, so a plain openvmm value just continues with default behaviour).
Signed-off-by: Jocelyn Berrendonner <Jocelynb@microsoft.com>
Replaces local commit 350d661 "runtime-rs: cold-plug VFIO devices before VM start", layered cleanly on top of sprt/openvmm's existing `prepare_coldplug_cdi_devices` (the upstream Pod Resources API / CDI path from commit 4f618d0 "runtime-rs: Add Pod Resources CDI discovery in sandbox").
Two cold-plug sources are now unioned in `VirtSandbox::start`:
1. `prepare_coldplug_cdi_devices` (already upstream) -- K8s via the kubelet Pod Resources API; works when the vfio-pci device-plugin exposes CDI on the configured `pod_resource_api_sock`.
2. `prepare_coldplug_raw_vfio_devices` (NEW) -- standalone-container path. Reads the OCI spec from the sandbox bundle and cold-plugs any `/dev/vfio/<group>` char device declared in `linux.devices`. Mirrors the Go runtime's `coldOrHotPlugVFIO()` and is shaped to match the upcoming kata-containers/kata-containers PR by Hyounggyu Choi (BbolroC), commit b66713d "runtime-rs: add raw VFIO device cold-plug support", so a future merge with that PR is mechanical rather than architectural.
Both helpers return `Vec<ResourceConfig::VfioDeviceModern>` and flow through the same `resource_configs.extend(...)` path that the upstream CDI helper already uses -- no `do_handle_device` direct calls, no new `Sandbox` trait method, no fight with `pending_devices.clear()` in `OpenVmmInner::prepare_vm`.
Each helper gates on the `cold_plug_vfio` hypervisor config and only accepts the documented `root-port` mode.
Also fold in two correctness fixes that are independent of the architectural change above and are still required against sprt/openvmm:
* `inner_hypervisor.rs`: when cold-plugging a VFIO device, OpenVMM needs the *full* PCI BDF including the segment/domain (e.g. `0001:00:00.0`) to resolve `/sys/bus/pci/devices/<full_bdf>`. `HostDevice` splits this into `domain` ("0001") and `bus_slot_func` ("00:00.0"); recombine them and reject entries with an empty domain instead of silently producing a bad path.
* `vmm_instance.rs`: switch the openvmm VmWorker tracing subscriber from thread-local (`set_default`) to global (`try_init`), and wire `RUST_LOG` through `EnvFilter`. The thread-local subscriber was invisible from tasks spawned onto `DefaultPool::run_with`'s internal threads, so openvmm-side traces never reached `openvmm-worker.log`. Enable the `env-filter` feature on `tracing-subscriber` for the hypervisor crate.
Signed-off-by: Jocelyn Berrendonner <Jocelynb@microsoft.com>
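The BDF recombination described in the `inner_hypervisor.rs` fix can be sketched as below. The `HostDevice` struct here is a hypothetical two-field mirror of the fields named in the commit message, and `full_bdf` and `sysfs_path` are illustrative helpers, not the functions in the actual change.

```rust
/// Hypothetical mirror of the two HostDevice fields named in the text.
struct HostDevice {
    domain: String,        // e.g. "0001"
    bus_slot_func: String, // e.g. "00:00.0"
}

/// Recombines domain and bus:slot.func into a full PCI address, rejecting
/// an empty domain instead of producing a malformed path such as ":00:00.0".
fn full_bdf(dev: &HostDevice) -> Result<String, String> {
    if dev.domain.is_empty() {
        return Err(format!("missing PCI domain for device {}", dev.bus_slot_func));
    }
    Ok(format!("{}:{}", dev.domain, dev.bus_slot_func))
}

/// Builds the sysfs directory that is resolved for the device.
fn sysfs_path(dev: &HostDevice) -> Result<String, String> {
    Ok(format!("/sys/bus/pci/devices/{}", full_bdf(dev)?))
}
```

For a device with domain "0001" and bus_slot_func "00:00.0" this yields /sys/bus/pci/devices/0001:00:00.0, while an empty domain surfaces as an error at configuration time rather than as a failed sysfs lookup later.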