Add Azure Container Apps deployment plan for simplified architecture #398

Draft
itlackey wants to merge 5 commits into main from claude/simplify-azure-deployment-nRanK

Conversation

@itlackey
Owner

@itlackey itlackey commented May 7, 2026

Summary

This adds a comprehensive deployment plan for running OpenPalm on Azure Container Apps (ACA) with a simplified architecture that removes the memory and scheduler containers and deploys only the assistant and guardian services.

Key Changes

  • New deployment plan document (deploy/azure/PLAN.md): Detailed specification covering:

    • Simplified two-container architecture (assistant + guardian only)
    • SQLite database durability strategy using hourly backups to Azure Files with 7-day retention
    • Volume layout separating POSIX-safe paths (emptyDir for SQLite) from SMB-safe paths (Azure Files shares)
    • IP-restricted external ingress on the assistant's OpenCode webserver
    • Cron-based backup scheduling instead of a dedicated scheduler container
  • Required image changes: Documents additions to the Dockerfile:

    • Install cron and sqlite3 packages
    • Copy backup script (akm-backup.sh) and cron job definition (cron.d/akm-backup)
  • Required entrypoint changes: Documents two new shell functions:

    • maybe_restore_akm_db(): Restores the AKM SQLite database from the latest backup snapshot on cold start
    • start_cron(): Starts the system cron daemon and writes environment variables for scheduled jobs
  • Azure resource architecture: Specifies resource group, ACA environment, container apps, storage account, Key Vault, and managed identity configuration

  • Deployment script specification: Outlines deploy-aca.sh with subcommands for setup, deploy, update-ips, status, and teardown
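The subcommand layout described for deploy-aca.sh could be organized along these lines. This is only a skeleton under stated assumptions: the subcommand names come from the plan, while the `cmd_*` function names and their placeholder messages are hypothetical and stand in for the Azure calls the real script would make.

```shell
#!/bin/sh
# Hypothetical skeleton for deploy-aca.sh. Only the subcommand names are
# taken from the plan; every function body below is a placeholder.

usage() {
  echo "usage: deploy-aca.sh {setup|deploy|update-ips|status|teardown}" >&2
}

cmd_setup()      { echo "setup: would create the resource group, storage, Key Vault, identity"; }
cmd_deploy()     { echo "deploy: would create or update the container apps"; }
cmd_update_ips() { echo "update-ips: would replace the ingress IP allowlist"; }
cmd_status()     { echo "status: would print app and revision state"; }
cmd_teardown()   { echo "teardown: would delete the deployed resources"; }

# Dispatch on the first argument; unknown or missing subcommands print
# usage and return a non-zero status.
main() {
  case "${1:-}" in
    setup)      cmd_setup ;;
    deploy)     cmd_deploy ;;
    update-ips) cmd_update_ips ;;
    status)     cmd_status ;;
    teardown)   cmd_teardown ;;
    *)          usage; return 2 ;;
  esac
}
```

A real script would end with `main "$@"` so that `./deploy-aca.sh status` reaches `cmd_status`.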

Notable Implementation Details

  • SQLite on emptyDir: The AKM database lives on an ACA emptyDir volume (local POSIX-compliant filesystem) rather than Azure Files, because SMB does not support the advisory file locks that SQLite requires for write serialization and WAL coordination
  • Backup/restore cycle: Provides durability despite the ephemeral emptyDir; configurable backup interval (default 1 hour) with automatic restore on container startup
  • Cron environment isolation: Cron jobs do not inherit the container environment, so the script writes required variables to /etc/cron-env before starting the cron daemon
  • Single replica enforcement: The architecture requires exactly one replica due to per-replica emptyDir state; documented in YAML constraints
  • Granular volume mounts: Replaces a monolithic home directory share with targeted mounts for config, shared data, artifacts, and backups, reducing SMB lock contention
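To illustrate the cron environment point above, here is a minimal sketch of how an entrypoint might write selected variables to a file that cron jobs can source. The function name `write_cron_env` and the variable names in the usage note are assumptions for illustration; only the /etc/cron-env path comes from the plan.

```shell
# write_cron_env FILE VAR...
# Hypothetical helper: writes each named variable that is currently set
# to FILE as an `export NAME='value'` line, so a cron job can source it.
write_cron_env() {
  _file=$1
  shift
  : > "$_file"
  # The file may hold secrets, so restrict it to the owner.
  chmod 600 "$_file"
  for _name in "$@"; do
    # Skip variables that are unset in the container environment.
    eval "_is_set=\${$_name+x}"
    if [ -z "$_is_set" ]; then
      continue
    fi
    eval "_value=\$$_name"
    # Escape embedded single quotes so the line is safe to source.
    _escaped=$(printf '%s' "$_value" | sed "s/'/'\\\\''/g")
    printf "export %s='%s'\n" "$_name" "$_escaped" >> "$_file"
  done
}
```

An entrypoint could call `write_cron_env /etc/cron-env AKM_DB_PATH AKM_BACKUP_DIR` (both variable names are made up here) before starting the cron daemon, and the backup job would begin with `. /etc/cron-env`.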

This is a planning document only; actual implementation of the Dockerfile, entrypoint, and deployment script changes will follow in subsequent PRs.

https://claude.ai/code/session_01MvxXjvN39TFkakcVohckTC

claude added 5 commits May 7, 2026 02:23
Replaces the broader issue-315 scope with a two-container deployment
(assistant + guardian only) that exposes the OpenCode web server as the
sole ACA ingress behind an IP allowlist, adds an hourly AKM backup job
with 7-day rolling retention on Azure Files, and removes the VM, memory,
scheduler, and channel-chat containers entirely.

https://claude.ai/code/session_01MvxXjvN39TFkakcVohckTC
The AKM database lives in $HOME/.local/state, not the stash directory.
Replace the ACA scheduled job approach with a sidecar container inside
the assistant app so the backup has direct filesystem access and can use
sqlite3's .backup command (online backup API) instead of a raw file copy,
which is unsafe against a live SQLite database.

https://claude.ai/code/session_01MvxXjvN39TFkakcVohckTC
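The online-backup approach in the commit above might look roughly like the following. Treat it as a sketch under assumptions: the function names, the `akm-<timestamp>.db` snapshot naming, and the choice to stage the snapshot on local disk before copying it to the share are all illustrative and not taken from the plan.

```shell
# akm_backup DB_PATH BACKUP_DIR
# Hypothetical helper: snapshots a live database with sqlite3's .backup
# command, staging locally before copying to the backup directory.
akm_backup() {
  _db=$1
  _dir=$2
  _name="akm-$(date -u +%Y%m%dT%H%M%SZ).db"
  _stage=$(mktemp -d) || return 1
  # Copy under a temporary name, then rename, so a partially written
  # file is never mistaken for a complete snapshot.
  mkdir -p "$_dir" &&
    sqlite3 "$_db" ".backup '$_stage/$_name'" &&
    cp "$_stage/$_name" "$_dir/$_name.tmp" &&
    mv "$_dir/$_name.tmp" "$_dir/$_name"
  _rc=$?
  rm -rf "$_stage"
  return "$_rc"
}

# prune_snapshots BACKUP_DIR [DAYS]
# Deletes snapshots whose age exceeds DAYS whole 24-hour periods
# (default 7); files that do not match akm-*.db are left alone.
prune_snapshots() {
  find "$1" -maxdepth 1 -type f -name 'akm-*.db' -mtime "+${2:-7}" -exec rm -f {} +
}
```

Because `find -mtime +7` counts whole days, a snapshot survives slightly longer than seven days before it is pruned, which errs on the side of keeping data.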
Azure Files SMB does not implement POSIX advisory locks, so mounting
/home/opencode to an SMB share breaks SQLite. Replace the monolithic
home share with:
- granular Azure Files mounts for non-SQLite subdirectories (.config,
  .local/share, .akm, /work)
- an emptyDir volume for /home/opencode/.local/state (where the AKM db
  lives) shared between the opencode and akm-backup sidecar containers

Add restore-on-start logic to entrypoint.sh that seeds the emptyDir from
the latest Azure Files backup snapshot on cold start, making the ephemeral
local db durable across restarts at the cost of up to one backup interval
of data loss on crash.

https://claude.ai/code/session_01MvxXjvN39TFkakcVohckTC
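A minimal sketch of the restore-on-start logic described in the commit above, assuming snapshots carry sortable timestamps in their file names (`akm-<timestamp>.db`, a naming scheme invented here). The function name matches the one in the PR summary, but the body is an illustration rather than the actual entrypoint code.

```shell
# maybe_restore_akm_db DB_PATH BACKUP_DIR
# Seeds DB_PATH from the newest snapshot in BACKUP_DIR on a cold start.
maybe_restore_akm_db() {
  _db=$1
  _dir=$2
  # A database that already exists means this is not a cold start.
  if [ -e "$_db" ]; then
    return 0
  fi
  _latest=
  # Globs expand in sorted order, so with timestamped names the last
  # regular file matched is the newest snapshot.
  for _snap in "$_dir"/akm-*.db; do
    if [ -f "$_snap" ]; then
      _latest=$_snap
    fi
  done
  if [ -z "$_latest" ]; then
    echo "no AKM snapshot found; starting with an empty database" >&2
    return 0
  fi
  mkdir -p "$(dirname "$_db")"
  cp "$_latest" "$_db"
  echo "restored AKM database from $_latest" >&2
}
```

Skipping the restore when the database file already exists keeps a warm restart from overwriting newer local state with an older snapshot.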
No second container needed. start_backup_loop() spawns a background
subshell before the exec into opencode; tini (PID 1) adopts the orphan
and it runs for the full container lifetime. Requires adding sqlite3 to
the apt-get install line in the assistant Dockerfile.

https://claude.ai/code/session_01MvxXjvN39TFkakcVohckTC
The scheduler container is dropped so the assistant needs cron anyway.
Replace start_backup_loop with start_cron: installs cron + sqlite3 in
the Dockerfile, bakes in akm-backup.sh and cron.d/akm-backup, writes
/etc/cron-env from entrypoint for env passthrough, starts the cron
daemon before exec. Cron job logs to /proc/1/fd/1 so output appears in
ACA log streams.

https://claude.ai/code/session_01MvxXjvN39TFkakcVohckTC
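For the cron job definition mentioned in the commit above, the entry could plausibly take the shape rendered by this helper. The `render_akm_cron` function, the hourly schedule expression, the `root` user field, and the script path in the usage note are assumptions; the /etc/cron-env file and the redirect to /proc/1/fd/1 come from the commit message.

```shell
# render_akm_cron SCRIPT_PATH
# Hypothetical helper: prints a cron.d-style entry that sources the
# environment file, runs the backup script hourly, and sends its output
# to the stdout of PID 1 so it appears in the container log stream.
render_akm_cron() {
  cat <<EOF
# Run the AKM backup at the top of every hour as root.
0 * * * * root . /etc/cron-env; $1 >> /proc/1/fd/1 2>&1
EOF
}
```

For example, `render_akm_cron /usr/local/bin/akm-backup.sh > /etc/cron.d/akm-backup` would produce a file of the kind the Dockerfile is said to bake in, though the plan may define it statically instead.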
2 participants