fix(runtime): replace bind-mount secret staging with env-var pipeline for stateless brokers#523
Conversation
…#284) Replace host-filesystem bind-mount secret staging with a stateless env-var-based pipeline. The broker serializes file and variable secrets into a single base64-encoded JSON blob (SCION_STAGED_SECRETS) injected via -e flag. sciontool init decodes and writes secrets to their target paths before any child process starts. This eliminates the broker's dependency on local storage for secrets, enabling stateless and HA broker deployments. Phase 1: file + variable secrets. Auth file staging (applyResolvedAuth) is deferred to Phase 2.
M-1: serializeSecrets now truly deduplicates file secrets by target path (last entry wins), matching old writeFileSecrets behavior. M-2: Add TODO(#284) to insertVolumeFlags noting it can be removed once Phase 2 eliminates remaining bind-mount usage.
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Code Review
This pull request implements Phase 1 of the stateless broker secret staging design, replacing host-filesystem bind-mounts with a single base64-encoded environment variable (SCION_STAGED_SECRETS) that is decoded and written inside the container by sciontool init. Feedback on these changes highlights a critical permission and ownership issue where files created by WriteStagedSecrets as root are unreadable by the non-root agent user; a solution is suggested to chown files using host UID/GID environment variables. Additionally, an improvement is recommended to make serialization more robust by decoding and re-encoding secrets on the host side to ensure a uniform base64 payload.
| func WriteStagedSecrets(homeDir string, staged *StagedSecrets) error { | ||
| for _, fs := range staged.FileSecrets { | ||
| data, err := base64.StdEncoding.DecodeString(fs.Value) | ||
| if err != nil { | ||
| data = []byte(fs.Value) | ||
| } | ||
| seen[target] = true | ||
| entries = append(entries, secretMapEntry{ | ||
| Name: s.Name, | ||
| Target: target, | ||
| Source: s.Name, // filename within secrets/ volume | ||
| }) | ||
| } | ||
|
|
||
| if len(entries) == 0 { | ||
| return nil | ||
| } | ||
|
|
||
| secretsDir := filepath.Join(filepath.Dir(homeDir), "secrets") | ||
| if err := os.MkdirAll(secretsDir, 0700); err != nil { | ||
| return fmt.Errorf("failed to create secrets directory: %w", err) | ||
| if err := os.MkdirAll(filepath.Dir(fs.Target), 0755); err != nil { | ||
| return fmt.Errorf("failed to create directory for secret %s: %w", fs.Name, err) | ||
| } | ||
| if err := os.WriteFile(fs.Target, data, 0600); err != nil { | ||
| return fmt.Errorf("failed to write secret file %s: %w", fs.Name, err) | ||
| } | ||
| } | ||
|
|
||
| data, err := json.Marshal(entries) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to marshal secret-map.json: %w", err) | ||
| if len(staged.VariableSecrets) > 0 { | ||
| scionDir := filepath.Join(homeDir, ".scion") | ||
| if err := os.MkdirAll(scionDir, 0700); err != nil { | ||
| return fmt.Errorf("failed to create .scion directory: %w", err) | ||
| } | ||
| data, err := json.Marshal(staged.VariableSecrets) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to marshal secrets.json: %w", err) | ||
| } | ||
| if err := os.WriteFile(filepath.Join(scionDir, "secrets.json"), data, 0600); err != nil { | ||
| return fmt.Errorf("failed to write secrets.json: %w", err) | ||
| } | ||
| } | ||
|
|
||
| return os.WriteFile(filepath.Join(secretsDir, "secret-map.json"), data, 0600) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Critical Permission & Ownership Bug
When sciontool init runs as root (PID 1), any directories and files created by WriteStagedSecrets will be owned by root:root with 0600 permissions. This prevents the non-root scion user (which runs the actual agent/harness process) from reading the staged secrets, causing immediate permission errors.
Solution
- Detect if we are running as root (
os.Getuid() == 0). - Retrieve the target host UID/GID from the
SCION_HOST_UIDandSCION_HOST_GIDenvironment variables (which are already injected into the container). - Automatically
chownthe created files and directories (only if they are under the agent's home directory to avoid modifying system directories like/etc/ssl). - Simplify the decoding logic to strictly decode base64 without any fallbacks, matching the host-side serialization improvement.
func WriteStagedSecrets(homeDir string, staged *StagedSecrets) error {
var uid, gid int
if os.Getuid() == 0 {
if uidStr := os.Getenv("SCION_HOST_UID"); uidStr != "" {
if id, err := strconv.Atoi(uidStr); err == nil {
uid = id
}
}
if gidStr := os.Getenv("SCION_HOST_GID"); gidStr != "" {
if id, err := strconv.Atoi(gidStr); err == nil {
gid = id
}
}
}
for _, fs := range staged.FileSecrets {
data, err := base64.StdEncoding.DecodeString(fs.Value)
if err != nil {
return fmt.Errorf("failed to decode secret %s: %w", fs.Name, err)
}
dir := filepath.Dir(fs.Target)
if err := os.MkdirAll(dir, 0755); err != nil {
return fmt.Errorf("failed to create directory for secret %s: %w", fs.Name, err)
}
if (uid > 0 || gid > 0) && strings.HasPrefix(dir, homeDir) {
_ = os.Chown(dir, uid, gid)
}
if err := os.WriteFile(fs.Target, data, 0600); err != nil {
return fmt.Errorf("failed to write secret file %s: %w", fs.Name, err)
}
if uid > 0 || gid > 0 {
_ = os.Chown(fs.Target, uid, gid)
}
}
if len(staged.VariableSecrets) > 0 {
scionDir := filepath.Join(homeDir, ".scion")
if err := os.MkdirAll(scionDir, 0700); err != nil {
return fmt.Errorf("failed to create .scion directory: %w", err)
}
if uid > 0 || gid > 0 {
_ = os.Chown(scionDir, uid, gid)
}
data, err := json.Marshal(staged.VariableSecrets)
if err != nil {
return fmt.Errorf("failed to marshal secrets.json: %w", err)
}
secretsPath := filepath.Join(scionDir, "secrets.json")
if err := os.WriteFile(secretsPath, data, 0600); err != nil {
return fmt.Errorf("failed to write secrets.json: %w", err)
}
if uid > 0 || gid > 0 {
_ = os.Chown(secretsPath, uid, gid)
}
}
return nil
}| // Re-encode raw values as base64 for uniform handling on the container side. | ||
| val := s.Value | ||
| if _, err := base64.StdEncoding.DecodeString(val); err != nil { | ||
| val = base64.StdEncoding.EncodeToString([]byte(val)) | ||
| } | ||
|
|
||
| // Write to staging dir using the secret name as filename | ||
| hostPath := filepath.Join(secretsDir, s.Name) | ||
| if err := os.MkdirAll(filepath.Dir(hostPath), 0700); err != nil { | ||
| return nil, fmt.Errorf("failed to create secret directory: %w", err) | ||
| } | ||
| if err := os.WriteFile(hostPath, data, 0600); err != nil { | ||
| return nil, fmt.Errorf("failed to write secret file %s: %w", s.Name, err) | ||
| entry := StagedFileSecret{ | ||
| Name: s.Name, | ||
| Target: target, | ||
| Value: val, | ||
| } |
There was a problem hiding this comment.
To make the secret staging pipeline fully deterministic and robust, we should decode any base64-encoded secrets on the host side during serialization, and then always encode the resulting raw bytes to base64. This ensures that the container side receives a uniform, strictly base64-encoded payload and does not need to guess or fall back when decoding.
| // Re-encode raw values as base64 for uniform handling on the container side. | |
| val := s.Value | |
| if _, err := base64.StdEncoding.DecodeString(val); err != nil { | |
| val = base64.StdEncoding.EncodeToString([]byte(val)) | |
| } | |
| // Write to staging dir using the secret name as filename | |
| hostPath := filepath.Join(secretsDir, s.Name) | |
| if err := os.MkdirAll(filepath.Dir(hostPath), 0700); err != nil { | |
| return nil, fmt.Errorf("failed to create secret directory: %w", err) | |
| } | |
| if err := os.WriteFile(hostPath, data, 0600); err != nil { | |
| return nil, fmt.Errorf("failed to write secret file %s: %w", s.Name, err) | |
| entry := StagedFileSecret{ | |
| Name: s.Name, | |
| Target: target, | |
| Value: val, | |
| } | |
| var data []byte | |
| if decoded, err := base64.StdEncoding.DecodeString(s.Value); err == nil { | |
| data = decoded | |
| } else { | |
| data = []byte(s.Value) | |
| } | |
| entry := StagedFileSecret{ | |
| Name: s.Name, | |
| Target: target, | |
| Value: base64.StdEncoding.EncodeToString(data), | |
| } |
…teStagedSecrets Address Gemini review findings on PR GoogleCloudPlatform#523: 1. serializeSecrets now normalizes base64 deterministically by always decoding first (to get raw bytes) then re-encoding, ensuring uniform base64 payload regardless of input format. 2. WriteStagedSecrets now uses strict base64 decoding (no fallback) since the host side guarantees base64-encoded values. 3. When running as root (PID 1 in containers), WriteStagedSecrets reads SCION_HOST_UID/GID env vars and chowns created directories and files under homeDir so the non-root scion user can read them.
Fixes #284: Secret staging previously relied on bind-mounting files into agent containers, which is incompatible with stateless broker deployments (Cloud Run, remote brokers). Replaces with an environment-variable pipeline that works in all deployment modes.