feat: Skill Bank M1+M2 — full Hub skill infrastructure#233
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements a skill resolution and installation mechanism for Scion agents, including URI parsing, secure downloading with transport and size constraints, atomic installation, and configuration merging. Feedback on the changes highlights several key issues: a directory creation oversight that prevents staging the resolution record for container-script harnesses, a file-sharing violation on Windows when deleting open files during download failures, a lack of de-duplication for merged skill configurations leading to installation conflicts, a performance bottleneck from not reusing the HTTP client, and a potential directory traversal bug if the skills directory path is empty.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if info, err := os.Stat(filepath.Dir(inputPath)); err == nil && info.IsDir() { | ||
| _ = os.WriteFile(inputPath, recordData, 0644) | ||
| } |
There was a problem hiding this comment.
At this point in ProvisionAgent, the directory .scion/harness/inputs has not been created yet (it is created later during h.Provision or instruction injection). Because of the os.Stat check on the parent directory, resolved-skills.json will never be written, and container-script harnesses will not receive the resolved skills in their manifest.
We should explicitly create the directory using os.MkdirAll before writing the file to ensure it is correctly staged.
| if info, err := os.Stat(filepath.Dir(inputPath)); err == nil && info.IsDir() { | |
| _ = os.WriteFile(inputPath, recordData, 0644) | |
| } | |
| if err := os.MkdirAll(filepath.Dir(inputPath), 0755); err == nil { | |
| _ = os.WriteFile(inputPath, recordData, 0644) | |
| } |
| f, err := os.Create(destPath) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to create file: %w", err) | ||
| } | ||
| defer f.Close() | ||
|
|
||
| n, err := io.Copy(f, limitedReader) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to write file: %w", err) | ||
| } | ||
| if n > maxSize { | ||
| _ = os.Remove(destPath) | ||
| return fmt.Errorf("file exceeds maximum size of %d bytes", maxSize) | ||
| } |
There was a problem hiding this comment.
On Windows, attempting to delete an open file will fail with a permission/sharing violation error. Because defer f.Close() only runs when the function returns, the os.Remove(destPath) call on line 401 will fail on Windows.
Additionally, if io.Copy fails, we should explicitly close and clean up the partial/corrupted file from disk.
We should explicitly close the file before attempting to remove it.
| f, err := os.Create(destPath) | |
| if err != nil { | |
| return fmt.Errorf("failed to create file: %w", err) | |
| } | |
| defer f.Close() | |
| n, err := io.Copy(f, limitedReader) | |
| if err != nil { | |
| return fmt.Errorf("failed to write file: %w", err) | |
| } | |
| if n > maxSize { | |
| _ = os.Remove(destPath) | |
| return fmt.Errorf("file exceeds maximum size of %d bytes", maxSize) | |
| } | |
| f, err := os.Create(destPath) | |
| if err != nil { | |
| return fmt.Errorf("failed to create file: %w", err) | |
| } | |
| n, err := io.Copy(f, limitedReader) | |
| f.Close() | |
| if err != nil { | |
| _ = os.Remove(destPath) | |
| return fmt.Errorf("failed to write file: %w", err) | |
| } | |
| if n > maxSize { | |
| _ = os.Remove(destPath) | |
| return fmt.Errorf("file exceeds maximum size of %d bytes", maxSize) | |
| } |
| // S6: Detect duplicate destinations | ||
| destMap := make(map[string]string) // destName → URI | ||
| for _, skill := range skills { |
There was a problem hiding this comment.
If the same skill URI is declared in both a base template and an override template, they will be concatenated during MergeScionConfig. When resolving, this results in duplicate ResolvedSkill entries with the exact same URI.
Currently, installResolvedSkills will treat this as a conflict and fail provisioning, or redundantly download and install the same skill multiple times. We should de-duplicate the resolved skills by URI before performing conflict detection and installation.
// De-duplicate skills by URI to avoid redundant downloads/installs
uniqueSkills := make([]ResolvedSkill, 0, len(skills))
seenURIs := make(map[string]bool)
for _, skill := range skills {
if !seenURIs[skill.URI] {
seenURIs[skill.URI] = true
uniqueSkills = append(uniqueSkills, skill)
}
}
skills = uniqueSkills
// S6: Detect duplicate destinations
destMap := make(map[string]string) // destName → URI
for _, skill := range skills {| client := &http.Client{ | ||
| Timeout: downloadTimeout, | ||
| CheckRedirect: func(req *http.Request, via []*http.Request) error { | ||
| // S5: No cross-host redirects | ||
| if len(via) > 0 && req.URL.Host != via[0].URL.Host { | ||
| return fmt.Errorf("cross-host redirect not allowed: %s → %s", via[0].URL.Host, req.URL.Host) | ||
| } | ||
| if len(via) >= 10 { | ||
| return fmt.Errorf("too many redirects") | ||
| } | ||
| return nil | ||
| }, | ||
| } |
There was a problem hiding this comment.
Creating a new http.Client on every call to downloadSkillFile prevents TCP connection reuse and connection pooling. If a skill contains multiple files, this will force a new TCP connection and TLS handshake for every single file, significantly slowing down downloads.
We should reuse a single shared http.Client instance (e.g., as a package-level variable) since http.Client is safe for concurrent use.
Please define sharedDownloadClient as a package-level variable to enable connection pooling:
var sharedDownloadClient = &http.Client{
Timeout: downloadTimeout,
CheckRedirect: func(req *http.Request, via []*http.Request) error {
if len(via) > 0 && req.URL.Host != via[0].URL.Host {
return fmt.Errorf("cross-host redirect not allowed: %s → %s", via[0].URL.Host, req.URL.Host)
}
if len(via) >= 10 {
return fmt.Errorf("too many redirects")
}
return nil
},
} client := sharedDownloadClient| func enumerateLocalSkills(agentHome, skillsDir string) []SkillResolutionEntry { | ||
| skillsPath := filepath.Join(agentHome, skillsDir) |
There was a problem hiding this comment.
If skillsDir is empty, filepath.Join(agentHome, skillsDir) will resolve to agentHome. This would cause enumerateLocalSkills to read the entire agent home directory and incorrectly treat all its subdirectories (like .scion or workspace) as local skills.
We should add a defensive check to return nil immediately if skillsDir is empty.
func enumerateLocalSkills(agentHome, skillsDir string) []SkillResolutionEntry {
if skillsDir == "" {
return nil
}
skillsPath := filepath.Join(agentHome, skillsDir)Add the foundational types for skill bank provisioning: - SkillReference struct in pkg/api/types.go with URI, As, and Optional fields - Skills field on ScionConfig for declaring skill dependencies in templates - ParseSkillURI parser supporting full, shorthand, alias, and bare name forms - ValidateSkillName for kebab-case name validation (1-64 chars) - MergeScionConfig extension to append Skills slices - Comprehensive table-driven tests for all valid/invalid URI forms
Implement the resolver framework and all safety contracts: - SkillResolver interface with batch resolve, per-skill errors - Context injection via ContextWithSkillResolver/SkillResolverFromContext - installResolvedSkills: staging, per-file hash verification (S2), path safety validation (S3), atomic rename install - downloadSkillFile: HTTPS-only (S5), 30s timeout, 10MB limit, cross-host redirect rejection, signed URL secrecy - Duplicate destination detection (S6) before installation - SkillResolutionRecord types for audit/diagnostic record (S4) - Helper functions: enumerateLocalSkills, collectRequiredSkillURIs - Comprehensive tests with httptest mock server
Wire the SkillResolver into ProvisionAgent as step 3d: - Fail-closed (S1): required skills with no resolver → provisioning error listing URIs and a hint to connect Hub or mark optional - Optional skills with no resolver → debug log and continue - Resolve via context-injected SkillResolver, handle per-skill errors - Install resolved skills via installResolvedSkills (staging + verify) - Write resolution record to agentHome/.scion/resolved-skills.json (S4) - Stage resolved-skills.json as container-script harness input - Add ResolvedSkills field to ProvisionInputs manifest - Add manifest reflection for resolved-skills.json in Provision() - Integration tests: mock resolver success, required/no resolver failure, optional/no resolver success - All existing provision tests pass unchanged (no regression)
Additional test coverage for the skill bank provisioning framework: - Bundle hash mismatch detection (S2 bundle-level verification) - enumerateLocalSkills: filters hidden dirs and files - enumerateLocalSkills: handles non-existent directory - Override existing local skill with resolved skill - Empty skills list handling - YAML skills field parsing (verifies scion-agent.yaml roundtrip) - Resolver per-skill error on required skill → provisioning failure - All existing provisioning tests continue to pass
…orm#399 - Add guard in provision.go when skills are resolved but harness has no skills directory configured - Fix @ version split in skill_uri.go to only match after the last path separator (prevents confusion with URI authority credentials) - Close file handle before os.Remove in skill_resolver.go size-exceeded path (Windows compatibility) - Use filepath.ToSlash() on InstalledPath values in resolved-skills.json for Linux container compatibility
…skills Introduces the database layer for the skill bank feature: - Ent schemas for Skill and SkillVersion entities with appropriate indexes and enum constraints - Store models (Skill, SkillVersion, SkillFilter) and SkillStore interface with CRUD operations and semver-aware version resolution - Full Ent adapter implementation including ResolveSkillVersion with support for exact match, constraint ranges (^, ~), content-addressed lookup (sha256:), and "latest" resolution - Storage helpers (SkillStoragePath, SkillStorageURI) for skill content - CompositeStore integration embedding the new SkillStore
…ution
Implements the HTTP API layer for skills:
- CRUD endpoints: GET/POST /api/v1/skills, GET/PATCH/DELETE /api/v1/skills/{id}
- Version management: GET/POST /api/v1/skills/{id}/versions
- Upload/finalize workflow matching template pattern with signed URLs
- Batch resolve: POST /api/v1/skills/resolve with scope search order
(user > project > global > core) and semver constraint matching
- Single-skill resolve and download endpoints
- Publication validation: SKILL.md required, size limits, version immutability
- Capability and scope action registration for "skill" resource type
- Route registration in server.go
Adds SkillService interface and implementation to hubclient: - CRUD: List, Get, Create, Update, Delete - Version management: PublishVersion, ListVersions, FinalizeVersion - File transfer: RequestUploadURLs, UploadFile, DownloadFile - Batch resolution: Resolve - Client-side types mirroring Hub API request/response shapes - Skills() accessor added to Client interface
Adds `scion skills` command group with subcommands: - list: List skills with scope and search filtering - show: Show skill details and version history - create: Scaffold a local skill directory with SKILL.md template - publish: Upload and publish a skill directory to Hub with version - delete: Soft-delete a skill - versions: List versions for a skill - resolve: Debug/test skill URI resolution Includes `scion skill` singular alias and --format json support for all commands. Publish workflow: collect files, validate limits, create/find skill, create draft version, upload via signed URLs, finalize with content hash computation.
Adds comprehensive tests for the SkillStore implementation: - CRUD operations (create, get, get-by-slug, update, delete) - Soft delete via status=archived - List with scope, name, and search filters - Version CRUD and version immutability (unique index enforcement) - Semver resolution: latest, exact, constraint (^, ~, >=), content hash - Pre-release exclusion from latest/constraint resolution - Draft version exclusion from resolution - Unique slug per scope enforcement Also fixes mockHubClient in templatecache tests to implement the new Skills() method on the hubclient.Client interface.
- H1: Add authorization checks (CheckAccess) on all skill write endpoints (create, update, delete, publish, upload, finalize) following the template_handlers.go pattern with scope-aware checks for global, project, and user-scoped skills. - M1: Thread projectID into determineScopeSearchOrder so project- scoped skills are found during default batch resolve search order (user > project > global > core). - M3: Always filter on scope_id in GetSkillBySlug (including empty string for global/core) to prevent cross-tenant data leaks. Also filter on status=active to exclude archived skills from resolution. - L2: Replace deprecated strings.Title with custom skillDisplayName. - L5: Use sync.Once for thread-safe transfer client initialization. - L6: Distinguish skill-found-but-version-not-resolved from skill- not-found in resolveSkill error messages. - L7: Validate scope field against allowed values in createSkill.
Implement HubSkillResolver that bridges hubclient.SkillService to the agent.SkillResolver interface. The resolver maps Hub response types (hubclient.ResolvedSkill, DownloadURLInfo) to agent-internal types (agent.ResolvedSkill, ResolvedFile), preserving As from the original SkillReference (not from the Hub response). Includes unit tests covering type mapping, As/Optional preservation, per-skill errors, transport errors, multiple skills, and empty refs.
- Add ContextWithResolveProjectID/UserID context injection functions
in skill_resolver.go for identity propagation.
- Update provision.go to populate ResolveOpts.ProjectID from context
(falling back to config.ReadProjectID) and UserID from context.
- Inject HubSkillResolver in cmd/create.go local mode when a Hub
client is available (lightweight EnsureHubReady with SkipSync).
- Inject HubSkillResolver in broker handlers.go from the Hub
connection before both Provision and Start paths.
- Add ResolverNamer interface so the resolution record captures the
actual resolver name ("hub") instead of hardcoded "mock".
H1 fix: In provision.go error iteration, when findRefByURI returns nil (unmatched URI), the condition `ref != nil && !ref.Optional` was false, silently skipping the error as optional. Changed to `ref == nil || !ref.Optional` so unmatched URIs are treated as required (fail-closed). M1 fix: Wire ContextWithResolveUserID in broker handlers.go from req.UserID. Local CLI mode has no user identity source on the Hub client, so UserID remains empty there (acceptable — scoped resolution is optional).
Add content-hash-keyed caching for resolved skills so the broker doesn't re-download identical skill files on every provision. Reuses the existing templatecache.Cache infrastructure with a separate cache instance (500MB default) under ~/.scion/cache/skills/. - CachingSkillResolver decorator wraps HubSkillResolver, injects cache into context, and logs cache hit/miss per resolved skill - installOneSkill checks cache before downloading (copy on hit) and populates cache after successful download+verify - Broker server initializes skCache in initHubIntegration() alongside the template and harness-config caches - Handler wiring wraps resolver with CachingSkillResolver when cache is available - latest/range constraints always re-resolve against the Hub but skip download when the content hash matches a cached entry
Wire SkillFilter.Tags in the store adapter with AND semantics (all tags must match). Extend search to also match against the tags JSON field. Parse tags query param in the list handler, add Tags to hub client ListSkillsOptions, and add --tags flag to CLI list command.
Add deprecateSkillVersion handler (POST .../versions/{vid}/deprecate)
that transitions a published version to deprecated with a message and
optional replacement URI. Add DeprecateVersion to hub client and
'scion skills deprecate' CLI command.
Extend ResolvedSkillResponse, hubclient.ResolvedSkill, and agent ResolvedSkill/SkillResolutionEntry with deprecation fields. Populate deprecation metadata in handleSkillsResolve. Map fields through the hub skill resolver. Emit deprecation warnings via util.Warnf during skill installation. Update ResolveSkillVersion to include deprecated versions so they still resolve (warnings, not errors).
Add download_count field to SkillVersion ent schema and regenerate. Add DownloadCount to store model, IncrementSkillVersionDownloadCount to store interface and ent adapter (atomic AddDownloadCount(1)). Fire-and- forget increment in handleSkillsResolve goroutine. Display download count per version in CLI show output.
Add tags column to skill list output. Fix deprecation warning to use fmt.Fprintln(os.Stderr) instead of nonexistent util.Warnf. Add DeprecateVersion to mock SkillService to satisfy updated interface.
M6-M1: latest and constraint resolution now prefer published versions
over deprecated. Pinned exact-version lookups still return deprecated
versions (correct: caller asked for a specific version and gets a
warning). Fallback to deprecated only when no published version matches.
M6-M2: Tag filtering wraps tag values in JSON quotes ("tag") to avoid
substring false positives (e.g. filtering by "review" no longer matches
skills tagged "preview").
M6-L1: Add server-side validation that deprecation message is non-empty. M6-L4: Use TagsContainsFold (case-insensitive) for tag filtering to match the case-insensitive behavior of search.
After resolver.Resolve(), validate that every requested skill URI appears in either the resolved or error sets, reject unrequested/duplicate skills. Fix HubSkillResolver to skip unknown URIs instead of silently accepting zero-value refs. Enumerate local skills before installing registry skills to prevent M2 resolution record duplication.
Add authorization checks to skill read endpoints that were missing them: - listSkills: filter out skills where ActionRead is denied - getSkill: check ActionRead before returning, 404 on denial - handleSkillDownload: check ActionRead before generating URLs - handleSkillsResolve: check ActionRead per resolved skill - handleSkillResolveSingle: check ActionRead before resolving Fix user scope authorization gap in createSkill: the user scope fell through with no check, allowing arbitrary scopeId. Now enforces scopeId = authenticated user's ID. Add batch resolve item cap (max 50) to prevent abuse. Add tests for all authorization behaviors.
Add cache hit hash verification in installOneSkill: after CopyToDir, verify each file's SHA-256 matches the expected hash. On mismatch, remove the cached copy and fall through to download. Add nil cache guard to NewCachingSkillResolver: panic early if cache is nil to prevent subtle nil-pointer bugs downstream.
…sion These two read endpoints were missing ActionRead checks. Both now fetch the parent skill, check ActionRead, and return 404 on denial, matching the pattern used by all other skill read endpoints.
Summary
Implements Milestone 1 of the Skill Bank feature: the provisioning-side framework that enables templates to declare
skills:references inscion-agent.yaml. Skills are resolved via aSkillResolverinterface, downloaded, verified, and atomically installed into the agent's skill directory.This milestone is tested with a mock resolver — real Hub integration comes in M3.
Commits
SkillReferencestruct inpkg/api/types.go,ParseSkillURIparser supporting full/shorthand/alias/bare name forms,ValidateSkillName(kebab-case, 1-64 chars),MergeScionConfigSkills appendSkillResolverinterface with batch resolve and per-skill errors, context injection (ContextWithSkillResolver),installResolvedSkillswith staging + hash verification + atomic rename,downloadSkillFilewith transport constraintsprovision.go, fail-closed logic, resolution record writing, container-script manifest extension (ResolvedSkillsinProvisionInputs)enumerateLocalSkills, resolver error propagation testsSafety Contracts Enforced
transfer.HashFile(), bundle hash viatransfer.ComputeContentHash().., no abs, no backslashes, no NUL, no reserved names), atomic renameagentHome/.scion/resolved-skills.jsonwith exact versions, hashes, file manifestsFiles Changed
pkg/api/types.goSkills []SkillReferencetoScionConfig; addSkillReferencestructpkg/api/skill_uri.go(new)ParseSkillURI,ValidateSkillName,SkillURItypepkg/api/skill_uri_test.go(new)pkg/agent/skill_resolver.go(new)SkillResolverinterface, resolved types, context injection, download/stage/verify/install logicpkg/agent/skill_resolver_test.go(new)pkg/config/templates.goMergeScionConfigSkills appendpkg/config/templates_test.gopkg/agent/provision.gopkg/agent/provision_test.gopkg/harness/container_script_harness.goResolvedSkillsinProvisionInputs, manifest reflectionTest Plan
httptest.NewTLSServer, hash verification pass/fail../, absolute paths, backslashes, NUL bytes, reserved namesskills:field parsing roundtripTestProvisionAgent*tests pass unchanged (no regression)