Skip to content

feat: Knowledge Extraction — Phase 1 spec + workflow#31

Open
khaliqgant wants to merge 1 commit intomainfrom
feat/knowledge-extraction
Open

feat: Knowledge Extraction — Phase 1 spec + workflow#31
khaliqgant wants to merge 1 commit intomainfrom
feat/knowledge-extraction

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

@khaliqgant khaliqgant commented Mar 30, 2026

Knowledge Extraction — Phase 1

Problem

Multiple agents working on the same codebase each start cold. Agent 3 discovers conventions (Vitest, prices in cents, shamefully-hoist). Agent 7 starts 20 minutes later with none of this context.

Solution

Path-scoped knowledge annotations that agents write and query through the relayfile API. The relay broker can later inject relevant knowledge into task preambles at dispatch time.

What's in this PR

  1. Updated spec (docs/knowledge-extraction-spec.md) — rewritten for Go server architecture, focused on Phase 1
  2. Workflow (workflows/knowledge-extraction.ts) — agent-relay workflow to implement Phase 1

Phase 1 scope (what the workflow builds)

  • Go server: knowledge_annotations table + 6 CRUD endpoints
  • TypeScript SDK: writeKnowledge, queryKnowledge, confirmKnowledge, supersedeKnowledge, deleteKnowledge
  • Python SDK: matching methods
  • OpenAPI spec updates
  • Tests

What changed from old PR #14

  • Removed unrelated packages/relayfile bin/runner change (separate concern)
  • Updated storage model for Go server (was D1/KV, now memory/postgres backends)
  • Workflow rewritten following agent-relay best practices:
    • require() not import (CJS)
    • Deterministic verify gates between impl and review
    • Output constraints on every step
    • DAG parallelism (server + SDK build concurrently)
    • Lead+worker pattern with channel communication
    • onError: 'continue' for resilience
  • Spec scoped to Phase 1 only (storage + API + SDK). Phases 2-4 (auto-extraction, trajectory mining, broker injection) documented but deferred.

Closes #14 (supersedes)


Open with Devin

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment on lines +162 to +178
command: [
`cd ${RELAYFILE}`,
// Check Go server compiles
`echo "=== Go build ===" && make build 2>&1 | tail -3`,
// Check Go tests pass
`echo "=== Go test ===" && go test ./internal/... 2>&1 | tail -10`,
// Check TS SDK builds and tests
`echo "=== TS SDK build ===" && cd packages/sdk/typescript && npm run build 2>&1 | tail -3`,
`echo "=== TS SDK test ===" && npm test 2>&1 | tail -10`,
// Check critical files exist
`echo "=== File check ==="`,
`cd ${RELAYFILE}`,
`missing=0`,
`for f in internal/httpapi/knowledge.go packages/sdk/typescript/src/types.ts; do if [ ! -f "$f" ]; then echo "MISSING: $f"; missing=$((missing+1)); fi; done`,
`if [ $missing -gt 0 ]; then echo "$missing files missing"; exit 1; fi`,
`echo "All files present, build passed"`,
].join(' && '),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Piping build/test commands to tail masks their exit codes, making failOnError: true ineffective

In the verify-build step, commands like make build 2>&1 | tail -3 and go test ./internal/... 2>&1 | tail -10 pipe their output through tail. In bash, the exit code of a pipeline is the exit code of the last command (unless set -o pipefail is enabled, which it isn't). Since tail almost always exits 0 regardless of the upstream command's exit code, build and test failures are silently swallowed. This means the failOnError: true on line 180 can never actually trigger on a build/test failure — the step will always pass, and downstream steps (update-openapi, commit, review) will proceed even with broken builds.

Affected commands (lines 165-170)
  • make build 2>&1 | tail -3 — Go build failure masked
  • go test ./internal/... 2>&1 | tail -10 — Go test failure masked
  • npm run build 2>&1 | tail -3 — TS SDK build failure masked
  • npm test 2>&1 | tail -10 — TS SDK test failure masked

Notably, the existing relayfile-bulk-and-export.ts:352-360 uses a similar | tail pattern but deliberately sets failOnError: false and manually echoes the exit code with ; echo "EXIT: $?" — confirming the pipe masking is a known concern in this repo.

Suggested change
command: [
`cd ${RELAYFILE}`,
// Check Go server compiles
`echo "=== Go build ===" && make build 2>&1 | tail -3`,
// Check Go tests pass
`echo "=== Go test ===" && go test ./internal/... 2>&1 | tail -10`,
// Check TS SDK builds and tests
`echo "=== TS SDK build ===" && cd packages/sdk/typescript && npm run build 2>&1 | tail -3`,
`echo "=== TS SDK test ===" && npm test 2>&1 | tail -10`,
// Check critical files exist
`echo "=== File check ==="`,
`cd ${RELAYFILE}`,
`missing=0`,
`for f in internal/httpapi/knowledge.go packages/sdk/typescript/src/types.ts; do if [ ! -f "$f" ]; then echo "MISSING: $f"; missing=$((missing+1)); fi; done`,
`if [ $missing -gt 0 ]; then echo "$missing files missing"; exit 1; fi`,
`echo "All files present, build passed"`,
].join(' && '),
command: [
`cd ${RELAYFILE}`,
// Check Go server compiles
`echo "=== Go build ==="`,
`make build 2>&1; build_exit=$?; if [ $build_exit -ne 0 ]; then exit $build_exit; fi`,
// Check Go tests pass
`echo "=== Go test ==="`,
`go test ./internal/... 2>&1; test_exit=$?; if [ $test_exit -ne 0 ]; then exit $test_exit; fi`,
// Check TS SDK builds and tests
`echo "=== TS SDK build ==="`,
`cd packages/sdk/typescript && npm run build 2>&1; sdk_build=$?; if [ $sdk_build -ne 0 ]; then exit $sdk_build; fi`,
`echo "=== TS SDK test ==="`,
`npm test 2>&1; sdk_test=$?; if [ $sdk_test -ne 0 ]; then exit $sdk_test; fi`,
// Check critical files exist
`echo "=== File check ==="`,
`cd ${RELAYFILE}`,
`missing=0`,
`for f in internal/httpapi/knowledge.go packages/sdk/typescript/src/types.ts; do if [ ! -f "$f" ]; then echo "MISSING: $f"; missing=$((missing+1)); fi; done`,
`if [ $missing -gt 0 ]; then echo "$missing files missing"; exit 1; fi`,
`echo "All files present, build passed"`,
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

// ── Step 7: Review ──
.step('review', {
agent: 'reviewer',
dependsOn: ['commit'],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Review step runs after commit+push, so it cannot gate bad code

The review step (line 222) has dependsOn: ['commit'], meaning the code is committed and pushed to the remote branch before the reviewer agent runs. If the review finds issues (e.g., broken handler patterns, missing tests, incorrect types), the code is already pushed. This defeats the purpose of the review.

In contrast, the existing relayfile-knowledge-extraction.ts:205-225 correctly orders commit after the final review step (dependsOn: ['final-review']), ensuring review gates the commit.

Prompt for agents
In workflows/knowledge-extraction.ts, swap the dependency order of the commit and review steps so that review gates the commit:

1. Line 209: Change the commit step's dependsOn from ['update-openapi'] to ['review']
2. Line 224: Change the review step's dependsOn from ['commit'] to ['update-openapi']

This ensures that code is only committed and pushed after the reviewer has checked it.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant