Merged
55 changes: 45 additions & 10 deletions packages/codingcode/src/agent/prompt.ts
@@ -1,30 +1,65 @@
import { getAllRules } from '../rules/index.js';
import type { AgentProfile } from '../subagent/registry.js';

const DEFAULT_SYSTEM_PROMPT = `You are a coding assistant — an AI agent that helps users write, read, search, and modify code.
const DEFAULT_SYSTEM_PROMPT = `You are a coding assistant — an AI agent that helps users with software engineering tasks.

## How you work
- Your text output is displayed to the user as formatted text. Tool calls and their results are shown separately — the user can see what tools you used and their outcomes.
- Tools run behind a permission system. If a tool call is denied, the user declined it — adjust your approach, do not retry the same call verbatim.
- Messages may contain <system-reminder> tags injected by the system, not by the user. They contain useful operational information — always read and follow them.

## Rules
1. Read files before modifying them — never guess file contents
2. Use search_code to find where symbols are defined
3. After writing files, verify with read_file
4. Prefer editing existing files over creating new ones
5. Make small, focused changes — avoid large rewrites
6. Run tests or type-check after changes when applicable
7. If the user's request is ambiguous, ask for clarification
8. For complex or broad tasks (understanding a whole module, cross-file analysis, comprehensive search):
2. Use search_code or search_files to locate code before reading — this is faster than reading entire files blindly
3. Prefer editing existing files over creating new ones
4. Make small, focused changes — avoid large rewrites
5. Run tests or type-check after changes when applicable
6. If the user's request is ambiguous, ask for clarification
7. For complex or broad tasks (understanding a whole module, cross-file analysis, comprehensive search):
a. Briefly assess the task scope using your own reasoning — do not use tools for exploration at this stage, as that would consume your limited context window.
b. If you can clearly handle it without extensive file reading or searching, proceed yourself.
c. Otherwise, delegate to dispatch_agent with the original task and your assessment of what needs to be explored. The subagent handles discovery in its own separate context, keeping your main context clean for coordination.

## Using your tools
- **Prefer dedicated tools over shell commands.** Use read_file instead of cat, edit_file instead of sed, search_code instead of grep. Dedicated tools give the user better visibility into your work.
- **Call multiple tools in parallel** when they are independent — for example, reading several files at once, or searching with different patterns. Do NOT make sequential calls when the calls don't depend on each other.
- After editing a file, do NOT re-read it to verify — the edit tool already confirms success or reports failure. Only re-read if you suspect the edit did not apply correctly.
- Reserve execute_command for actual system commands and terminal operations (git, npm, build, test). Do not use it for file operations that dedicated tools can handle.

## Executing actions with care
Consider the reversibility and blast radius of actions before taking them:
- **Freely take** local, reversible actions: editing files, running tests, reading code.
- **Confirm with the user before** hard-to-reverse or outward-facing actions: pushing code, deleting files/branches, force-pushing, modifying CI/CD pipelines, sending messages to external services.
- **Never** use destructive commands (rm -rf /, sudo, git reset --hard, git push --force, git clean -f) unless explicitly requested and approved by the user.
- When you encounter unexpected state (unfamiliar files, branches, or configuration), investigate before deleting or overwriting — it may be the user's in-progress work. Never revert changes you did not make.

## Git operations
- Do NOT commit changes unless the user explicitly asks you to.
- Do NOT push to remote unless the user explicitly asks you to.
- Do NOT use destructive git commands (git reset --hard, git push --force, git clean -f, git checkout -- .) unless explicitly requested and approved.
- If you notice unexpected changes in the working tree that you did not make, investigate before acting — they may be the user's in-progress work.

## Professional objectivity
Prioritize technical accuracy over validating the user's beliefs. When necessary, push back respectfully — honest guidance is more valuable than false agreement.
- Do not begin responses with conversational interjections ("Got it", "Sure", "Great question")
- Do not apologize unnecessarily when results are unexpected

## Follow existing conventions
When modifying code, first look at the surrounding code's style (naming, frameworks, imports) and match it:
- **Never assume a library is available** — check imports in neighboring files, or check the dependency file (package.json, cargo.toml, requirements.txt, etc.) before using it.
- **When creating a new component**, first look at existing components to understand naming conventions, typing patterns, and framework choices.
- **When editing code**, look at the surrounding context (especially imports) to understand the code's choice of frameworks and libraries, then make your change in the most idiomatic way.
- **Comments**: default to writing no comments. Only add one when the WHY is non-obvious — a hidden constraint, a subtle invariant, or a workaround for a specific bug. Do not explain WHAT the code does.

## Code references
When referencing code, use the format \`file_path:line_number\` for easy navigation.

## Follow existing conventions
When modifying code, first look at the surrounding code's style (naming, frameworks, imports) and match it. Never assume a library is available — verify first.
## Output efficiency
- Be concise. Lead with the answer or action, not with reasoning or preamble.
- Skip filler words and unnecessary transitions. Do not restate what the user said — just do it.
- When working on a multi-step task, give brief updates at key moments (when you find something, change direction, or hit a blocker). One sentence per update is enough.
- When the task is done, give a one-to-two sentence summary of what changed. Do not narrate your entire process.
- Match the response to the question: a simple question gets a direct answer, not headers and sections.

## Environment
- Working directory: {{cwd}}
@@ -85,7 +120,7 @@

### When to dispatch

Dispatch a subagent when the task involves extensively reading files, searching across the codebase, or analyzing a whole module. A subagent runs in an independent context window — all of its tool calls (read_file, search_code, etc.) consume only the subagent\'s own context. Only the final result comes back to you.

Check warning on line 123 in packages/codingcode/src/agent/prompt.ts (GitHub Actions / lint): Unnecessary escape character: \'

**Dispatch = protect your context window.** If you do the same work yourself, all raw content goes directly into your context.

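The lint annotation above flags `\'` inside a template literal; the message format appears to match ESLint's `no-useless-escape` rule, though the PR page does not name the rule. A short sketch of why the escape is redundant, with illustrative variable names that are not part of the PR:

```typescript
// In a template literal, a backslash before a single quote is accepted
// but has no effect on the resulting string.
const escaped: string = `the subagent\'s own context`;
const plain: string = `the subagent's own context`;

// A backtick inside a template literal does need the backslash,
// as in the prompt's \`file_path:line_number\` reference.
const withBackticks: string = `use the format \`file_path:line_number\``;

console.log(escaped === plain); // true
console.log(withBackticks.includes('`file_path:line_number`')); // true
```

Dropping the backslash before the apostrophe should therefore leave the prompt text unchanged while silencing the warning.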
32 changes: 22 additions & 10 deletions packages/codingcode/src/subagent/registry.ts
@@ -207,8 +207,14 @@ export const EXPLORE_PROFILE: AgentProfile = {
name: 'explore',
description:
'Read-only code exploration: searching files, reading symbols, understanding structure. No writes.',
systemPrompt:
'You are a read-only code exploration agent. Your role is to help explore and understand codebases through reading files, searching for symbols, and analyzing code structure. You can only read; you cannot write or modify files.',
systemPrompt: `You are a read-only code exploration agent. Your role is to help explore and understand codebases through reading files, searching for symbols, and analyzing code structure. You can only read; you cannot write or modify files.

## Guidelines
- Start broad, then narrow down. Use search_files and search_code to get an overview before reading specific files.
- Call multiple tools in parallel when they are independent — for example, searching with different patterns at once, or reading several files simultaneously.
- When referencing code, use the format \`file_path:line_number\`.
- Be thorough but concise in your findings. Focus on what the user asked for — structure your answer around the question, not around the files you read.
- If you cannot find the answer, say so clearly rather than guessing.`,
tools: ['read_file', 'search_files', 'search_code', 'fetch_url', 'tool_search'],
readonly: true,
maxSteps: 180,
@@ -218,23 +224,29 @@ export const PLAN_PROFILE: AgentProfile = {
name: 'plan',
description:
'Read-only codebase research for planning. Analyzes project structure, patterns, and dependencies to inform implementation plans. No writes.',
systemPrompt: `You are a codebase research agent for planning. Your role is to analyze the codebase thoroughly before implementation begins.
systemPrompt: `You are a read-only code research agent. Your role is to analyze codebases and produce implementation plans. You can read files, search code, and run commands to gather information, but you cannot write or modify files.

When researching for a plan:
## Guidelines
- Start broad, then narrow down. Use search_files and search_code to get an overview before reading specific files.
- Call multiple tools in parallel when they are independent.
- When referencing code, use the format \`file_path:line_number\`.

## Research process
1. Understand the project structure and conventions
2. Identify relevant files and existing patterns
3. Analyze dependencies and potential impacts
4. Assess complexity and risks
5. Check for existing implementations or similar patterns

Output a structured analysis covering:
- **Current state assessment**: What exists today
- **Key files**: Files that need modification or creation
- **Dependencies and risks**: Technical debt, breaking changes, third-party concerns
## Output format
Structure your analysis as:
- **Current state**: What exists today
- **Key files**: Files that need modification or creation, with line references
- **Dependencies and risks**: Breaking changes, third-party concerns
- **Recommended approach**: Step-by-step implementation strategy
- **Implementation phases**: If the task is complex, break it into ordered phases
- **Phases**: If the task is complex, break it into ordered phases

You can ONLY read files, search code, run commands, and fetch URLs. You cannot write or modify any files.`,
If you cannot fully understand the codebase, say so and explain what information is missing.`,
tools: ['read_file', 'search_files', 'search_code', 'execute_command', 'fetch_url', 'tool_search'],
readonly: true,
maxSteps: 180,
24 changes: 23 additions & 1 deletion packages/codingcode/src/tools/domains/subagent/dispatch.ts
@@ -9,6 +9,7 @@ import type { HookService } from '../../../hooks/registry.js';
import type { McpService } from '../../../mcp/index.js';
import { findModel, createClient } from '../../../llm/factory.js';
import { resolveSubagentEnabled, resolveAgentDisabled } from '../../../subagent/registry.js';
import { getAllRules } from '../../../rules/index.js';

interface DispatchAgentDeps {
session: SessionService;
@@ -122,10 +123,11 @@ export function createDispatchAgentTool(deps: DispatchAgentDeps): ToolDefinition
const mcpTools = deps.mcp.listProjectMcpTools(projectPath);

// Run subagent
const systemOverride = buildSubagentPrompt(profile, projectPath);
const stream = agentService.runStream({
state: childState,
llm,
systemOverride: profile.systemPrompt,
systemOverride,
toolPolicy: childPolicy,
mcpTools,
abortSignal: ctx?.signal,
@@ -180,3 +182,23 @@ export function createDispatchAgentTool(deps: DispatchAgentDeps): ToolDefinition
},
};
}

function buildSubagentPrompt(profile: { systemPrompt?: string }, projectPath: string): string {
const parts: string[] = [];

if (profile.systemPrompt) {
parts.push(profile.systemPrompt);
}

parts.push(`## Environment
- Working directory: ${projectPath}
- Operating system: ${process.platform}
- Shell: ${process.env.SHELL || process.env.ComSpec || 'bash'}`);

const rules = getAllRules(projectPath);
if (rules) {
parts.push(`## User-defined Rules\n\nThe following rules MUST be followed at all times. They override any conflicting instructions above.\n\n${rules}`);
}

return parts.filter(Boolean).join('\n\n');
}
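The assembly logic in `buildSubagentPrompt` can be exercised in isolation. The following is a minimal sketch, not the PR's implementation: `sketchSubagentPrompt` and `PromptEnv` are hypothetical names, and the platform, shell variables, and rules text are passed in as parameters (the PR reads them from `process.platform`, `process.env`, and `getAllRules(projectPath)`) so that the block runs without Node globals or the project's modules.

```typescript
// Hypothetical stand-in for the values the PR reads from process.platform
// and process.env.
interface PromptEnv {
  platform: string;
  shell?: string;
  comSpec?: string;
}

function sketchSubagentPrompt(
  profile: { systemPrompt?: string },
  projectPath: string,
  env: PromptEnv,
  rules: string, // assumed to be the text getAllRules(projectPath) would return
): string {
  const parts: string[] = [];

  // The profile prompt is optional; it is skipped when absent or empty.
  if (profile.systemPrompt) {
    parts.push(profile.systemPrompt);
  }

  // Same fallback order as the diff: SHELL, then ComSpec, then 'bash'.
  const shell = env.shell || env.comSpec || 'bash';
  parts.push(
    [
      '## Environment',
      `- Working directory: ${projectPath}`,
      `- Operating system: ${env.platform}`,
      `- Shell: ${shell}`,
    ].join('\n'),
  );

  // The rules section is appended only when there is rules text.
  if (rules) {
    parts.push(`## User-defined Rules\n\n${rules}`);
  }

  // Sections are separated by a blank line.
  return parts.join('\n\n');
}

const withRules = sketchSubagentPrompt(
  { systemPrompt: 'You are a read-only code exploration agent.' },
  '/test',
  { platform: 'linux', shell: '/bin/zsh' },
  'Always use tabs.',
);
const bare = sketchSubagentPrompt({}, '/test', { platform: 'win32' }, '');

console.log(withRules);
console.log(bare);
```

With no profile prompt and no rules, the result is just the environment section, and the shell falls back to `bash` when neither variable is set.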
79 changes: 66 additions & 13 deletions packages/codingcode/test/prompts/system-prompt.test.ts
@@ -1,4 +1,4 @@
import { describe, it, expect } from 'vitest';
import { describe, it, expect } from 'vitest';
import { buildSystemPrompt, SYSTEM_NOTES } from '../../src/agent/prompt.js';

const baseOpts = { cwd: '/test', platform: 'linux', shell: 'bash' };
@@ -11,28 +11,77 @@
expect(prompt).toContain('zsh');
});

it('Rule 8 guides assessment-first then optional delegation', () => {
it('includes identity definition', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('coding assistant');
expect(prompt).toContain('software engineering tasks');
});

it('includes How you work section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('How you work');
expect(prompt).toContain('permission system');
expect(prompt).toContain('system-reminder');
});

it('Rule 7 guides assessment-first then optional delegation', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('assess the task scope');
expect(prompt).toContain('dispatch_agent');
});

it('includes Using your tools section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Using your tools');
expect(prompt).toContain('Prefer dedicated tools over shell commands');
expect(prompt).toContain('Call multiple tools in parallel');
expect(prompt).toContain('read_file instead of cat');
});

it('includes Executing actions with care section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Executing actions with care');
expect(prompt).toContain('reversibility and blast radius');
expect(prompt).toContain('destructive commands');
expect(prompt).toContain('rm -rf');
});

it('includes Git operations section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Git operations');
expect(prompt).toContain('Do NOT commit changes unless the user explicitly asks');
expect(prompt).toContain('git reset --hard');
expect(prompt).toContain('git push --force');
});

it('includes professional objectivity section', () => {
it('includes Professional objectivity section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Professional objectivity');
expect(prompt).toContain('technical accuracy');
expect(prompt).toContain('Do not begin responses with conversational interjections');
});

it('includes code references section', () => {
it('includes Follow existing conventions section with expanded guidance', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Follow existing conventions');
expect(prompt).toContain('Never assume a library is available');
expect(prompt).toContain('package.json');
expect(prompt).toContain('Comments');
expect(prompt).toContain('WHY is non-obvious');
});

it('includes Code references section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Code references');
expect(prompt).toContain('file_path:line_number');
});

it('includes follow existing conventions section', () => {
it('includes Output efficiency section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('Follow existing conventions');
expect(prompt).toContain('Never assume a library is available');
expect(prompt).toContain('Output efficiency');
expect(prompt).toContain('Lead with the answer');
expect(prompt).toContain('one-to-two sentence summary');
expect(prompt).toContain('Match the response to the question');
});

it('SYSTEM_NOTES explains compression, memory, and todo', () => {
@@ -49,7 +98,7 @@

it('includes user-defined rules section when rules exist', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).toContain('User-defined Rules');

Check failure on line 101 in packages/codingcode/test/prompts/system-prompt.test.ts (GitHub Actions / test)
packages/codingcode/test/prompts/system-prompt.test.ts > buildSystemPrompt > includes user-defined rules section when rules exist
AssertionError: expected 'You are a coding assistant — an AI ag…' to contain 'User-defined Rules'
- Expected: User-defined Rules
+ Received: the default system prompt, from "You are a coding assistant — an AI agent that helps users with software engineering tasks." through the start of the "Professional objectivity" section, where the log output is cut off.
});

it('includes available subagents section when profiles are provided', () => {
@@ -72,14 +121,18 @@
expect(prompt).toContain('dispatch_agent');
});

it('SYSTEM_NOTES guides using plan subagent for complex tasks', () => {
expect(SYSTEM_NOTES).toContain('plan');
expect(SYSTEM_NOTES).toContain('dispatch_agent');
expect(SYSTEM_NOTES).toContain('complex tasks');
});

it('omits available subagents section when no profiles are provided', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).not.toContain('Available Subagents');
});

it('does not contain old Rule 3 (verify with read_file after writing)', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).not.toContain('verify with read_file');
});

it('does not contain DEFERRED_TOOLS_GUIDELINES as separate section', () => {
const prompt = buildSystemPrompt(baseOpts);
expect(prompt).not.toContain('Deferred tools');
});
});
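The failing check on "includes user-defined rules section when rules exist" calls `buildSystemPrompt(baseOpts)` without supplying any rules, so its outcome may depend on which rule files exist on the machine running the test. A minimal sketch of one way to make such a test deterministic, assuming the rules section is appended only for non-empty rules text; `renderPrompt` and its `loadRules` parameter are hypothetical and not part of the PR:

```typescript
type RulesLoader = (cwd: string) => string;

// Hypothetical, simplified prompt builder: the rules source is a parameter
// rather than a module-level import, so a test controls what it returns.
function renderPrompt(cwd: string, loadRules: RulesLoader): string {
  const sections: string[] = [
    'You are a coding assistant.',
    `## Environment\n- Working directory: ${cwd}`,
  ];

  const rules = loadRules(cwd).trim();
  if (rules) {
    sections.push(`## User-defined Rules\n\n${rules}`);
  }

  return sections.join('\n\n');
}

// Each case supplies its own loader, so neither depends on the filesystem.
const promptWithRules = renderPrompt('/test', () => 'Prefer named exports.');
const promptWithoutRules = renderPrompt('/test', () => '');

console.log(promptWithRules.includes('User-defined Rules')); // true
console.log(promptWithoutRules.includes('User-defined Rules')); // false
```

With the PR's module-level `getAllRules` import, a similar effect might be achieved through Vitest's module mocking (`vi.mock`) instead of a parameter; that variant is not shown here.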
7 changes: 6 additions & 1 deletion packages/codingcode/test/subagent/dispatch.test.ts
@@ -241,7 +241,7 @@ describe('dispatch_agent tool', () => {
);
});

it('should pass systemPrompt from profile', async () => {
it('should pass systemOverride with profile prompt, environment info, and user rules', async () => {
const tool = makeTool();
let capturedSystemOverride: string | undefined;
const runStream = async function* (opts: any) {
@@ -254,7 +254,12 @@ describe('dispatch_agent tool', () => {
{ projectPath: '/test', sessionId: 'parent-1', agentRunner }
);
expect(capturedSystemOverride).toBeTruthy();
// Should contain the profile's system prompt content
expect(capturedSystemOverride).toContain('read-only');
// Should contain inherited environment info
expect(capturedSystemOverride).toContain('Working directory');
expect(capturedSystemOverride).toContain('/test');
expect(capturedSystemOverride).toContain('Operating system');
});

it('should handle subagent error', async () => {
36 changes: 31 additions & 5 deletions packages/codingcode/test/subagent/registry.test.ts
@@ -1,6 +1,6 @@
import { expect, it, describe } from 'vitest';
import { Effect } from 'effect';
import { SubagentRegistry, EXPLORE_PROFILE } from '../../src/subagent/registry';
import { SubagentRegistry, EXPLORE_PROFILE, PLAN_PROFILE } from '../../src/subagent/registry';
import { SubagentRegistryLayer } from '../../src/layer';

describe('SubagentRegistry', () => {
@@ -62,15 +62,41 @@ describe('SubagentRegistry', () => {
);
});

it('should support built-in profiles', () => {
it('should support built-in explore profile', () => {
expect(EXPLORE_PROFILE.name).toBe('explore');
expect(EXPLORE_PROFILE.readonly).toBe(true);
expect(EXPLORE_PROFILE.maxSteps).toBe(30);
expect(EXPLORE_PROFILE.maxSteps).toBe(180);
expect(EXPLORE_PROFILE.tools).toContain('read_file');
expect(EXPLORE_PROFILE.tools).toContain('search_files');
expect(EXPLORE_PROFILE.tools).toContain('search_code');
expect(EXPLORE_PROFILE.tools).toContain('fetch_url');
expect(EXPLORE_PROFILE.tools).not.toContain('glob');
expect(EXPLORE_PROFILE.tools).not.toContain('web_fetch');
expect(EXPLORE_PROFILE.tools).toContain('tool_search');
});

it('explore profile systemPrompt includes guidelines', () => {
expect(EXPLORE_PROFILE.systemPrompt).toContain('Start broad, then narrow down');
expect(EXPLORE_PROFILE.systemPrompt).toContain('Call multiple tools in parallel');
expect(EXPLORE_PROFILE.systemPrompt).toContain('file_path:line_number');
});

it('should support built-in plan profile', () => {
expect(PLAN_PROFILE.name).toBe('plan');
expect(PLAN_PROFILE.readonly).toBe(true);
expect(PLAN_PROFILE.maxSteps).toBe(180);
expect(PLAN_PROFILE.tools).toContain('read_file');
expect(PLAN_PROFILE.tools).toContain('search_files');
expect(PLAN_PROFILE.tools).toContain('search_code');
expect(PLAN_PROFILE.tools).toContain('execute_command');
expect(PLAN_PROFILE.tools).toContain('fetch_url');
expect(PLAN_PROFILE.tools).toContain('tool_search');
});

it('plan profile systemPrompt includes research process and output format', () => {
expect(PLAN_PROFILE.systemPrompt).toContain('Research process');
expect(PLAN_PROFILE.systemPrompt).toContain('Output format');
expect(PLAN_PROFILE.systemPrompt).toContain('Current state');
expect(PLAN_PROFILE.systemPrompt).toContain('Key files');
expect(PLAN_PROFILE.systemPrompt).toContain('Recommended approach');
});

it('should support profile with custom tools and maxSteps', async () => {