
Architecture

Codebuff is built on a multi-agent orchestration architecture in which specialized AI agents coordinate to complete coding tasks. Rather than relying on a single large model for everything, Codebuff routes each part of a task to a purpose-built agent and coordinates their work.

High-Level Overview

When you ask Codebuff to make a change, here’s what happens:
  1. Base Agent receives your request and plans the approach
  2. Specialized agents are spawned to gather context (file-picker, code-searcher)
  3. Editor Agent implements the changes based on gathered context
  4. Reviewer Agent validates the changes
  5. Commander Agent runs tests and validation commands
This orchestration approach gives you better context understanding, more accurate edits, and fewer errors compared to single-model tools.
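The five steps above can be sketched as a simple pipeline. This is an illustrative model only, not the actual Codebuff implementation (the real orchestration is dynamic and agent-driven); the stage names mirror the list above.

```typescript
// Hypothetical sketch of the orchestration flow described above.
// Each stage stands in for an agent; in reality each agent also receives
// the accumulated output of the earlier stages, not just the raw task.
type Stage = (task: string) => string

const stages: [string, Stage][] = [
  ['base',      (t) => `plan for: ${t}`],
  ['context',   (t) => `context for: ${t}`], // file-picker + code-searcher
  ['editor',    (t) => `edits for: ${t}`],
  ['reviewer',  (t) => `review of: ${t}`],
  ['commander', (t) => `test run for: ${t}`],
]

function orchestrate(task: string): string[] {
  return stages.map(([name, run]) => `${name}: ${run(task)}`)
}
```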

Core Components

CLI (Command Line Interface)

The CLI is the primary user interface for Codebuff. Located in cli/src/, it provides:
  • Interactive chat interface for natural language coding requests
  • Command system (/init, /usage, etc.) for workspace management
  • Real-time feedback as agents work on your code
  • Git integration for tracking changes
Key files:
  • cli/src/index.tsx - Entry point
  • cli/src/chat.tsx - Main chat interface (49KB, handles all interaction)
  • cli/src/app.tsx - Application shell

SDK (Software Development Kit)

The SDK (sdk/src/) provides the core runtime for executing agents and can be embedded in any Node.js application:
import { CodebuffClient } from '@codebuff/sdk'

const client = new CodebuffClient({
  apiKey: 'your-api-key',
  cwd: '/path/to/project',
})

const result = await client.run({
  agent: 'base',
  prompt: 'Add error handling to all API endpoints',
})
Key SDK files:
  • sdk/src/client.ts - Main client interface
  • sdk/src/run.ts - Agent execution engine (23KB)
  • sdk/src/run-state.ts - State management (22KB)
  • sdk/src/tools/ - Built-in tool implementations

Agents

Agents are the specialized workers in Codebuff. Each agent is defined in agents/ and has:
  • Unique purpose (finding files, editing code, reviewing changes)
  • Specific tools it can use
  • Custom prompts tailored to its task
  • Ability to spawn sub-agents for delegation
See Agents for detailed information on each agent type.

Tools

Tools are the actions agents can perform, implemented in sdk/src/tools/:
  • File operations - read_files, write_file, str_replace
  • Code analysis - code_search, glob, list_directory
  • Terminal - run_terminal_command
  • Agent spawning - spawn_agents
See Tools for the complete tool reference.
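To make the file-operation tools concrete, here is a minimal sketch of str_replace-style edit semantics: replace exactly one occurrence of an old string with a new one, rejecting ambiguous matches. This is an illustration of the general technique, not Codebuff's actual implementation in sdk/src/tools/.

```typescript
// Illustrative str_replace semantics: the old string must appear exactly
// once, so the edit is unambiguous.
function strReplace(content: string, oldStr: string, newStr: string): string {
  const first = content.indexOf(oldStr)
  if (first === -1) throw new Error('old string not found')
  // A second occurrence would make the edit ambiguous.
  if (content.indexOf(oldStr, first + 1) !== -1) {
    throw new Error('old string is not unique')
  }
  return content.slice(0, first) + newStr + content.slice(first + oldStr.length)
}
```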

Data Flow

Here’s how data flows through Codebuff during a typical request:
┌─────────────┐
│    User     │
│   Prompt    │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────┐
│         Base Agent (base2)          │
│  - Receives prompt                  │
│  - Plans approach                   │
│  - Coordinates subagents            │
└──────┬──────────────────────────────┘
       │
       ├─────────────────────────┬──────────────────┐
       ▼                         ▼                  ▼
┌────────────┐          ┌──────────────┐   ┌────────────┐
│File Picker │          │Code Searcher │   │ Commander  │
│ (parallel) │          │  (parallel)  │   │  (tests)   │
└─────┬──────┘          └──────┬───────┘   └─────┬──────┘
      │                        │                  │
      └────────────┬───────────┘                  │
                   ▼                              │
           ┌───────────────┐                      │
           │ Context Pool  │                      │
           │ (files read)  │                      │
           └───────┬───────┘                      │
                   ▼                              │
           ┌───────────────┐                      │
           │ Editor Agent  │                      │
           │ (implements)  │                      │
           └───────┬───────┘                      │
                   ▼                              │
           ┌───────────────┐                      │
           │ Code Reviewer │◄─────────────────────┘
           │  (validates)  │
           └───────┬───────┘
                    ▼
           ┌───────────────┐
           │   Response    │
           │   to User     │
           └───────────────┘

Execution Model

Agent Lifecycle

Each agent follows this lifecycle:
  1. Initialization - Agent receives prompt and params
  2. handleSteps execution - Generator function controls agent behavior
  3. Tool calls - Agent uses tools to read files, spawn agents, etc.
  4. LLM steps - Agent sends messages to LLM for decision-making
  5. Output - Agent returns result (last_message, all_messages, or structured_output)

Step-by-Step Execution

Agents use TypeScript generator functions to control execution:
handleSteps: function* ({ agentState, prompt, params, logger }) {
  // Step 1: Read some files
  yield {
    toolName: 'read_files',
    input: { paths: ['src/index.ts'] }
  }
  
  // Step 2: Let the LLM think and respond
  yield 'STEP'
  
  // Step 3: Spawn a subagent
  yield {
    toolName: 'spawn_agents',
    input: {
      agents: [{
        agent_type: 'editor',
        prompt: 'Add error handling'
      }]
    }
  }
  
  // Step 4: Another LLM step
  yield 'STEP'
}
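The runtime drives such a generator step by step: tool-call yields are executed against the tool implementations, while `'STEP'` yields hand control to the LLM. The driver below is a simplified, hypothetical sketch of that loop, not the actual engine in sdk/src/run.ts.

```typescript
// Hypothetical driver for a handleSteps-style generator.
type ToolCall = { toolName: string; input: unknown }
type StepYield = ToolCall | 'STEP'

function drive(
  gen: Generator<StepYield>,
  runTool: (call: ToolCall) => unknown,
  runLlmStep: () => unknown,
): unknown[] {
  const results: unknown[] = []
  for (const step of gen) {
    // 'STEP' hands control to the LLM; anything else is a tool call.
    results.push(step === 'STEP' ? runLlmStep() : runTool(step))
  }
  return results
}
```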

Context Management

Codebuff automatically manages context through the Context Pruner agent:
  • Runs automatically between steps
  • Summarizes conversation when context limit is approaching
  • Preserves important information (user messages, file changes, errors)
  • Triggers on cache misses (>5 min gaps) to take advantage of fresh context
From agents/context-pruner.ts:
const maxContextLength: number = params?.maxContextLength ?? 200_000
// Prune when exceeding limit OR when cache will miss
if (agentState.contextTokenCount + TOKEN_COUNT_FUDGE_FACTOR <= maxContextLength && !cacheWillMiss) {
  // No pruning needed
  return
}
// Otherwise, summarize conversation history...
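Restated as a standalone predicate, the pruning decision looks like this. The fudge-factor value here is illustrative, not the constant used in agents/context-pruner.ts.

```typescript
// Prune when the padded token count exceeds the limit OR the prompt cache
// is about to miss anyway, so re-summarizing costs little extra.
const TOKEN_COUNT_FUDGE_FACTOR = 1_000 // illustrative value

function shouldPrune(
  contextTokenCount: number,
  maxContextLength: number,
  cacheWillMiss: boolean,
): boolean {
  return (
    contextTokenCount + TOKEN_COUNT_FUDGE_FACTOR > maxContextLength ||
    cacheWillMiss
  )
}
```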

Model Selection

Different agents use different models optimized for their tasks:
  • Base Agent (base2): claude-opus-4.6 or minimax-m2.5 (free mode)
  • Editor: claude-opus-4.6 or gpt-5.1
  • File Picker: gemini-2.5-flash-lite (fast, cost-effective)
  • Commander: gemini-3.1-flash-lite-preview (quick command execution)
  • Context Pruner: gpt-5-mini (efficient summarization)
Codebuff supports any model on OpenRouter. You can customize agents to use different models based on your needs and budget.
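For example, a cost-conscious agent could pin a cheaper OpenRouter model. The definition below is hypothetical; the field names follow the custom-agent shape shown in the Extension Points section.

```typescript
// Hypothetical agent definition pinning a cheaper OpenRouter model id.
const budgetEditor = {
  id: 'budget-editor',
  displayName: 'Budget Editor',
  // Any OpenRouter model id ('provider/model') can go here.
  model: 'google/gemini-2.5-flash',
  toolNames: ['read_files', 'write_file', 'str_replace'],
  instructionsPrompt: 'Edit code conservatively and explain each change.',
}
```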

Extension Points

Codebuff is designed to be extended:

Custom Agents

Create agents in .agents/ directory:
export default {
  id: 'my-agent',
  displayName: 'My Custom Agent',
  model: 'openai/gpt-5.1',
  toolNames: ['read_files', 'write_file'],
  instructionsPrompt: 'Your instructions here...',
}

Custom Tools

Provide custom tool implementations via the SDK:
const customTool = {
  name: 'my_tool',
  description: 'Does something custom',
  inputSchema: { /* ... */ },
  handler: async (params) => {
    // Tool implementation — compute a result from the validated params
    const result = { ok: true, received: params }
    return [{ type: 'json', value: result }]
  },
}

await client.run({
  agent: 'base',
  prompt: 'Do something that needs my_tool',
  customToolDefinitions: [customTool],
})

MCP Servers

Integrate Model Context Protocol servers for external tools:
const agent = {
  id: 'my-agent',
  mcpServers: {
    'filesystem': {
      command: 'npx',
      args: ['-y', '@modelcontextprotocol/server-filesystem', '/path']
    }
  },
  toolNames: ['filesystem/read_file', 'filesystem/write_file'],
}

Performance Considerations

Parallel Agent Spawning

The base agent spawns multiple agents in parallel when they don’t depend on each other:
// From base2.ts system prompt:
// "Spawn multiple agents in parallel: This increases the speed 
// of your response AND allows you to be more comprehensive"
Example: Spawning 3 file-pickers simultaneously to explore different parts of the codebase.
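A fan-out like that would be a single spawn_agents call whose input lists the agents to run in parallel. The prompts below are hypothetical; the input shape follows the spawn_agents example earlier in this page.

```typescript
// Illustrative spawn_agents input: three file-pickers explored in parallel,
// one per area of the codebase.
const spawnInput = {
  agents: [
    { agent_type: 'file-picker', prompt: 'Find files related to the CLI chat loop' },
    { agent_type: 'file-picker', prompt: 'Find files related to SDK tool execution' },
    { agent_type: 'file-picker', prompt: 'Find files related to agent definitions' },
  ],
}
```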

Prompt Caching

Agents leverage prompt caching (Anthropic’s 5-minute cache window):
  • System prompts are cached
  • File tree is cached
  • Context pruner triggers on cache misses to refresh context
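The ">5 min gap" heuristic mentioned under Context Management follows directly from the cache window: a request arriving after the window has elapsed will miss the cache. A minimal sketch of that check:

```typescript
// With a 5-minute prompt-cache window, any gap longer than the window
// means the next request will miss the cache.
const CACHE_WINDOW_MS = 5 * 60 * 1000

function cacheWillMiss(lastRequestAt: number, now: number): boolean {
  return now - lastRequestAt > CACHE_WINDOW_MS
}
```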

Token Efficiency

Smaller, faster models are used for routine tasks:
  • File discovery uses Gemini Flash
  • Command execution uses Gemini Flash Lite
  • Only complex reasoning uses Opus/GPT-5

Next Steps

Agents

Learn about each specialized agent

Multi-Agent Orchestration

See how agents work together

Tools

Explore available tools

Creating Agents

Build your own agents
