Agent System Overview

Longshot uses a hierarchical multi-agent system to decompose large software projects into parallel, independently executable tasks. This architecture enables massive concurrency while maintaining code quality and project coherence.

System Architecture

The system consists of four specialized agent types, each with a distinct role:

Agent Roles

Agent	Purpose	Writes Code	Scope
Root Planner	Decomposes the project into tasks through iterative discovery	No	Entire project
Subplanner	Breaks complex tasks into parallelizable subtasks	No	Delegated task slice
Worker	Implements individual tasks on isolated branches	Yes	1-5 files
Reconciler	Monitors build/test health and creates fix tasks	No	Main branch

Key Design Principles

1. Iterative Discovery Over Upfront Planning

Unlike traditional planning systems that attempt to enumerate all work upfront, Longshot agents use iterative discovery:

Sprint-based planning: Each planning iteration produces only what can be confidently specified given current knowledge
Feedback-driven adaptation: Completed work informs the next planning iteration
Progressive refinement: As foundations are built, agents can specify increasingly detailed tasks

2. Conversation-Based Memory

All planning agents operate as persistent conversations, not stateless function calls:

Agents maintain a scratchpad that survives across iterations
Context includes conversation history, previous responses, and accumulated knowledge
Agents reference their own prior analysis and build on it incrementally

3. Hierarchical Decomposition

Tasks flow through a multi-level decomposition pipeline:

Project Request
    ↓
[Root Planner] → Task (scope: 15 files)
    ↓
[Subplanner] → 4 Subtasks (scope: 3-4 files each)
    ↓
[Workers] → Implementations on isolated branches
    ↓
[Merge Queue] → Sequential merge with conflict resolution

4. Scope Containment

Scope inheritance is strictly enforced:

Subplanner subtasks must be subsets of the parent scope
Workers cannot modify files outside their task scope
Scope violations break the merge pipeline and cause untraceable conflicts
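
The subset rule can be checked mechanically before a subtask is dispatched. The following is a minimal sketch rather than Longshot's actual implementation: `findScopeViolations` and the example paths are hypothetical, and scopes are assumed to be lists of repo-relative file paths.

```typescript
// Hypothetical helper (not part of Longshot's documented API).
// Returns every file in the child scope that the parent scope does not allow.
function findScopeViolations(
  parentScope: readonly string[],
  childScope: readonly string[],
): string[] {
  const allowed = new Set(parentScope);
  return childScope.filter((file) => !allowed.has(file));
}

// Illustrative paths only.
const parentScope = ["src/auth/login.ts", "src/auth/session.ts", "src/auth/token.ts"];

// A subtask that stays inside the delegated slice produces no violations.
const containedViolations = findScopeViolations(parentScope, ["src/auth/login.ts"]);

// A subtask that reaches outside the slice is reported file by file.
const escapedViolations = findScopeViolations(parentScope, [
  "src/auth/token.ts",
  "src/db/users.ts",
]);

console.log(containedViolations, escapedViolations);
```

Returning the offending files, rather than a bare boolean, lets a planner name the exact paths when it rejects or rewrites a subtask.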

The Handoff Protocol

Communication between agents uses a structured handoff format:

interface Handoff {
  taskId: string;
  status: "complete" | "partial" | "blocked" | "failed";
  summary: string;  // What was done and how
  diff: string;     // Git diff of changes
  filesChanged: string[];
  concerns: string[];     // Risks, unexpected findings
  suggestions: string[];  // Ideas for follow-up work
  metrics: {
    linesAdded: number;
    linesRemoved: number;
    filesCreated: number;
    filesModified: number;
    tokensUsed: number;
    toolCallCount: number;
    durationMs: number;
  };
}
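
As an illustration of how the `linesAdded` and `linesRemoved` metrics relate to the `diff` field, the sketch below counts changed lines in a unified git diff. `countDiffLines` is a hypothetical helper, and the source does not say how Longshot actually computes these numbers.

```typescript
// Hypothetical helper: derives two Handoff metrics from a unified git diff.
interface DiffLineCounts {
  linesAdded: number;
  linesRemoved: number;
}

function countDiffLines(diff: string): DiffLineCounts {
  let linesAdded = 0;
  let linesRemoved = 0;
  let inHunk = false;

  for (const line of diff.split("\n")) {
    if (line.startsWith("diff --git ")) {
      inHunk = false; // a new file section begins with header lines (---, +++)
    } else if (line.startsWith("@@")) {
      inHunk = true; // hunk header: content lines follow
    } else if (inHunk && line.startsWith("+")) {
      linesAdded++;
    } else if (inHunk && line.startsWith("-")) {
      linesRemoved++;
    }
  }
  return { linesAdded, linesRemoved };
}

// Illustrative diff: one line replaced, one line added.
const sampleDiff = [
  "diff --git a/src/a.ts b/src/a.ts",
  "index 1111111..2222222 100644",
  "--- a/src/a.ts",
  "+++ b/src/a.ts",
  "@@ -1,3 +1,4 @@",
  " const a = 1;",
  "-const b = 2;",
  "+const b = 3;",
  "+const c = 4;",
  " export { a };",
].join("\n");

const sampleCounts = countDiffLines(sampleDiff);
console.log(sampleCounts);
```

Tracking whether the parser is inside a hunk keeps the `---` and `+++` file headers from being miscounted as removed or added lines.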

Why Handoffs Matter

Handoffs are the system’s primary feedback mechanism:

Planners learn from worker experiences: Concerns about missing utilities become the next sprint’s tasks
Quality signals propagate: Patterns workers establish inform subsequent task specifications
Failures guide adaptation: Blocked tasks produce targeted follow-ups, not wholesale retries
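
One way to picture this feedback loop is a small reducer over incoming handoffs. This is a sketch under assumptions: `collectFeedback`, `HandoffFeedbackInput`, and `PlanningFeedback` are hypothetical names, and only the `taskId`, `status`, and `concerns` fields come from the Handoff interface.

```typescript
type HandoffStatus = "complete" | "partial" | "blocked" | "failed";

// Only the Handoff fields this sketch needs.
interface HandoffFeedbackInput {
  taskId: string;
  status: HandoffStatus;
  concerns: string[];
}

// Hypothetical shape for what a planner might carry into its next sprint.
interface PlanningFeedback {
  needsFollowUp: string[]; // IDs of tasks that did not complete
  concerns: string[]; // de-duplicated, in first-seen order
}

function collectFeedback(handoffs: readonly HandoffFeedbackInput[]): PlanningFeedback {
  const needsFollowUp: string[] = [];
  const seenConcerns = new Set<string>();

  for (const handoff of handoffs) {
    if (handoff.status !== "complete") {
      needsFollowUp.push(handoff.taskId);
    }
    for (const concern of handoff.concerns) {
      seenConcerns.add(concern);
    }
  }
  return { needsFollowUp, concerns: Array.from(seenConcerns) };
}

// Illustrative handoffs.
const feedback = collectFeedback([
  { taskId: "t1", status: "complete", concerns: ["No shared date utility"] },
  { taskId: "t2", status: "blocked", concerns: ["No shared date utility", "Auth types missing"] },
]);
console.log(feedback);
```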

Tools and Capabilities

Planning Agents (Root Planner & Subplanner)

Read-only exploration tools:

read — Read file contents by path
grep — Search file contents with regex patterns
find — Find files by glob pattern
ls — List directory contents
bash — Execute read-only git commands (git log, git diff, git show, etc.)

Planning agents use these tools to explore the codebase before producing task specifications, ensuring accurate scoping based on actual code structure.
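
To make the "read-only git commands" restriction concrete, here is a minimal sketch of an allowlist check. It is an assumption about how such a guard could look, not Longshot's implementation: `isReadOnlyGitCommand` is hypothetical, and the allowlist contains only the three subcommands the text names (the real list may be longer). A check like this is a convenience filter, not a security boundary.

```typescript
// Assumption: only the subcommands named in the text are treated as read-only.
const READ_ONLY_GIT_SUBCOMMANDS = new Set(["log", "diff", "show"]);

// Characters that would allow chaining, redirection, or substitution.
const SHELL_METACHARACTERS = /[;&|<>$`\\\n]/;

// Hypothetical guard: accepts "git <allowed-subcommand> ..." and nothing else.
function isReadOnlyGitCommand(command: string): boolean {
  if (SHELL_METACHARACTERS.test(command)) {
    return false;
  }
  const [program, subcommand] = command.trim().split(/\s+/);
  return (
    program === "git" &&
    subcommand !== undefined &&
    READ_ONLY_GIT_SUBCOMMANDS.has(subcommand)
  );
}

console.log(isReadOnlyGitCommand("git log --oneline -5"));
console.log(isReadOnlyGitCommand("git push origin main"));
```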

Worker Agents

Full coding capabilities via pi-coding-agent:

read — Read files
write — Create/overwrite files
edit — Apply precise edits to existing files
bash — Execute shell commands (git, npm, build tools)
grep — Search content with ripgrep
find — Find files by glob
ls — List directories

Workers run in isolated sandbox environments with full filesystem and tool access.

Reconciler Agent

Build/test execution:

Runs tsc --noEmit to check TypeScript compilation
Runs npm run build to verify bundling/build pipeline
Runs npm test to execute test suites
Scans for merge conflict markers
Calls an LLM to generate targeted fix tasks based on failure output
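
The conflict-marker scan can be sketched as a small line-oriented check. `hasConflictMarkers` is a hypothetical helper. It assumes git's default marker style, and requires the three markers to appear in order so that a lone `=======` line (for example, a text underline) is not flagged.

```typescript
// Hypothetical helper: detects an unresolved git merge conflict in file content.
function hasConflictMarkers(content: string): boolean {
  // 0: looking for "<<<<<<< ", 1: looking for "=======", 2: looking for ">>>>>>> "
  let stage = 0;

  for (const line of content.split("\n")) {
    if (stage === 0 && line.startsWith("<<<<<<< ")) {
      stage = 1;
    } else if (stage === 1 && line.startsWith("=======")) {
      stage = 2;
    } else if (stage === 2 && line.startsWith(">>>>>>> ")) {
      return true;
    }
  }
  return false;
}

// Illustrative file content with one unresolved conflict.
const conflicted = [
  "const port = 3000;",
  "<<<<<<< HEAD",
  "const host = 'localhost';",
  "=======",
  "const host = '0.0.0.0';",
  ">>>>>>> feature/bind-all",
].join("\n");

console.log(hasConflictMarkers(conflicted));
```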

The Scratchpad System

The scratchpad is the agent’s durable working memory:

Root Planner Scratchpad

Must track:

Goals & Specs: Full goal set from SPEC.md/FEATURES.json, coverage status
Current State: Iteration number, phase, what’s built/broken/in-progress
Sprint Reasoning: Why this batch of tasks, what’s deferred and why
Worker Intelligence: Patterns from handoffs, unresolved concerns

Subplanner Scratchpad

Must track:

Parent goal: The parent task’s acceptance criteria
Scope coverage: Which files are addressed/pending/deferred
Subtask status: completed/failed/in-progress for each subtask
Discoveries: Patterns learned from handoffs
What’s deferred: Parts held back and why
Concerns: Worker-reported issues needing follow-up

Why Scratchpads Matter

Survive context compaction: When conversation history grows large, older messages are compacted but the scratchpad persists
Prevent drift: Forces agents to maintain coherent understanding across iterations
Enable introspection: Agents can reference their own prior reasoning
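
The "survive context compaction" property can be illustrated with a sketch in which older messages are dropped while the scratchpad is always re-attached. Everything here is an assumption made for illustration: the `Message` shape, the `compactHistory` function, and the placeholder text are hypothetical, and the source does not describe Longshot's actual compaction strategy.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical compaction: keep the scratchpad and the most recent messages,
// and replace everything older with a short placeholder.
function compactHistory(
  history: readonly Message[],
  scratchpad: string,
  keepRecent: number,
): Message[] {
  // slice(-0) would return the whole array, so guard the zero case explicitly.
  const recent = keepRecent > 0 ? history.slice(-keepRecent) : [];
  const droppedCount = history.length - recent.length;

  const compacted: Message[] = [
    { role: "system", content: `Scratchpad:\n${scratchpad}` },
  ];
  if (droppedCount > 0) {
    compacted.push({
      role: "system",
      content: `[${droppedCount} earlier messages compacted]`,
    });
  }
  return compacted.concat(recent);
}

// Illustrative history of five messages, keeping the last two.
const history: Message[] = [1, 2, 3, 4, 5].map((n) => ({
  role: "assistant" as const,
  content: `message ${n}`,
}));
const compacted = compactHistory(history, "Iteration 4: auth module built", 2);
console.log(compacted.map((m) => m.content));
```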

Execution Model

1. Planning Loop (Root Planner)

while (!done && iteration < maxIterations) {
  // Collect completed handoffs
  collectCompletedHandoffs();

  // Decide if planning is needed
  if (hasCapacity && (firstIteration || enoughHandoffs || noActiveWork)) {
    // Read current repo state
    const repoState = await readRepoState();

    // Prompt LLM for next batch of tasks
    const tasks = await plan(request, repoState, newHandoffs);

    // Dispatch tasks to workers or subplanners
    dispatchTasks(tasks);
  }

  await sleep(LOOP_SLEEP_MS);
}

Key characteristics:

Triggers replanning when:
- First iteration (initial planning)
- N handoffs received since the last plan (default: 3)
- No active work remains but the project is incomplete
Uses delta optimization: Only sends changed repo state in follow-ups (saves ~40K chars/iteration)
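
The replanning triggers above can be expressed as a pure predicate. This is a sketch, not the actual planner code: `PlanningLoopState` and `shouldReplan` are hypothetical names, and the field names are assumptions chosen to mirror the condition in the loop.

```typescript
// Hypothetical snapshot of the values the planning loop inspects.
interface PlanningLoopState {
  hasCapacity: boolean;
  isFirstIteration: boolean;
  handoffsSinceLastPlan: number;
  activeTaskCount: number;
  projectComplete: boolean;
}

// Default taken from the text: replan after 3 handoffs.
const DEFAULT_HANDOFF_THRESHOLD = 3;

function shouldReplan(
  state: PlanningLoopState,
  handoffThreshold: number = DEFAULT_HANDOFF_THRESHOLD,
): boolean {
  if (!state.hasCapacity) {
    return false; // no free slots, so new tasks could not be dispatched anyway
  }
  const enoughHandoffs = state.handoffsSinceLastPlan >= handoffThreshold;
  const noActiveWork = state.activeTaskCount === 0 && !state.projectComplete;
  return state.isFirstIteration || enoughHandoffs || noActiveWork;
}

// Illustrative state: two handoffs so far, work still in flight.
const midSprint: PlanningLoopState = {
  hasCapacity: true,
  isFirstIteration: false,
  handoffsSinceLastPlan: 2,
  activeTaskCount: 4,
  projectComplete: false,
};
console.log(shouldReplan(midSprint));
console.log(shouldReplan({ ...midSprint, handoffsSinceLastPlan: 3 }));
```

Keeping the decision in a pure function makes each trigger easy to unit-test without running the loop.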

2. Decomposition (Subplanner)

Complex tasks (scope > 4 files) are routed through a subplanner:

async function decomposeAndExecute(parentTask, depth) {
  // Collect subtasks from every iteration so the final aggregation can see them
  const allSubtasks = [];

  // Iterative planning loop (similar to root planner)
  while (iteration < maxIterations) {
    if (needsPlan) {
      const subtasks = await plan(parentTask, repoState, handoffs);
      allSubtasks.push(...subtasks);

      // Dispatch subtasks (may recurse if still complex)
      for (const subtask of subtasks) {
        if (shouldDecompose(subtask, depth + 1)) {
          // Recursive decomposition (max depth: 3)
          await decomposeAndExecute(subtask, depth + 1);
        } else {
          // Atomic subtask → direct to worker
          await workerPool.assignTask(subtask);
        }
      }
    }
  }

  // Aggregate subtask handoffs into parent handoff
  return aggregateHandoffs(parentTask, allSubtasks, handoffs);
}

Key characteristics:

Maximum depth: 3 levels of recursive decomposition
Scope threshold: Tasks with 4 or fewer files go directly to workers
Atomic bailout: If the LLM returns no subtasks on the first iteration, the task goes to a worker as-is
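
The routing decision can be sketched as a predicate over scope size and depth. This is an illustration under assumptions: the source names `shouldDecompose` but does not show its body, so the implementation, the `ScopedTask` shape, and the convention that `depth` counts subplanner levels starting at 1 are all guesses.

```typescript
// Values taken from the text: tasks with more than 4 files are decomposed,
// up to 3 levels deep.
const SCOPE_THRESHOLD = 4;
const MAX_DEPTH = 3;

// Hypothetical minimal task shape.
interface ScopedTask {
  id: string;
  scope: string[]; // repo-relative file paths
}

// Assumed convention: depth is the level of the subplanner that would
// handle the task (1 = directly below the root planner).
function shouldDecompose(task: ScopedTask, depth: number): boolean {
  return task.scope.length > SCOPE_THRESHOLD && depth <= MAX_DEPTH;
}

// Illustrative tasks.
const makeTask = (id: string, fileCount: number): ScopedTask => ({
  id,
  scope: Array.from({ length: fileCount }, (_, i) => `src/file${i}.ts`),
});

console.log(shouldDecompose(makeTask("large", 15), 1));
console.log(shouldDecompose(makeTask("small", 3), 1));
```

Under this convention, a task that is still too large at depth 4 goes to a worker anyway, which bounds the recursion.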

3. Worker Execution (Sandbox)

Workers run in isolated Modal sandboxes:

// 1. Initialize Pi agent session with full tool suite
const { session } = await createAgentSession({
  cwd: WORK_DIR,
  model,
  tools: [read, write, edit, bash, grep, find, ls],
});

// 2. Prompt agent with task details
await session.prompt(buildTaskPrompt(task));

// 3. Extract metrics
const stats = session.getSessionStats();
const tokensUsed = stats.tokens.total;

// 4. Safety-net commit (if agent didn't commit)
safeExec("git add -A", WORK_DIR);
safeExec(`git commit -m "feat(${task.id}): auto-commit"`, WORK_DIR);

// 5. Extract diff and build handoff
const diff = safeExec(`git diff ${startSha} --no-color`, WORK_DIR);
return buildHandoff(task, diff, stats);

Key characteristics:

Isolated environment: Each worker gets a fresh sandbox with cloned repo
Full Pi capabilities: All 7 Pi tools (not just the limited 4-tool codingTools set)
Safety-net commit: Uncommitted changes are auto-committed before diff extraction
Post-agent build check: Runs tsc --noEmit to detect compile failures early

4. Reconciliation Loop

The reconciler runs periodically (default interval: 5 minutes, adaptive):

async function sweep() {
  // Run build checks
  const tscResult = await runCommand(["npx", "tsc", "--noEmit"]);
  const buildResult = await runCommand(["npm", "run", "build"]);
  const testResult = await runCommand(["npm", "test"]);
  
  // Scan for conflict markers
  const conflictFiles = await findConflictMarkers();

  if (allGreen) {
    return { fixTasks: [] };
  }

  // Call LLM to generate fix tasks from error output
  const fixTasks = await generateFixTasks({
    buildOutput,
    testOutput,
    conflictFiles,
    recentCommits,
  });

  return { fixTasks };
}

Adaptive interval:

Starts at maxIntervalMs (5 minutes)
Drops to minIntervalMs (1 minute) when errors detected
Returns to maxIntervalMs after 3 consecutive green sweeps
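
The adaptive interval behaves like a small state machine. The sketch below follows the three rules above. `SweepSchedule` and `nextSchedule` are hypothetical names, and the constants use the defaults given in the text.

```typescript
const MIN_INTERVAL_MS = 60 * 1000; // 1 minute
const MAX_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes
const GREEN_SWEEPS_TO_RELAX = 3;

// Hypothetical scheduler state.
interface SweepSchedule {
  intervalMs: number;
  consecutiveGreen: number;
}

const initialSchedule: SweepSchedule = {
  intervalMs: MAX_INTERVAL_MS,
  consecutiveGreen: 0,
};

// Returns the schedule to use after a sweep finishes.
function nextSchedule(current: SweepSchedule, sweepWasGreen: boolean): SweepSchedule {
  if (!sweepWasGreen) {
    // Any failure tightens the interval and resets the green streak.
    return { intervalMs: MIN_INTERVAL_MS, consecutiveGreen: 0 };
  }
  const consecutiveGreen = current.consecutiveGreen + 1;
  const intervalMs =
    consecutiveGreen >= GREEN_SWEEPS_TO_RELAX ? MAX_INTERVAL_MS : current.intervalMs;
  return { intervalMs, consecutiveGreen };
}

// One failing sweep followed by three green sweeps.
const afterRed = nextSchedule(initialSchedule, false);
const afterGreen1 = nextSchedule(afterRed, true);
const afterGreen2 = nextSchedule(afterGreen1, true);
const afterGreen3 = nextSchedule(afterGreen2, true);
console.log(afterRed.intervalMs, afterGreen2.intervalMs, afterGreen3.intervalMs);
```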

Extension Points

1. Custom System Prompts

Each agent type accepts a custom system prompt:

const planner = new Planner(
  config,
  plannerConfig,
  taskQueue,
  workerPool,
  mergeQueue,
  monitor,
  customSystemPrompt,  // ← Your prompt
  subplanner,
);

Prompts are loaded from prompts/*.md files:

prompts/root-planner.md
prompts/subplanner.md
prompts/worker.md
prompts/reconciler.md

2. Agent Callbacks

All agents expose lifecycle hooks:

planner.onTaskCreated((task) => {
  console.log(`New task: ${task.id}`);
});

planner.onTaskCompleted((task, handoff) => {
  console.log(`Completed: ${task.id} (${handoff.status})`);
});

planner.onIterationComplete((iteration, tasks, handoffs) => {
  console.log(`Iteration ${iteration}: ${tasks.length} tasks`);
});

planner.onError((error) => {
  console.error(`Planning error: ${error.message}`);
});

3. Dependency Injection

Planner and Reconciler accept dependency injection for testing:

interface PlannerDeps {
  createPlannerPiSession: typeof createPlannerPiSession;
  cleanupPiSession: typeof cleanupPiSession;
  parsePlannerResponse: typeof parsePlannerResponse;
  readRepoState: typeof readRepoState;
  sleep: typeof sleep;
  slugifyForBranch: typeof slugifyForBranch;
  now: () => number;
}

const planner = new Planner(
  config,
  plannerConfig,
  taskQueue,
  workerPool,
  mergeQueue,
  monitor,
  systemPrompt,
  subplanner,
  customDeps,  // ← Inject test doubles
);
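
As an example of a test double for the `now` and `sleep` entries, the sketch below implements a fake clock that advances instantly instead of waiting. `createFakeClock` and the `Clock` interface are hypothetical. The `sleep` signature `(ms: number) => Promise<void>` is an assumption, since the source only gives its type as `typeof sleep`.

```typescript
// Hypothetical pair matching the `now` and `sleep` entries of PlannerDeps.
interface Clock {
  now: () => number;
  sleep: (ms: number) => Promise<void>; // assumed signature
}

// Fake clock: sleep() advances virtual time immediately and resolves at once,
// so loop-based code can be tested without real delays.
function createFakeClock(startMs: number = 0): Clock {
  let currentMs = startMs;
  return {
    now: () => currentMs,
    sleep: (ms: number) => {
      currentMs += ms;
      return Promise.resolve();
    },
  };
}

// Illustrative usage: two virtual sleeps, no real waiting.
const clock = createFakeClock(1000);
void clock.sleep(500);
void clock.sleep(250);
console.log(clock.now());
```

A test could then pass `{ ...defaultDeps, now: clock.now, sleep: clock.sleep }` as `customDeps`, where `defaultDeps` stands in for whatever production dependencies the project provides.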

Next Steps

Root Planner — Detailed root planner behavior and prompt engineering
Subplanner — Recursive decomposition and scope management
Worker — Implementation agent design and sandbox execution
Reconciler — Build/test monitoring and fix task generation
