QueryEngine — backed by a React/Ink terminal UI and a rich tool registry.
The QueryEngine
QueryEngine (src/QueryEngine.ts, ~46K lines) is the engine that drives every conversation. One instance is created per conversation; each submitMessage() call starts a new turn within the same session while preserving state — messages, file cache, token usage — across turns.
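The lifecycle above can be sketched as a tiny class. The names mirror the text (mutableMessages, submitMessage), but the shape is illustrative only; the real method is async and streams responses.

```typescript
// Illustrative sketch only, not the real implementation.
type Message = { role: "user" | "assistant"; content: string };

class QueryEngineSketch {
  private mutableMessages: Message[] = []; // preserved across turns
  private totalInputChars = 0;             // stand-in for token accounting

  submitMessage(input: string): void {
    this.mutableMessages.push({ role: "user", content: input });
    // ...query the model, run tools, then append the final reply...
    this.mutableMessages.push({ role: "assistant", content: `reply to: ${input}` });
    this.totalInputChars += input.length;
  }

  get history(): readonly Message[] { return this.mutableMessages; }
}

const engine = new QueryEngineSketch(); // one instance per conversation
engine.submitMessage("first turn");
engine.submitMessage("second turn");
console.log(engine.history.length); // prints: 4 (state survives across turns)
```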
Key responsibilities:
| Concern | Detail |
|---|---|
| Streaming responses | Streams chunks from the Anthropic API as they arrive, updating the UI progressively |
| Tool-call loop | After each streamed response, executes all requested tool calls, then re-queries the model with the results |
| Thinking mode | Configurable ThinkingConfig enables extended thinking (budget tokens) for complex tasks |
| Retry logic | Retryable API errors (rate limits, transient failures) are caught via categorizeRetryableAPIError and retried automatically |
| Token counting | Tracks cumulative usage via accumulateUsage / updateUsage; exposes cost via getTotalCost() and getModelUsage() |
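The tool-call loop and retry rows of this table can be sketched together. Everything here is a stand-in: categorizeRetryableAPIError is named in the text, but its signature and the helpers around it are invented for illustration.

```typescript
// Invented stand-ins for the real engine internals.
type ToolUse = { name: string; input: unknown };
type ModelResponse = { text: string; toolUses: ToolUse[] };

// Assumed shape: classify an error as retryable (rate limit, transient).
function isRetryableSketch(err: unknown): boolean {
  return err instanceof Error && /rate.?limit|overloaded|timeout/i.test(err.message);
}

async function runTurn(
  queryModel: (messages: string[]) => Promise<ModelResponse>,
  runTool: (t: ToolUse) => Promise<string>,
  messages: string[],
): Promise<string> {
  for (;;) {
    let response: ModelResponse;
    try {
      response = await queryModel(messages);       // streamed in the real engine
    } catch (err) {
      if (isRetryableSketch(err)) continue;        // retry transient API errors
      throw err;
    }
    if (response.toolUses.length === 0) {
      return response.text;                        // no tools requested: turn is done
    }
    for (const tool of response.toolUses) {
      messages.push(`tool_result: ${await runTool(tool)}`); // re-query with results
    }
  }
}
```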
The Tool-Call Loop
Every turn follows this sequence:

1. User message submitted. submitMessage() is called with the user's input. The message is normalized and appended to mutableMessages.
2. System prompt assembled. fetchSystemPromptParts() and getUserContext() build the full system prompt, including memory content, working directory, and any custom prompts.
3. API query. The query() function streams a response from the Anthropic API. Streaming chunks are yielded to the UI in real time.
4. Tool calls executed. If the response contains tool_use blocks, each tool is run through the permission check, executed, and its result appended as a tool_result message.

Parallel Startup Optimization
Startup time is minimized by firing side effects in main.tsx before heavy module evaluation begins.
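A sketch of the pattern, with invented task names (the text does not list the actual early side effects):

```typescript
// Fire independent async work immediately, before heavy imports evaluate.
// The task bodies are placeholders for real startup work (settings reads,
// update checks, and so on).
function startEarlyTasks() {
  return {
    config: Promise.resolve().then(() => "config loaded"),
    updateCheck: Promise.resolve().then(() => "update checked"),
  };
}

const early = startEarlyTasks(); // kicked off first thing in main.tsx

// ...heavy module evaluation and UI setup happen here, in parallel
// with the tasks above...

async function main(): Promise<void> {
  // By the time the app needs them, the early tasks have had a head start.
  const [config, update] = await Promise.all([early.config, early.updateCheck]);
  console.log(config, update); // prints: config loaded update checked
}
main();
```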
Lazy Loading
Two large native modules are deferred via dynamic import() until they are actually needed:
| Module | Approximate size | When loaded |
|---|---|---|
| OpenTelemetry | ~400 KB | First telemetry event |
| gRPC | ~700 KB | First gRPC connection (e.g., coordinator mode) |
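The deferral pattern can be sketched as a memoized lazy loader: the factory runs once, on first use, and every later caller reuses the same promise. The module shape below is illustrative; the real deferred modules are the OpenTelemetry and gRPC bundles in the table above.

```typescript
// Memoized lazy loader: the factory runs once, on the first call.
function lazy<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => (cached ??= factory());
}

let loadCount = 0;
const getTelemetry = lazy(async () => {
  loadCount++;
  // In real code this would be: await import("@opentelemetry/api")
  return { record: (name: string) => `recorded:${name}` };
});

getTelemetry();         // first telemetry event triggers the load
getTelemetry();         // subsequent calls reuse the same promise
console.log(loadCount); // prints: 1
```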
Feature Flags
Claude Code uses Bun's bun:bundle feature-flag mechanism for dead-code elimination. Inactive flags are completely stripped at build time: the code doesn't just branch, it is removed from the bundle entirely.
| Flag | Description |
|---|---|
| PROACTIVE | Enables proactive / background-agent mode |
| KAIROS | Long-lived assistant mode with append-only daily memory logs |
| BRIDGE_MODE | IDE bridge (VS Code, JetBrains) communication layer |
| DAEMON | Persistent daemon process for faster subsequent launches |
| VOICE_MODE | Voice input support |
| AGENT_TRIGGERS | Scheduled cron triggers and remote triggers for agents |
| MONITOR_TOOL | Monitoring tool |
| EXTRACT_MEMORIES | Background memory-extraction agent |
| COORDINATOR_MODE | Multi-agent coordinator |
| BASH_CLASSIFIER | Auto-classifier for Bash permission decisions |
| TEAMMEM | Team-shared memory sync |
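A minimal sketch of how build-time elimination works. The flag object and function names are invented; in a real build the constant would be injected by the bundler (for example via Bun's --define) rather than written inline.

```typescript
// Inline const stands in for a value injected at build time.
const FLAGS = { PROACTIVE: false } as const;

function startProactiveAgent(): string {
  // When PROACTIVE is false at build time, this whole function becomes
  // dead code and can be dropped from the bundle.
  return "proactive agent running";
}

function bootModes(): string[] {
  const active: string[] = [];
  if (FLAGS.PROACTIVE) {
    // Constant folding proves this branch unreachable, so the bundler
    // removes it (and the only call to startProactiveAgent with it).
    active.push(startProactiveAgent());
  }
  return active;
}

console.log(bootModes()); // prints: []
```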
Context Collection
Before every API call the system prompt is assembled from two sources:

- getSystemContext() — static environment facts: OS, shell, working directory, date/time, Claude Code version, the available tools list, and project-level CLAUDE.md contents.
- getUserContext() — dynamic per-turn context: the memory prompt (from loadMemoryPrompt()), coordinator context, and any appended system prompt provided via config.
The memory prompt is injected into the user context rather than the system prompt to keep the system prompt cache prefix stable across turns, which reduces API costs.
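The cache-stability point can be shown concretely. This is a hedged sketch with invented names and placeholder environment facts: the system prompt stays byte-identical across turns, while per-turn memory rides in the user message.

```typescript
// Keep the system prompt identical across turns so the API's prompt
// cache can reuse its prefix; inject per-turn memory into the user
// message instead. All names and values here are illustrative.
interface Turn {
  system: string;
  messages: { role: "user"; content: string }[];
}

function buildTurn(userInput: string, memoryPrompt: string): Turn {
  const system = [
    "os: darwin",
    "shell: zsh",
    "cwd: /repo",
  ].join("\n"); // static facts only: unchanged turn to turn
  return {
    system,
    messages: [{ role: "user", content: `${memoryPrompt}\n\n${userInput}` }],
  };
}

const a = buildTurn("fix the bug", "memory: prefers tabs");
const b = buildTurn("add a test", "memory: prefers tabs; likes Bun");
console.log(a.system === b.system); // prints: true (cacheable prefix)
```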