Claude Code is a REPL-based agentic loop. You type a message, the LLM responds, Claude invokes tools, results flow back to the LLM, and the cycle continues until the task is complete or you interrupt it. The entire session is orchestrated by a single core class — QueryEngine — backed by a React/Ink terminal UI and a rich tool registry.

The QueryEngine

QueryEngine (src/QueryEngine.ts, ~46K lines) is the engine that drives every conversation. One instance is created per conversation; each submitMessage() call starts a new turn within the same session while preserving state — messages, file cache, token usage — across turns. Key responsibilities:
| Concern | Detail |
| --- | --- |
| Streaming responses | Streams chunks from the Anthropic API as they arrive, updating the UI progressively |
| Tool-call loop | After each streamed response, executes all requested tool calls, then re-queries the model with the results |
| Thinking mode | Configurable ThinkingConfig enables extended thinking (budget tokens) for complex tasks |
| Retry logic | Retryable API errors (rate limits, transient failures) are caught via categorizeRetryableAPIError and retried automatically |
| Token counting | Tracks cumulative usage via accumulateUsage / updateUsage; exposes cost via getTotalCost() and getModelUsage() |
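The retry path can be sketched as a small wrapper around the API call. This is an illustrative sketch, not the actual implementation: the body of categorizeRetryableAPIError, the backoff constants, and the queryWithRetry helper are all assumptions made for the example.

```typescript
// Minimal sketch of retry-on-transient-error, assuming a categorizer that
// marks rate limits (429) and server errors (5xx) as retryable.
type ErrorCategory = { retryable: boolean; retryAfterMs?: number }

function categorizeRetryableAPIError(err: unknown): ErrorCategory {
  const status = (err as { status?: number }).status
  if (status === 429) return { retryable: true, retryAfterMs: 1000 }
  if (status !== undefined && status >= 500) return { retryable: true }
  return { retryable: false }
}

async function queryWithRetry<T>(
  attempt: () => Promise<T>,
  maxRetries = 3,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await attempt()
    } catch (err) {
      const category = categorizeRetryableAPIError(err)
      if (!category.retryable || i >= maxRetries) throw err
      // Exponential backoff, honoring a server-provided retry-after hint.
      const delay = category.retryAfterMs ?? 250 * 2 ** i
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
}
```

Non-retryable errors (for example, an invalid request) propagate immediately; only transient failures pay the backoff cost.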

The Tool-Call Loop

Every turn follows this sequence:
1. User message submitted: submitMessage() is called with the user's input. The message is normalized and appended to mutableMessages.
2. System prompt assembled: fetchSystemPromptParts() and getUserContext() build the full system prompt, including memory content, working directory, and any custom prompts.
3. API query: The query() function streams a response from the Anthropic API. Streaming chunks are yielded to the UI in real time.
4. Tool calls executed: If the response contains tool_use blocks, each tool is run through the permission check, executed, and its result appended as a tool_result message.
5. Loop or return: If tools were called, the model is queried again with the results. This repeats until the model emits a final text response with no tool calls (or maxTurns is reached).
```
User input
    │
    ▼
QueryEngine.submitMessage()
    │
    ▼
fetchSystemPromptParts() + getUserContext()
    │
    ▼
query() ──► Anthropic API (streaming)
    │
    ├── Text block ──► render to terminal
    │
    └── tool_use blocks
            │
            ▼
        checkPermissions()
            │
            ▼
        tool.call()
            │
            ▼
        tool_result messages
            │
            ▼
        re-query ──► (loop back to API)
```
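The loop above can be sketched in a few dozen lines. Everything here is illustrative rather than the real QueryEngine API: the model call is injected as a plain function, permission checks are elided, and tool results are folded into a text block for simplicity.

```typescript
// Illustrative sketch of the tool-call loop. `queryModel` and the tool
// registry are stand-ins for the streaming API client and tool system.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }

type Message = { role: 'user' | 'assistant'; content: ContentBlock[] }

async function runTurn(
  messages: Message[],
  queryModel: (msgs: Message[]) => Promise<ContentBlock[]>,
  tools: Record<string, (input: unknown) => Promise<string>>,
  maxTurns = 10,
): Promise<string> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const blocks = await queryModel(messages)
    messages.push({ role: 'assistant', content: blocks })

    const toolUses = blocks.filter((b) => b.type === 'tool_use')
    if (toolUses.length === 0) {
      // Final text response with no tool calls: the loop ends here.
      const text = blocks.find((b) => b.type === 'text')
      return text?.type === 'text' ? text.text : ''
    }

    // Run each requested tool and feed results back as a user message,
    // then loop so the model sees them on the next query.
    const results: ContentBlock[] = []
    for (const use of toolUses) {
      if (use.type !== 'tool_use') continue
      const output = await tools[use.name](use.input)
      results.push({ type: 'text', text: `tool_result(${use.id}): ${output}` })
    }
    messages.push({ role: 'user', content: results })
  }
  throw new Error('maxTurns reached')
}
```

Because the mutated messages array persists between calls, each subsequent turn sees the full transcript, which matches how state carries across submitMessage() calls.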

Parallel Startup Optimization

Startup time is minimized by firing side-effects before heavy module evaluation begins in main.tsx:
```typescript
// main.tsx — fired before other imports resolve
startMdmRawRead()       // prefetch MDM / managed-device settings
startKeychainPrefetch() // warm macOS Keychain reads
```
These run concurrently with the Commander.js CLI parse and React/Ink renderer initialization, so by the time the REPL is ready the slowest I/O (keychain, MDM) has usually already completed.
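The fire-early pattern can be demonstrated with timers standing in for the slow reads. The names below mirror the ones above, but the bodies are simulated; this is a sketch of the concurrency shape, not the real startup code.

```typescript
// Sketch of the fire-early pattern: kick off slow I/O at module load and
// only await it once the result is actually needed.
const slowRead = (label: string, ms: number) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(label), ms))

// Fired immediately, before the rest of startup runs.
const mdmPromise = slowRead('mdm-settings', 50)
const keychainPromise = slowRead('keychain-token', 50)

async function startup() {
  // CLI parsing / renderer init runs while the reads above are in flight.
  await slowRead('cli-parse-and-render', 60)
  // By now the prefetches have usually resolved, so awaiting is cheap.
  const [mdm, token] = await Promise.all([mdmPromise, keychainPromise])
  return { mdm, token }
}
```

Total startup cost is roughly the maximum of the concurrent operations rather than their sum.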

Lazy Loading

Two large native modules are deferred via dynamic import() until they are actually needed:
| Module | Approximate size | When loaded |
| --- | --- | --- |
| OpenTelemetry | ~400 KB | First telemetry event |
| gRPC | ~700 KB | First gRPC connection (e.g., coordinator mode) |
This keeps the cold-start time low even on slower machines.
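The deferred-import pattern looks roughly like the following sketch, with node:zlib standing in for a heavy module such as OpenTelemetry or gRPC; the helper names are illustrative.

```typescript
// Sketch of deferred loading: the module is imported on first use, and the
// in-flight promise is cached so concurrent callers share one import.
let zlibPromise: Promise<typeof import('node:zlib')> | null = null

function loadZlib() {
  zlibPromise ??= import('node:zlib')
  return zlibPromise
}

async function gzipFirstUse(data: string): Promise<number> {
  const zlib = await loadZlib() // pays the import cost only once
  return zlib.gzipSync(Buffer.from(data)).length
}
```

Caching the promise (rather than the resolved module) avoids a race where two early callers trigger two imports.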

Feature Flags

Claude Code uses Bun’s bun:bundle feature-flag mechanism for dead-code elimination. Inactive flags are completely stripped at build time — the code doesn’t just branch, it’s removed from the bundle entirely.
```typescript
import { feature } from 'bun:bundle'

// Inactive code is completely stripped at build time
const voiceCommand = feature('VOICE_MODE')
  ? require('./commands/voice/index.js').default
  : null
```
Notable flags found in the source:
| Flag | Description |
| --- | --- |
| PROACTIVE | Enables proactive / background-agent mode |
| KAIROS | Long-lived assistant mode with append-only daily memory logs |
| BRIDGE_MODE | IDE bridge (VS Code, JetBrains) communication layer |
| DAEMON | Persistent daemon process for faster subsequent launches |
| VOICE_MODE | Voice input support |
| AGENT_TRIGGERS | Scheduled cron triggers and remote triggers for agents |
| MONITOR_TOOL | Monitoring tool |
| EXTRACT_MEMORIES | Background memory-extraction agent |
| COORDINATOR_MODE | Multi-agent coordinator |
| BASH_CLASSIFIER | Auto-classifier for Bash permission decisions |
| TEAMMEM | Team-shared memory sync |

Context Collection

Before every API call the system prompt is assembled from two sources:
  • getSystemContext() — static environment facts: OS, shell, working directory, date/time, Claude Code version, available tools list, and project-level CLAUDE.md contents.
  • getUserContext() — dynamic per-turn context: memory prompt (from loadMemoryPrompt()), coordinator context, and any appended system prompt provided via config.
The memory prompt is injected into the user context rather than the system prompt to keep the system prompt cache prefix stable across turns, which reduces API costs.
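The split can be sketched as follows. The function bodies here are illustrative stand-ins for the real getSystemContext() / getUserContext(); the point is the shape: the system prompt stays byte-stable across turns so the cache prefix holds, while per-turn material rides in the user message.

```typescript
// Sketch of the two-part context split. Dynamic per-turn content is kept
// out of the system prompt to preserve the prompt-cache prefix.
function getSystemContext(): string {
  // Static facts only: these rarely change within a session.
  return [
    `os: ${process.platform}`,
    `cwd: ${process.cwd()}`,
    'tools: Bash, Read, Edit, ...',
  ].join('\n')
}

function getUserContext(memoryPrompt: string): string {
  // Dynamic per-turn context, deliberately excluded from the system prompt.
  return memoryPrompt ? `<memory>\n${memoryPrompt}\n</memory>` : ''
}

function buildMessages(userInput: string, memoryPrompt: string) {
  const system = getSystemContext()
  const userContext = getUserContext(memoryPrompt)
  const content = userContext ? `${userContext}\n\n${userInput}` : userInput
  return { system, messages: [{ role: 'user' as const, content }] }
}
```

If the memory prompt lived in the system prompt instead, any memory change would invalidate the cached prefix on the next API call.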
