Overview
Recursive Language Models (RLMs) solve a fundamental problem: LLMs have fixed context windows, but real-world inputs can be arbitrarily long. Instead of cramming the full input into the prompt, an RLM gives the model a REPL and a symbolic handle to the input. The model writes JavaScript to explore, chunk, and recursively process the data.

The Core Idea
In a standard LLM call, the entire input must fit inside the model's context window. An RLM instead exposes the input as a JavaScript variable (`context`) and lets the model decide what to read, chunk, or delegate.

Quick Start
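The idea can be shown with a toy, self-contained simulation. The "model" here is a scripted list of REPL turns, and the harness API is simplified; the real harness streams events and calls an actual LLM.

```javascript
// Toy sketch of the RLM idea (simplified; the real harness API differs).
// A scripted "model" explores a long input through a REPL instead of
// receiving it all in the prompt.

const context = "x".repeat(50_000) + " SECRET=42"; // input too big to prompt

// Scripted model turns: each is a JavaScript snippet the "LLM" would emit.
const modelTurns = [
  "console.log(context.length)",               // orient: how big is it?
  "console.log(context.slice(-20))",           // peek at the tail
  "FINAL(context.match(/SECRET=(\\d+)/)[1])",  // answer and stop
];

async function runToyRLM() {
  const scope = {};   // persistent state across turns
  let final;
  const FINAL = (answer) => { final = answer; };

  for (const code of modelTurns) {
    // Execute the model's code with the REPL bindings in scope.
    const fn = new Function(
      "context", "scope", "FINAL",
      `return (async () => { ${code} })()`
    );
    await fn(context, scope, FINAL);
    if (final !== undefined) break; // FINAL() ends the loop
  }
  return final;
}

runToyRLM().then((answer) => console.log("answer:", answer)); // prints "answer: 42"
```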
Configuration
The RLM harness accepts several configuration options.

Choosing Values
- `maxIterations`: 10 works well. Simple tasks finish in 1-3 turns, complex ones take 5-8.
- `maxStdoutLength`: 4000 (default) prevents context overflow from debug output.
- `metadataPrefixLength`: 200 gives enough orientation. Increase if the beginning matters.
- `maxDepth`: 2 allows `llm_query` → child `llm_query` → flat call. Prevents infinite recursion.
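Put together, the recommended values above might look like the following options object (the option names come from this page; the exact configuration shape may differ):

```javascript
// Hypothetical harness options using the values recommended above.
const rlmOptions = {
  maxIterations: 10,         // REPL turns before giving up
  maxStdoutLength: 4000,     // truncate noisy output to protect context
  metadataPrefixLength: 200, // chars of the input shown up front for orientation
  maxDepth: 2,               // llm_query -> child llm_query -> flat call
};
```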
The REPL Environment
The model has access to:

| Name | Type | Description |
|---|---|---|
| `context` | `string` | The user's input as a plain JavaScript string |
| `llm_query(prompt, context?)` | `(string, string?) => Promise<string>` | Spawn a sub-agent with its own REPL. `prompt` is the task, `context` is optional data |
| `exec(command, timeout?)` | `(string, number?) => Promise<{ stdout, stderr, exitCode }>` | Execute a shell command |
| `FINAL(answer)` | `(unknown) => void` | Emit the final answer and stop |
| `console.log(...args)` | `(...unknown[]) => void` | Print to stdout (shown back to the model) |
| `scope` | `Record<string, unknown>` | Persistent state across REPL turns |
Variables Persist
Assign to `scope` to preserve state:
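For example, the model might write the following across two REPL turns. In the real harness, `scope` and `context` are injected; they are declared here so the snippet runs on its own.

```javascript
// Simulated REPL bindings (injected by the real harness).
const scope = {};
const context = "line 1\nline 2\nline 3";

// Turn 1: stash derived data for later turns.
scope.lines = context.split("\n");
console.log(scope.lines.length); // 3

// Turn 2 (a later response): scope.lines is still there.
console.log(scope.lines[0]); // "line 1"
```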
Example: Document Summarization
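A sketch of map-reduce summarization the model might write in its REPL. `llm_query` is provided by the harness; it is stubbed here so the sketch runs standalone (a real call would spawn a child RLM session).

```javascript
// Long input, simulated; in the real REPL this is the injected `context`.
const context = "Chapter A. ".repeat(300) + "Chapter B. ".repeat(300);

// Stub of the injected llm_query (real calls spawn a child RLM).
const llm_query = async (prompt, ctx) => `summary of ${ctx.length} chars`;

async function summarize() {
  const chunkSize = 2000;
  const chunks = [];
  for (let i = 0; i < context.length; i += chunkSize) {
    chunks.push(context.slice(i, i + chunkSize));
  }
  // Map: summarize each chunk in a child session.
  const partials = await Promise.all(
    chunks.map((c) => llm_query("Summarize this chunk.", c))
  );
  // Reduce: combine partial summaries into one answer.
  return llm_query("Combine these summaries.", partials.join("\n"));
}

summarize().then((summary) => console.log(summary));
```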
Example: Shell Command Execution
The REPL includes `exec()` for running shell commands:
The Inference Loop
The RLM harness runs this loop: prompt the model, extract a single code block from the response, execute it in the REPL, and feed the (truncated) stdout back as the next message, repeating until `FINAL()` is called or `maxIterations` is reached.

Event Types
| Event | Description | Fields |
|---|---|---|
| `harness_start` | RLM session started | `runId`, `depth?`, `maxIterations?` |
| `text` | Streamed LLM response or final answer | `id`, `runId`, `content` |
| `reasoning` | Streamed reasoning tokens | `id`, `runId`, `content` |
| `repl_input` | Code about to execute | `id`, `runId`, `code`, `iteration?` |
| `repl_progress` | Live REPL output | `id`, `runId`, `chunk`, `stream` (`"stdout"`/`"stderr"`) |
| `repl_output` | Complete execution result | `id`, `runId`, `stdout`, `error?`, `done`, `iteration?`, `durationMs?`, `truncated?` |
| `usage` | Token usage | `runId`, `inputTokens`, `outputTokens` |
| `error` | Error (e.g., code extraction failed) | `runId`, `message` |
| `harness_end` | Session complete | `runId`, `reason?`, `iterations?`, `totalUsage?` |
| `relay` | Permission request for `exec()` | `id`, `runId`, `kind: "permission"`, `tool: "exec"`, `params` |
Recursive Queries
The `llm_query()` function spawns child RLM sessions:
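For example, a parent REPL turn might delegate part of the input to a child session. `llm_query` is injected by the harness; it is stubbed here so the snippet runs on its own.

```javascript
// Stub of the injected llm_query (a real call spawns a child RLM harness).
const llm_query = async (prompt, ctx) => `[child saw ${ctx}]`;
const context = "first half|second half"; // simulated injected input

(async () => {
  const [left, right] = context.split("|");
  const leftAnswer = await llm_query("Analyze this part.", left);
  console.log(typeof leftAnswer, leftAnswer); // the child's answer is a plain string
})();
```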
Each `llm_query` call:

- Spawns a child RLM harness with `depth = parent.depth + 1`
- Gives the child its own REPL with the provided `context`
- Returns the child's final answer as a string

Child session events carry a `parentId` to preserve the call graph.
Depth Limits
Set `maxDepth` to prevent infinite recursion: when `depth >= maxDepth`, `llm_query` falls back to a flat one-shot call (no REPL).
Separate Sub-Harness
Use a cheaper model for `llm_query` calls:
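One plausible shape for this is a nested options object. The option names here (`model`, `subHarness`) are assumptions for illustration; check the actual API.

```javascript
// Hypothetical: separate options for child sessions spawned by llm_query,
// so recursion runs on a cheaper model than the root harness.
const options = {
  model: "big-model",                                    // root session
  subHarness: { model: "small-cheap-model", maxIterations: 5 }, // children
};
```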
Permission Gating for exec()
When `permissions` are provided, `exec()` calls are checked before they run. Without `permissions`, `exec()` runs freely (backward compatible).
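A sketch of what a permission check could look like. The hook name and shape are assumptions based on the `relay` event described in the event table; the real API may differ.

```javascript
// Hypothetical permissions hook: approve or deny each exec() call.
const permissions = {
  async requestPermission({ tool, params }) {
    // e.g., only allow read-only commands
    return tool === "exec" && /^(ls|cat|grep)\b/.test(params.command);
  },
};

permissions
  .requestPermission({ tool: "exec", params: { command: "ls -la" } })
  .then((ok) => console.log(ok)); // true
```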
Error Recovery
REPL errors don't crash the session: the error is returned in the `repl_output` event and shown back to the model, which can correct its code on the next turn.

Stopping Conditions
- `FINAL()` called: the model emits a final answer and the loop breaks
- `maxIterations` reached: the loop stops with no final answer (handle gracefully)
- Error: yielded as an `error` event, and the session ends
Code Extraction
The harness expects exactly one code block per LLM response; if extraction fails, an `error` event is yielded.
Debugging
Enable logging to trace harness events during development.

Use Cases
Document Processing
Summarize, extract, or analyze documents longer than the context window
Codebase Analysis
Search, refactor, or audit large codebases by chunking and delegating
Log Analysis
Parse and aggregate insights from massive log files
Data Processing
Transform, filter, or aggregate large datasets programmatically
Limitations
- Model capability: RLM requires a model that writes correct JavaScript. Works best with capable models (GPT-4, Claude 3.5, DeepSeek, Kimi).
- Iteration budget: Complex tasks may hit
maxIterations. Increase if needed. - REPL sandbox: Limited to
AsyncFunction. No access torequire,process, or other Node/Bun globals.
Next Steps
Multi-Agent
Combine RLM with the orchestrator for concurrent agents
Client Rendering
Render RLM events in a UI
