# Why Monty?

Monty avoids the cost, latency, complexity, and general faff of using a full container-based sandbox for running LLM-generated code. Instead, it lets you safely run LLM-written Python code directly inside your agent, with startup times measured in single-digit microseconds, not hundreds of milliseconds.

## The Problem
Large Language Models can work faster, cheaper, and more reliably when they write Python (or JavaScript) code instead of relying on traditional tool calling. However, running untrusted LLM-generated code presents significant security challenges:

- **Containers are slow**: Docker startup takes ~195ms, far too slow for real-time agent interactions
- **Direct execution is dangerous**: running code via `exec()` or a subprocess gives it full access to the filesystem, network, and system resources
- **WASM solutions are complex**: options like Pyodide have slow cold starts (~2.8s) and complex setup requirements
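To make the second point concrete, here is a minimal CPython demonstration (the environment variable and the "LLM-generated" snippet are invented for illustration): anything passed to `exec()` runs with the host process's full authority, so nothing stops it from reading secrets or enumerating the filesystem.

```python
import os

# Untrusted "LLM-generated" code: it can read every environment variable
# and list the working directory, because exec() runs it with the host's
# full authority.
llm_code = """
import os
leaked = dict(os.environ)     # read every environment variable
listing = os.listdir('.')     # enumerate the working directory
"""

os.environ["API_KEY"] = "super-secret"  # a secret the agent process holds

namespace = {}
exec(llm_code, namespace)

print("API_KEY" in namespace["leaked"])  # True: the code saw the secret
```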
## What Monty Can Do

### Run Python Subset
Execute a reasonable subset of Python code — enough for your agent to express what it wants to do
### Secure by Default
Completely block access to the host environment: filesystem, env variables, and network access are all implemented via external function calls you control
### External Functions
Call functions on the host — only functions you explicitly give it access to
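The pattern can be sketched in plain Python (this is a hypothetical illustration, not Monty's actual API): the host registers an explicit allow-list of callables, and the sandboxed program can reach the outside world only through them.

```python
# Hypothetical sketch of the external-function pattern (not Monty's real
# API): only host-approved callables are visible to sandboxed code.
from typing import Any, Callable

class Sandbox:
    def __init__(self, external: dict[str, Callable[..., Any]]):
        # Only names registered here are visible to sandboxed code.
        self.external = dict(external)

    def run(self, code: str) -> dict[str, Any]:
        # An empty __builtins__ hides open(), __import__(), etc.; the only
        # globals are the host-approved external functions.
        namespace: dict[str, Any] = {"__builtins__": {}, **self.external}
        exec(code, namespace)
        return namespace

def get_weather(city: str) -> str:
    return f"{city}: 21C, sunny"   # stand-in for a real tool call

sandbox = Sandbox({"get_weather": get_weather})
result = sandbox.run("report = get_weather('Paris')")
print(result["report"])  # -> Paris: 21C, sunny
```

Note that stripping `__builtins__` in CPython is illustrative only and is not a real security boundary (well-known escapes exist); Monty sidesteps the problem entirely by being a separate interpreter with no host access in the first place.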
### Type Checking

Full support for modern Python type hints, with the `ty` type checker included in a single binary
### Snapshotting
Snapshot to bytes at external function calls, store interpreter state in a file or database, and resume later
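The shape of that workflow can be sketched with a toy state object (Monty's real snapshot format and API are different; the class and field names below are invented): execution pauses at an external call, the state is serialized to bytes that can live anywhere, then deserialized and resumed with the call's result.

```python
# Toy sketch of snapshot/resume (not Monty's actual format): a paused run
# is serialized to bytes, stored, then resumed with the external call's
# result filled in.
import pickle
from dataclasses import dataclass, field

@dataclass
class PausedRun:
    variables: dict = field(default_factory=dict)
    pending_call: str = ""          # external function awaiting an answer

    def snapshot(self) -> bytes:
        return pickle.dumps(self)   # bytes: write to a file or database

    @staticmethod
    def resume(blob: bytes, call_result) -> dict:
        state: PausedRun = pickle.loads(blob)
        state.variables[state.pending_call] = call_result
        return state.variables

run = PausedRun(variables={"city": "Paris"}, pending_call="weather")
blob = run.snapshot()               # persist, ship, or store these bytes

final = PausedRun.resume(blob, call_result="21C, sunny")
print(final)  # {'city': 'Paris', 'weather': '21C, sunny'}
```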
### Fast Startup

Start up extremely fast (<1μs from code to execution result), with runtime performance similar to CPython (generally between 5x faster and 5x slower)
### Multi-Language Support
Call from Rust, Python, or JavaScript — Monty has no dependencies on CPython, so you can use it anywhere you can run Rust
### Resource Control
Track memory usage, allocations, stack depth, and execution time — cancel execution if it exceeds preset limits
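One of these mechanisms can be sketched with the standard library (this is an illustration of the general idea, not Monty's implementation): count interpreter "steps" with a trace function and abort once a preset budget is exceeded.

```python
# Sketch of execution-budget enforcement (not Monty's implementation):
# count line events with sys.settrace and abort past a step budget.
import sys

class StepBudgetExceeded(Exception):
    pass

def run_with_budget(code: str, max_steps: int) -> dict:
    steps = 0

    def tracer(frame, event, arg):
        nonlocal steps
        if event == "line":
            steps += 1
            if steps > max_steps:
                raise StepBudgetExceeded(f"exceeded {max_steps} steps")
        return tracer

    namespace: dict = {}
    sys.settrace(tracer)
    try:
        exec(code, namespace)
    finally:
        sys.settrace(None)   # always remove the tracer, even on failure
    return namespace

print(run_with_budget("total = sum(range(10))", max_steps=100)["total"])  # 45

try:
    run_with_budget("while True:\n    pass", max_steps=1000)
except StepBudgetExceeded as exc:
    print(exc)  # exceeded 1000 steps
```

The same shape extends to wall-clock and memory budgets; the key design point is that the limit check lives outside the untrusted code's control.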
### Additional Features
- Collect stdout and stderr and return them to the caller
- Run async or sync sandboxed code from async or sync code on the host
- Use a small subset of the standard library: `sys`, `os`, `typing`, `asyncio`, `re`, `datetime` (soon), `dataclasses` (soon), `json` (soon)
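Even this small subset covers realistic agent logic. The snippet below is hypothetical agent-written code using only `re` and `typing` (run here under CPython for illustration):

```python
# Illustrative agent-style code confined to the supported subset: plain
# functions, `re`, and `typing` — no filesystem, network, or third-party
# libraries needed.
import re
from typing import List

def extract_order_ids(text: str) -> List[str]:
    # Order IDs look like "ORD-" followed by digits.
    return re.findall(r"ORD-\d+", text)

message = "Please refund ORD-1042 and check the status of ORD-77."
ids = extract_order_ids(message)
print(ids)  # ['ORD-1042', 'ORD-77']
```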
## What Monty Cannot Do

Monty currently does not support:

- The rest of the Python standard library
- Third-party libraries (like Pydantic) — external library support is not a goal
- Class definitions (support coming soon)
- Match statements (support coming soon)
## Use Cases

Monty is designed for agent applications that need to execute LLM-generated code safely:

### Programmatic Tool Calling
Replace sequential tool calls with Python code that calls your tools as functions
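A hedged sketch of what this replacement looks like (the tool names and the generated program are invented for illustration): instead of three model round trips, the model emits one program that composes the tools as ordinary functions.

```python
# Hypothetical programmatic tool calling: one LLM-generated program
# replaces a multi-turn tool-calling loop. Tool names are invented.
def search_flights(origin: str, dest: str) -> list[dict]:
    return [{"id": "FL1", "price": 120}, {"id": "FL2", "price": 95}]

def book_flight(flight_id: str) -> str:
    return f"booked {flight_id}"

llm_program = """
flights = search_flights("LHR", "CDG")
cheapest = min(flights, key=lambda f: f["price"])
confirmation = book_flight(cheapest["id"])
"""

tools = {"search_flights": search_flights, "book_flight": book_flight}
namespace = {**tools}
exec(llm_program, namespace)   # in practice, run inside the sandbox
print(namespace["confirmation"])  # booked FL2
```

Intermediate results like the full flight list never need to pass back through the model's context window, which is where the speed and cost savings come from.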
### Agent Code Execution
Allow agents to write and execute Python code for complex multi-step tasks
### Safe Sandboxing
Run untrusted code without containers or risk of host access
### PydanticAI Integration
Power code-mode in Pydantic AI for faster, more reliable agent workflows
## Inspiration

For motivation on why you might want to do this, see:

- Codemode from Cloudflare
- Programmatic Tool Calling from Anthropic
- Code Execution with MCP from Anthropic
- Smol Agents from Hugging Face
Monty will soon be used to implement `codemode` in Pydantic AI.

## Performance Comparison
| Technology | Language Completeness | Security | Start Latency | Setup Complexity |
|---|---|---|---|---|
| Monty | Partial | Strict | 0.06ms | Easy |
| Docker | Full | Good | 195ms | Intermediate |
| Pyodide | Full | Poor | 2800ms | Intermediate |
| WASI/Wasmer | Partial | Strict | 66ms | Intermediate |
| Sandboxing Service | Full | Strict | 1033ms | Intermediate |
## Next Steps

### Installation
Install Monty for Python, JavaScript, or Rust
### Quickstart
Get started with your first Monty program
