# Why Monty?

Monty avoids the cost, latency, complexity, and general faff of using a full container-based sandbox for running LLM-generated code. Instead, it lets you safely run LLM-written Python code directly inside your agent, with startup times measured in single-digit microseconds, not hundreds of milliseconds.

## The Problem
Large Language Models can work faster, cheaper, and more reliably when they write Python (or JavaScript) code instead of relying on traditional tool calling. However, running untrusted LLM-generated code presents significant security challenges:

- **Containers are slow**: Docker startup takes ~195ms, far too slow for real-time agent interactions
- **Direct execution is dangerous**: running code via `exec()` or a subprocess gives it full access to the filesystem, network, and system resources
- **WASM solutions are complex**: options like Pyodide have slow cold starts (~2.8s) and complex setup requirements
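To make the second point concrete, here is a minimal CPython demonstration (the environment variable and the "LLM-generated" snippet are invented for illustration): anything passed to `exec()` runs with the host process's full authority, so nothing stops it from reading secrets or enumerating the filesystem.

```python
import os

# Untrusted "LLM-generated" code: it can read every environment variable
# and list the working directory, because exec() runs it with the host's
# full authority.
llm_code = """
import os
leaked = dict(os.environ)     # read every environment variable
listing = os.listdir('.')     # enumerate the working directory
"""

os.environ["API_KEY"] = "super-secret"  # a secret the agent process holds

namespace = {}
exec(llm_code, namespace)

print("API_KEY" in namespace["leaked"])  # True: the code saw the secret
```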
## What Monty Can Do

### Run Python Subset
Execute a reasonable subset of Python code — enough for your agent to express what it wants to do
### Secure by Default
Completely block access to the host environment: filesystem, env variables, and network access are all implemented via external function calls you control
### External Functions
Call functions on the host — only functions you explicitly give it access to
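The pattern can be sketched in plain Python (this is a hypothetical illustration, not Monty's actual API): the host registers an explicit allow-list of callables, and the sandboxed program can reach the outside world only through them.

```python
# Hypothetical sketch of the external-function pattern (not Monty's real
# API): only host-approved callables are visible to sandboxed code.
from typing import Any, Callable

class Sandbox:
    def __init__(self, external: dict[str, Callable[..., Any]]):
        # Only names registered here are visible to sandboxed code.
        self.external = dict(external)

    def run(self, code: str) -> dict[str, Any]:
        # An empty __builtins__ hides open(), __import__(), etc.; the only
        # globals are the host-approved external functions.
        namespace: dict[str, Any] = {"__builtins__": {}, **self.external}
        exec(code, namespace)
        return namespace

def get_weather(city: str) -> str:
    return f"{city}: 21C, sunny"   # stand-in for a real tool call

sandbox = Sandbox({"get_weather": get_weather})
result = sandbox.run("report = get_weather('Paris')")
print(result["report"])  # -> Paris: 21C, sunny
```

Note that stripping `__builtins__` in CPython is illustrative only and is not a real security boundary (well-known escapes exist); Monty sidesteps the problem entirely by being a separate interpreter with no host access in the first place.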
### Type Checking

Full support for modern Python type hints, with the `ty` type checker included in a single binary
### Snapshotting
Snapshot to bytes at external function calls, store interpreter state in a file or database, and resume later
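The shape of that workflow can be sketched with a toy state object (Monty's real snapshot format and API are different; the class and field names below are invented): execution pauses at an external call, the state is serialized to bytes that can live anywhere, then deserialized and resumed with the call's result.

```python
# Toy sketch of snapshot/resume (not Monty's actual format): a paused run
# is serialized to bytes, stored, then resumed with the external call's
# result filled in.
import pickle
from dataclasses import dataclass, field

@dataclass
class PausedRun:
    variables: dict = field(default_factory=dict)
    pending_call: str = ""          # external function awaiting an answer

    def snapshot(self) -> bytes:
        return pickle.dumps(self)   # bytes: write to a file or database

    @staticmethod
    def resume(blob: bytes, call_result) -> dict:
        state: PausedRun = pickle.loads(blob)
        state.variables[state.pending_call] = call_result
        return state.variables

run = PausedRun(variables={"city": "Paris"}, pending_call="weather")
blob = run.snapshot()               # persist, ship, or store these bytes

final = PausedRun.resume(blob, call_result="21C, sunny")
print(final)  # {'city': 'Paris', 'weather': '21C, sunny'}
```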
### Fast Startup

Start up extremely fast (<1μs from code to execution result), with runtime performance similar to CPython (generally between 5x faster and 5x slower)
### Multi-Language Support
Call from Rust, Python, or JavaScript — Monty has no dependencies on CPython, so you can use it anywhere you can run Rust
### Resource Control
Track memory usage, allocations, stack depth, and execution time — cancel execution if it exceeds preset limits
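One of these mechanisms can be sketched with the standard library (this is an illustration of the general idea, not Monty's implementation): count interpreter "steps" with a trace function and abort once a preset budget is exceeded.

```python
# Sketch of execution-budget enforcement (not Monty's implementation):
# count line events with sys.settrace and abort past a step budget.
import sys

class StepBudgetExceeded(Exception):
    pass

def run_with_budget(code: str, max_steps: int) -> dict:
    steps = 0

    def tracer(frame, event, arg):
        nonlocal steps
        if event == "line":
            steps += 1
            if steps > max_steps:
                raise StepBudgetExceeded(f"exceeded {max_steps} steps")
        return tracer

    namespace: dict = {}
    sys.settrace(tracer)
    try:
        exec(code, namespace)
    finally:
        sys.settrace(None)   # always remove the tracer, even on failure
    return namespace

print(run_with_budget("total = sum(range(10))", max_steps=100)["total"])  # 45

try:
    run_with_budget("while True:\n    pass", max_steps=1000)
except StepBudgetExceeded as exc:
    print(exc)  # exceeded 1000 steps
```

The same shape extends to wall-clock and memory budgets; the key design point is that the limit check lives outside the untrusted code's control.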
### Additional Features
- Collect stdout and stderr and return them to the caller
- Run async or sync sandboxed code from async or sync code on the host
- Use a small subset of the standard library: `sys`, `os`, `typing`, `asyncio`, `re`, `datetime` (soon), `dataclasses` (soon), `json` (soon)
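Even this small subset covers realistic agent logic. The snippet below is hypothetical agent-written code using only `re` and `typing` (run here under CPython for illustration):

```python
# Illustrative agent-style code confined to the supported subset: plain
# functions, `re`, and `typing` — no filesystem, network, or third-party
# libraries needed.
import re
from typing import List

def extract_order_ids(text: str) -> List[str]:
    # Order IDs look like "ORD-" followed by digits.
    return re.findall(r"ORD-\d+", text)

message = "Please refund ORD-1042 and check the status of ORD-77."
ids = extract_order_ids(message)
print(ids)  # ['ORD-1042', 'ORD-77']
```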
## What Monty Cannot Do

Monty currently does not support:

- The rest of the Python standard library
- Third-party libraries (like Pydantic) — external library support is not a goal
- Class definitions (support coming soon)
- Match statements (support coming soon)
## Use Cases

Monty is designed for agent applications that need to execute LLM-generated code safely:

### Programmatic Tool Calling
Replace sequential tool calls with Python code that calls your tools as functions
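A hedged sketch of what this replacement looks like (the tool names and the generated program are invented for illustration): instead of three model round trips, the model emits one program that composes the tools as ordinary functions.

```python
# Hypothetical programmatic tool calling: one LLM-generated program
# replaces a multi-turn tool-calling loop. Tool names are invented.
def search_flights(origin: str, dest: str) -> list[dict]:
    return [{"id": "FL1", "price": 120}, {"id": "FL2", "price": 95}]

def book_flight(flight_id: str) -> str:
    return f"booked {flight_id}"

llm_program = """
flights = search_flights("LHR", "CDG")
cheapest = min(flights, key=lambda f: f["price"])
confirmation = book_flight(cheapest["id"])
"""

tools = {"search_flights": search_flights, "book_flight": book_flight}
namespace = {**tools}
exec(llm_program, namespace)   # in practice, run inside the sandbox
print(namespace["confirmation"])  # booked FL2
```

Intermediate results like the full flight list never need to pass back through the model's context window, which is where the speed and cost savings come from.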
### Agent Code Execution
Allow agents to write and execute Python code for complex multi-step tasks
### Safe Sandboxing
Run untrusted code without containers or risk of host access
### PydanticAI Integration
Power code-mode in Pydantic AI for faster, more reliable agent workflows
## Inspiration

For motivation on why you might want to do this, see:

- Codemode from Cloudflare
- Programmatic Tool Calling from Anthropic
- Code Execution with MCP from Anthropic
- Smol Agents from Hugging Face
Monty will soon be used to implement `codemode` in Pydantic AI.

## Performance Comparison
| Technology | Language Completeness | Security | Start Latency | Setup Complexity |
|---|---|---|---|---|
| Monty | Partial | Strict | 0.06ms | Easy |
| Docker | Full | Good | 195ms | Intermediate |
| Pyodide | Full | Poor | 2800ms | Intermediate |
| WASI/Wasmer | Partial | Strict | 66ms | Intermediate |
| Sandboxing Service | Full | Strict | 1033ms | Intermediate |
## Next Steps

### Installation
Install Monty for Python, JavaScript, or Rust
### Quickstart
Get started with your first Monty program
