Overview

Monty supports serialization of both parsed code and execution state, enabling:
  • Code caching: Parse once, serialize, reuse later
  • Execution suspension: Pause execution, save state, resume later
  • Process migration: Move execution between processes or machines
  • Persistent workflows: Store long-running computations in databases
Serialization uses the postcard format, a compact binary encoding designed for embedded systems.

Serializing Monty Instances

Caching Parsed Code

Parsing and compiling Python code has overhead. Serialize a Monty instance to avoid re-parsing:
import pydantic_monty

# Parse and compile once
m = pydantic_monty.Monty('x ** 2 + y', inputs=['x', 'y'])

# Serialize to bytes
data: bytes = m.dump()
print(f"Serialized size: {len(data)} bytes")

# Save to file/database/cache
with open('cached_code.monty', 'wb') as f:
    f.write(data)

# Later, restore from bytes
with open('cached_code.monty', 'rb') as f:
    data = f.read()

m2 = pydantic_monty.Monty.load(data)

# Run immediately without re-parsing
result = m2.run(inputs={'x': 5, 'y': 3})
print(result)  # 28

What Gets Serialized

When you serialize a Monty instance, it includes:
  • Compiled bytecode for all functions
  • Interned strings (variable names, constants)
  • Namespace size and structure
  • Type checking information (if enabled)
The original source code is included in serialized data for error reporting. If your code contains sensitive information, encrypt the serialized bytes.
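As a minimal sketch of that advice, the serialized bytes can be encrypted before storage using the third-party cryptography package; the key handling here is illustrative only, and the payload is a stand-in for real `dump()` output:

```python
from cryptography.fernet import Fernet

# Illustrative only: in practice, load the key from a secrets manager
# rather than generating it inline. Fernet provides authenticated
# symmetric encryption.
key = Fernet.generate_key()
fernet = Fernet(key)

def encrypt_dump(data: bytes) -> bytes:
    """Encrypt serialized Monty bytes before writing to storage."""
    return fernet.encrypt(data)

def decrypt_dump(token: bytes) -> bytes:
    """Decrypt previously stored bytes back into serialized form."""
    return fernet.decrypt(token)

# Round trip: the ciphertext differs from the plaintext,
# and decryption restores the original bytes exactly.
plaintext = b'serialized monty instance'
token = encrypt_dump(plaintext)
assert token != plaintext
assert decrypt_dump(token) == plaintext
```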

Serializing Execution State

When using iterative execution, you can serialize snapshots at any suspension point.

Basic Snapshot Serialization

import pydantic_monty

code = """
data = fetch(url)
processed = transform(data)
len(processed)
"""

m = pydantic_monty.Monty(code, inputs=['url'])

# Start execution
progress = m.start(inputs={'url': 'https://example.com'})

# Execution paused at fetch() call
if isinstance(progress, pydantic_monty.FunctionSnapshot):
    # Serialize the execution state
    snapshot_data = progress.dump()
    
    # Save to database, send over network, etc.
    save_to_database(snapshot_data)

# Later, in a different process...
snapshot_data = load_from_database()

# Restore execution state
progress = pydantic_monty.FunctionSnapshot.load(snapshot_data)

# Continue execution
result = progress.resume(return_value='fetched data')

What Gets Serialized in Snapshots

Execution snapshots include:
  • VM State: Instruction pointer, call stack, exception handlers
  • Heap: All allocated objects (strings, lists, dicts, etc.)
  • Namespaces: All variable bindings (global and local)
  • Resource Tracker: Allocation counts, memory usage, time limits
  • Compiled Code: Bytecode and interns (same as Monty instance)
Time limits are reset when deserializing execution state. The timer starts from zero after calling load().

Snapshot Types

All snapshot types support serialization:

FunctionSnapshot

Paused at external function call:
progress = m.start(inputs={'x': 42})
if isinstance(progress, pydantic_monty.FunctionSnapshot):
    data = progress.dump()
    # ... later ...
    restored = pydantic_monty.FunctionSnapshot.load(data)
    result = restored.resume(return_value=100)

OsSnapshot

Paused at OS operation:
if isinstance(progress, pydantic_monty.OsSnapshot):
    data = progress.dump()
    # ... later ...
    restored = pydantic_monty.OsSnapshot.load(data)
    result = restored.resume(return_value='/path/exists')

NameLookupSnapshot

Paused at name resolution:
if isinstance(progress, pydantic_monty.NameLookupSnapshot):
    data = progress.dump()
    # ... later ...
    restored = pydantic_monty.NameLookupSnapshot.load(data)
    result = restored.resume(value=some_function)

Use Cases

1. Distributed Execution

Execute expensive computations across multiple workers:
import pydantic_monty
import redis

redis_client = redis.Redis()

code = """
result = expensive_computation(data)
final = another_computation(result)
final
"""

def worker_1():
    m = pydantic_monty.Monty(code, inputs=['data'])
    progress = m.start(inputs={'data': [1, 2, 3]})
    
    # Save to Redis
    redis_client.set('task:123', progress.dump())

def worker_2():
    # Different process/machine
    data = redis_client.get('task:123')
    progress = pydantic_monty.FunctionSnapshot.load(data)
    
    # Continue execution
    result = progress.resume(return_value=42)

2. Long-Running Workflows

Persist execution state for workflows that take hours or days:
import pydantic_monty
import sqlite3

def save_workflow_state(workflow_id: str, progress):
    conn = sqlite3.connect('workflows.db')
    conn.execute(
        'UPDATE workflows SET state = ? WHERE id = ?',
        (progress.dump(), workflow_id)
    )
    conn.commit()

def resume_workflow(workflow_id: str):
    conn = sqlite3.connect('workflows.db')
    row = conn.execute(
        'SELECT state FROM workflows WHERE id = ?',
        (workflow_id,)
    ).fetchone()
    
    progress = pydantic_monty.FunctionSnapshot.load(row[0])
    # Continue execution
    return progress.resume(return_value=get_data())

3. Interactive Debugging

Pause execution, inspect state, then continue:
progress = m.start(inputs={'x': 5})

# Save snapshot
snapshot = progress.dump()

# Try different return values
for test_value in [10, 20, 30]:
    # Restore from same snapshot each time
    p = pydantic_monty.FunctionSnapshot.load(snapshot)
    result = p.resume(return_value=test_value)
    print(f"With {test_value}: {result}")

4. Code Template Caching

Pre-parse code templates and cache them:
import pydantic_monty
from functools import lru_cache

@lru_cache(maxsize=100)
def get_cached_monty(code_template: str) -> bytes:
    m = pydantic_monty.Monty(code_template, inputs=['data'])
    return m.dump()

def execute_template(code_template: str, data):
    cached_bytes = get_cached_monty(code_template)
    m = pydantic_monty.Monty.load(cached_bytes)
    return m.run(inputs={'data': data})

Security Considerations

Critical: Only deserialize data from trusted sources. Deserializing malicious snapshot data can:
  • Restore arbitrary execution state
  • Execute malicious code when resumed
  • Bypass resource limits
  • Access unintended memory

Safe Deserialization

import hmac
import hashlib

SECRET_KEY = b'your-secret-key'

def secure_dump(progress) -> bytes:
    data = progress.dump()
    signature = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return signature + data

def secure_load(signed_data: bytes):
    signature = signed_data[:32]
    data = signed_data[32:]
    
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError('Invalid signature - data may have been tampered with')
    
    return pydantic_monty.FunctionSnapshot.load(data)

Serialization Format

Monty uses postcard - a compact, deterministic binary format:
  • No schema evolution: Deserializing with a different Monty version may fail
  • Compact: Typically 10-50% smaller than JSON
  • Fast: Zero-copy deserialization where possible
  • Deterministic: Same data always produces same bytes
For version-stable persistence, consider wrapping serialized data in a versioned container format.
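One way to build such a container, sketched here with a hypothetical 4-byte header (not part of the Monty API), is to prefix the payload with a format version and refuse to load mismatched versions:

```python
import struct

# Hypothetical container format: a 4-byte big-endian version header
# followed by the raw postcard payload.
CONTAINER_VERSION = 1

def wrap_versioned(payload: bytes) -> bytes:
    """Prefix serialized bytes with a container version header."""
    return struct.pack('>I', CONTAINER_VERSION) + payload

def unwrap_versioned(data: bytes) -> bytes:
    """Check the version header and return the raw payload."""
    (version,) = struct.unpack('>I', data[:4])
    if version != CONTAINER_VERSION:
        raise ValueError(f'Unsupported container version: {version}')
    return data[4:]

# Round trip: the payload survives, and stale versions are rejected
# instead of failing deep inside postcard deserialization.
stored = wrap_versioned(b'postcard bytes')
assert unwrap_versioned(stored) == b'postcard bytes'
```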

Performance

Serialization Speed

  • Monty instance: ~0.1ms for typical code
  • Execution snapshot: ~0.5-5ms depending on heap size

Size Examples

  • Empty Monty instance: ~100 bytes
  • Small code (10 lines): ~500 bytes
  • Large code (1000 lines): ~50KB
  • Snapshot with small heap: ~1KB
  • Snapshot with 1000 objects: ~50KB

Best Practices

1. Cache Parsed Code: Always serialize and cache Monty instances when executing the same code multiple times.

2. Validate Signatures: Sign serialized data with HMAC before saving to untrusted storage.

3. Version Your Data: Wrap serialized bytes in a versioned container to handle Monty version upgrades.

4. Set Expiration: Set a TTL on cached snapshots to prevent unbounded storage growth.

5. Compress for Network: Use gzip/zstd compression when sending snapshots over the network.
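The compression advice above can be sketched with the standard library's gzip module; the payload here is a stand-in for real snapshot bytes:

```python
import gzip

def compress_snapshot(data: bytes) -> bytes:
    """Gzip-compress serialized snapshot bytes before sending."""
    return gzip.compress(data)

def decompress_snapshot(data: bytes) -> bytes:
    """Restore the original serialized bytes on the receiving side."""
    return gzip.decompress(data)

# Repetitive payloads (such as heaps full of similar objects)
# compress well; the round trip is lossless.
payload = b'snapshot-object-' * 1000
compressed = compress_snapshot(payload)
assert decompress_snapshot(compressed) == payload
assert len(compressed) < len(payload)
```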

Next Steps

  • Execution Modes: Learn about run() vs start()/resume() execution
  • Resource Limits: Configure memory, time, and recursion limits