Sandboxes in OpenSandbox follow a well-defined lifecycle with distinct states and transitions. Understanding this lifecycle is essential for building robust applications.

Lifecycle states

A sandbox progresses through the following states:
1. Pending: Initial state when a sandbox is created. The runtime is pulling the image, injecting execd, and starting the container.
2. Running: Sandbox is fully operational and ready to accept commands, execute code, and perform file operations.
3. Pausing: Transitional state when pause() is called. The sandbox is stopping execution.
4. Paused: Sandbox execution is suspended. No commands or code can execute, but the state is preserved.
5. Stopping: Transitional state when kill() or delete() is called. The sandbox is being terminated.
6. Terminated: Final state. The sandbox is stopped and resources are released. This state is permanent.
7. Failed: Error state. The sandbox encountered an unrecoverable error during provisioning or execution.
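
The states above can be modeled as a small state machine, which is handy for validating state changes in client-side code or tests. The following is a minimal sketch, not part of the SDK: the `SandboxState` enum, the `ALLOWED_TRANSITIONS` table, and both helper functions are hypothetical, and the transition edges are inferred from the state descriptions (including the assumption that Failed is terminal). The real runtime may permit other transitions.

```python
from enum import Enum


class SandboxState(str, Enum):
    """Lifecycle states, named as in the list above."""
    PENDING = "Pending"
    RUNNING = "Running"
    PAUSING = "Pausing"
    PAUSED = "Paused"
    STOPPING = "Stopping"
    TERMINATED = "Terminated"
    FAILED = "Failed"


# Assumed transition table, inferred from the state descriptions.
# The actual runtime may allow additional edges.
ALLOWED_TRANSITIONS = {
    SandboxState.PENDING: {SandboxState.RUNNING, SandboxState.STOPPING, SandboxState.FAILED},
    SandboxState.RUNNING: {SandboxState.PAUSING, SandboxState.STOPPING, SandboxState.FAILED},
    SandboxState.PAUSING: {SandboxState.PAUSED, SandboxState.FAILED},
    SandboxState.PAUSED: {SandboxState.RUNNING, SandboxState.STOPPING},
    SandboxState.STOPPING: {SandboxState.TERMINATED},
    SandboxState.TERMINATED: set(),
    SandboxState.FAILED: set(),
}


def can_transition(current: SandboxState, target: SandboxState) -> bool:
    """Return True if the assumed table allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def is_terminal(state: SandboxState) -> bool:
    """A state is terminal when it has no outgoing transitions in the table."""
    return not ALLOWED_TRANSITIONS[state]
```

Because the enum subclasses `str`, a state string such as "Running" can be converted with `SandboxState("Running")` before checking it against the table.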

State transitions

Creating sandboxes

Sandboxes are created with the Sandbox.create() method:
from opensandbox import Sandbox
from datetime import timedelta

sandbox = await Sandbox.create(
    "python:3.11",
    timeout=timedelta(minutes=30),
    env={"PYTHON_VERSION": "3.11"},
    entrypoint=["/bin/bash"],
    resource_limits={"cpu": "2", "memory": "4Gi"}
)

Required parameters

image (string, required): Docker image to use for the sandbox. Can be from Docker Hub or a private registry.
timeout (duration, required): Sandbox lifetime (60 seconds to 24 hours). After this time, the sandbox is automatically terminated.
entrypoint (string[], required): Command to run as the main process. This keeps the sandbox running.

Optional parameters

env (object): Environment variables to set in the sandbox.
resourceLimits (object): CPU, memory, and GPU limits (e.g., {"cpu": "500m", "memory": "512Mi"}).
metadata (object): Custom metadata for tracking and filtering sandboxes.
networkPolicy (object): Egress network policy rules (allow/deny domains).

Managing sandbox lifetime

Automatic expiration

All sandboxes have a TTL (time-to-live) specified at creation. When the timeout expires, the sandbox is automatically terminated.
The minimum timeout is 60 seconds, and the maximum is 86400 seconds (24 hours). Choose a timeout based on your workload duration.
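
Checking the timeout on the client before calling the API can surface out-of-range values early. The sketch below encodes the documented bounds (60 seconds to 24 hours); `validate_timeout`, `MIN_TIMEOUT`, and `MAX_TIMEOUT` are hypothetical names for illustration, not SDK functions.

```python
from datetime import timedelta

# Bounds taken from the documented limits: 60 seconds to 86400 seconds (24 hours).
MIN_TIMEOUT = timedelta(seconds=60)
MAX_TIMEOUT = timedelta(hours=24)


def validate_timeout(timeout: timedelta) -> timedelta:
    """Return the timeout unchanged if it is within bounds, else raise ValueError."""
    if not MIN_TIMEOUT <= timeout <= MAX_TIMEOUT:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT} and {MAX_TIMEOUT}, got {timeout}"
        )
    return timeout
```

The validated value could then be passed as the `timeout` argument of Sandbox.create().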

Renewing expiration

You can extend the sandbox lifetime before it expires:
# Extend by 10 more minutes
await sandbox.renew_expiration(timedelta(minutes=10))
Call renew_expiration() before the current timeout expires to keep long-running workloads alive.
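
One way to decide when to renew is to compare the remaining lifetime against a safety margin. This is a minimal sketch under the assumption that the expiry is available as a timezone-aware `datetime`; `should_renew` and its `margin` parameter are hypothetical and not part of the SDK.

```python
from datetime import datetime, timedelta, timezone


def should_renew(
    expires_at: datetime,
    now: datetime,
    margin: timedelta = timedelta(minutes=2),
) -> bool:
    """Return True when the remaining lifetime is at or below the safety margin."""
    return expires_at - now <= margin


# Example with fixed timestamps so the result does not depend on the clock.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(should_renew(now + timedelta(minutes=1), now))   # True: only 1 minute left
print(should_renew(now + timedelta(minutes=10), now))  # False: 10 minutes left
```

In a long-running job, a periodic task could call this with the `expires_at` value from get_info() and, when it returns True, call renew_expiration().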

Manual termination

To immediately terminate a sandbox:
await sandbox.kill()
This transitions the sandbox to the Stopping state and then to Terminated.

Pausing and resuming

You can temporarily pause a sandbox to save resources:
# Pause execution
await sandbox.pause()

# Resume execution
await sandbox.resume()
Note: Pause/resume is currently supported only on the Docker runtime with host networking mode. Kubernetes runtime support is planned.

Monitoring sandbox state

Check the current state of a sandbox:
info = await sandbox.get_info()
print(f"State: {info.state}")
print(f"Created: {info.created_at}")
print(f"Expires: {info.expires_at}")
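
To wait for a specific state (for example, Paused after calling pause()), the state check can be wrapped in a polling loop. The helper below is a sketch rather than an SDK function: `wait_for_state` is a hypothetical name, and it takes any async callable that returns the current state as a string, so it does not depend on the SDK itself. It stops early if the sandbox reaches Failed or Terminated.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_state(
    get_state: Callable[[], Awaitable[str]],
    target: str,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> str:
    """Poll get_state() until it returns target, or raise on failure or timeout."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        state = await get_state()
        if state == target:
            return state
        # Failed and Terminated are final, so further polling cannot succeed.
        if state in ("Failed", "Terminated"):
            raise RuntimeError(f"sandbox reached {state} while waiting for {target}")
        if loop.time() >= deadline:
            raise TimeoutError(f"sandbox did not reach {target} within {timeout}s")
        await asyncio.sleep(interval)
```

With the SDK, `get_state` could be a small async function that awaits sandbox.get_info() and returns `info.state`, assuming that value compares equal to the state name as in the examples on this page.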

Polling for readiness

When you create a sandbox, it starts in the Pending state. The SDK automatically polls until it reaches Running:
# The create() method waits until the sandbox is Running
sandbox = await Sandbox.create("ubuntu:22.04")

# At this point, the sandbox is ready to use
await sandbox.commands.run("echo 'Hello'")

Custom health checks

For sandboxes running services, you can implement custom health checks:
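import httpx
from datetime import timedelta

async def check_web_service():
    # Resolve the sandbox endpoint for port 8000 and probe its /health route
    endpoint = await sandbox.get_endpoint(8000)
    # httpx.get() is synchronous; use AsyncClient for awaitable requests
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{endpoint}/health")
    return response.status_code == 200

sandbox = await Sandbox.create(
    "my-web-service:latest",
    health_check=check_web_service,
    ready_timeout=timedelta(seconds=30)
)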

Error handling

Sandboxes can enter the Failed state due to:
  • Image pull failures (invalid image, authentication issues)
  • Resource limit exceeded during startup
  • Entrypoint command errors
  • Network policy violations
Handle errors with try/except:
from opensandbox.exceptions import SandboxException

try:
    sandbox = await Sandbox.create("invalid-image:latest")
except SandboxException as e:
    print(f"Error: {e.error.code} - {e.error.message}")

Best practices

Set timeouts based on expected workload duration:
  • Short tasks (< 5 min): 5-10 minutes
  • Interactive sessions: 30-60 minutes
  • Long-running jobs: 2-24 hours
Use renew_expiration() for unpredictable durations.
Use context managers (Python) or try-finally blocks to ensure sandboxes are terminated:
async with await Sandbox.create("ubuntu") as sandbox:
    # Use sandbox
    pass
# Sandbox is automatically killed
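
The try-finally variant mentioned above can be packaged as a reusable async context manager. This is a sketch under assumptions: `managed_sandbox` is a hypothetical helper, and it only relies on the created object exposing an async kill() method, as shown earlier on this page.

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Awaitable, Callable


@asynccontextmanager
async def managed_sandbox(
    create: Callable[[], Awaitable[Any]],
) -> AsyncIterator[Any]:
    """Create a sandbox via create() and always kill it on exit."""
    sandbox = await create()
    try:
        yield sandbox
    finally:
        # Runs on normal exit and when the body raises, so the sandbox is not leaked.
        await sandbox.kill()
```

A call site might look like `async with managed_sandbox(lambda: Sandbox.create("ubuntu")) as sandbox:`, assuming Sandbox.create() returns an awaitable as in the examples on this page.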
For long-running workloads, periodically check sandbox state:
info = await sandbox.get_info()
if info.state == "Failed":
    # Handle failure
    pass
Image pulls can fail due to network issues. Implement retry logic:
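import asyncio
from opensandbox import Sandbox
from opensandbox.exceptions import SandboxException

for attempt in range(3):
    try:
        sandbox = await Sandbox.create("python:3.11")
        break
    except SandboxException:
        if attempt == 2:
            raise  # Give up after the third failed attempt
        await asyncio.sleep(5)  # Wait before retrying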
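
A fixed delay works for simple cases; exponential backoff is a common alternative when failures may be caused by a congested registry or network. The helper below is a generic sketch, not an SDK feature: `retry_async` and its parameters are hypothetical names, and it accepts any zero-argument async callable so it can wrap Sandbox.create() or any other operation.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_async(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Run operation(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # The delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `retry_on=(SandboxException,)` would limit retries to SDK errors, so unrelated exceptions fail immediately.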
