Overview
Volumes provide persistent, reusable storage that can be attached to sandboxes. Unlike the ephemeral sandbox filesystem, volumes persist independently of any sandbox and can be reused across sandboxes, making them ideal for storing data, code, and configurations that need to outlive individual sandbox instances.
What is a Volume?
A Daytona Volume is:
Persistent Storage: Data persists independently of the sandbox lifecycle
Reusable: Can be attached to multiple sandboxes (one at a time)
S3-Backed: Stored in S3-compatible object storage for durability
Mountable: Attached to sandboxes at specific mount paths
Subpath Support: Mount only specific portions of a volume
Volumes are backed by S3-compatible storage and mounted into sandbox containers, providing both durability and performance.
Volume Structure
Every volume contains:
{
  id: string,              // Unique identifier
  name: string,            // Human-readable name
  organizationId: string,  // Organization owner
  state: VolumeState,      // Current state
  createdAt: string,       // Creation timestamp
  updatedAt: string,       // Last update timestamp
  lastUsedAt: string,      // Last mount timestamp
  errorReason: string      // Error details if applicable
}
Volume States
Volumes transition through the following states:
State           Description
pending_create  Volume creation requested, waiting to start
creating        Volume is being created in storage
ready           Volume is ready to be mounted
pending_delete  Volume deletion requested, waiting to start
deleting        Volume is being deleted
deleted         Volume has been removed
error           Volume encountered an error
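Because creation is asynchronous (`pending_create`, then `creating`, then `ready`), code that mounts a volume immediately after creating it may need to wait for the `ready` state. A minimal polling sketch; the injected `get_state` callable and the timing defaults are illustrative assumptions, not part of the SDK:

```python
import time

def wait_for_ready(get_state, timeout=60.0, interval=1.0):
    """Poll a state-returning callable until the volume reports 'ready'.

    get_state: zero-arg callable returning the current state string,
    e.g. lambda: daytona.volumes.get(volume_id=volume.id).state
    Raises RuntimeError on the 'error' state, TimeoutError on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_state()
        if state == "ready":
            return state
        if state == "error":
            raise RuntimeError("volume entered error state")
        time.sleep(interval)
    raise TimeoutError("volume not ready within timeout")
```

With the real SDK this might be invoked as `wait_for_ready(lambda: daytona.volumes.get(volume_id=volume.id).state)`, though the exact accessor is an assumption.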
Creating Volumes
Basic Volume Creation
from daytona import Daytona, DaytonaConfig

daytona = Daytona(DaytonaConfig(api_key="YOUR_API_KEY"))

# Create a new volume
volume = daytona.volumes.create(name="project-data")
print(f"Volume created: {volume.id}")
print(f"State: {volume.state}")
import { Daytona } from '@daytonaio/sdk';

const daytona = new Daytona({ apiKey: 'YOUR_API_KEY' });

// Create a new volume
const volume = await daytona.volumes.create({
  name: 'project-data'
});
console.log(`Volume created: ${volume.id}`);
console.log(`State: ${volume.state}`);
import (
	"context"
	"fmt"

	"github.com/daytonaio/daytona/libs/sdk-go/pkg/daytona"
)

client, _ := daytona.NewClient()
ctx := context.Background()

// Create a new volume
volume, _ := client.Volumes.Create(ctx, types.CreateVolumeParams{
	Name: "project-data",
})
fmt.Printf("Volume created: %s\n", volume.Id)
Creating with Sandbox
Create a volume and attach it to a sandbox in one operation:
# Create sandbox with new volume
sandbox = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{
        "name": "workspace-data",  # Creates a new volume
        "mount_path": "/workspace"
    }]
))
Mounting Volumes
Mount to Sandbox
Attach existing volumes to sandboxes:
# Create volume
volume = daytona.volumes.create(name="shared-data")

# Create sandbox with mounted volume
sandbox = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{
        "volume_id": volume.id,
        "mount_path": "/data"
    }]
))

# Volume is now accessible at /data in the sandbox
sandbox.process.execute_command("ls -la /data")
Multiple Mount Paths
Mount different volumes at different paths:
# Create multiple volumes
code_volume = daytona.volumes.create(name="code")
data_volume = daytona.volumes.create(name="data")
config_volume = daytona.volumes.create(name="config")

# Mount all volumes
sandbox = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[
        {"volume_id": code_volume.id, "mount_path": "/workspace"},
        {"volume_id": data_volume.id, "mount_path": "/data"},
        {"volume_id": config_volume.id, "mount_path": "/config"}
    ]
))
Subpath Mounting
Mount only specific subdirectories within a volume:
# Create volume with multiple projects
volume = daytona.volumes.create(name="multi-project")

# Sandbox 1: Mount only project-a
sandbox_a = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{
        "volume_id": volume.id,
        "mount_path": "/workspace",
        "subpath": "project-a"  # Only mounts /project-a from the volume
    }]
))

# Sandbox 2: Mount only project-b
sandbox_b = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{
        "volume_id": volume.id,
        "mount_path": "/workspace",
        "subpath": "project-b"  # Only mounts /project-b from the volume
    }]
))
Subpath mounting is useful for isolating different projects or datasets within the same volume while maintaining a single storage resource.
Volume Management
Listing Volumes
# List all volumes
volumes = daytona.volumes.list()
for volume in volumes:
    print(f"Name: {volume.name}")
    print(f"ID: {volume.id}")
    print(f"State: {volume.state}")
    print(f"Last used: {volume.last_used_at}")
    print("---")
Getting Volume Details
# Get a specific volume
volume = daytona.volumes.get(volume_id="vol-123")
print(f"Name: {volume.name}")
print(f"State: {volume.state}")
print(f"Created: {volume.created_at}")
print(f"Last used: {volume.last_used_at}")
Deleting Volumes
# Delete a volume by ID
daytona.volumes.delete(volume_id="vol-123")

# Delete by name
volume = daytona.volumes.get(name="old-data")
daytona.volumes.delete(volume.id)
Deleting a volume permanently removes all data stored in it. Ensure you have backups before deletion. Volumes cannot be deleted while mounted to a running sandbox.
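Because deletion fails while a volume is still mounted, a small retry wrapper can smooth over the window between stopping a sandbox and the volume becoming deletable. This is a hedged sketch: the exception type raised by the SDK and the backoff policy are assumptions, and the delete call is injected rather than hard-coded:

```python
import time

def delete_with_retry(delete_fn, attempts=3, backoff=2.0):
    """Try delete_fn() up to `attempts` times, backing off between tries.

    delete_fn: zero-arg callable, e.g.
    lambda: daytona.volumes.delete(volume_id="vol-123")
    Returns True on success; re-raises the last error if all attempts fail.
    """
    last_err = None
    for attempt in range(attempts):
        try:
            delete_fn()
            return True
        except Exception as err:  # the SDK's error type is an assumption
            last_err = err
            if attempt < attempts - 1:
                time.sleep(backoff * (attempt + 1))  # linear backoff
    raise last_err
```

For volumes still attached to a running sandbox, stop or delete the sandbox first; retrying alone will not make a mounted volume deletable.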
Data Persistence Patterns
Workspace Persistence
Maintain project state across sandbox sessions:
# Create workspace volume
workspace = daytona.volumes.create(name="my-project")

# Day 1: Initialize the project
sandbox1 = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": workspace.id, "mount_path": "/workspace"}]
))
sandbox1.process.execute_command("git clone https://github.com/user/repo /workspace")
sandbox1.stop()

# Day 2: Continue working (data persists)
sandbox2 = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": workspace.id, "mount_path": "/workspace"}]
))
# /workspace still contains the cloned repo
sandbox2.process.execute_command("cd /workspace && git pull")
Data Pipeline Storage
Store intermediate results between pipeline stages:
# Create pipeline data volume
pipeline_data = daytona.volumes.create(name="pipeline-storage")

# Stage 1: Data collection
stage1 = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": pipeline_data.id, "mount_path": "/data"}]
))
stage1.process.code_run("""
import pandas as pd
df = pd.read_csv('source.csv')
df.to_parquet('/data/processed.parquet')
""")
stage1.delete()

# Stage 2: Analysis (reads from the same volume)
stage2 = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": pipeline_data.id, "mount_path": "/data"}]
))
stage2.process.code_run("""
import pandas as pd
df = pd.read_parquet('/data/processed.parquet')
results = df.describe()
results.to_csv('/data/summary.csv')
""")
Configuration Management
Share configuration across multiple sandboxes:
# Create configuration volume
config_vol = daytona.volumes.create(name="shared-config")

# Initialize configuration in one sandbox
setup = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": config_vol.id, "mount_path": "/config"}]
))
setup.fs.write_file("/config/app.yaml", config_content)
setup.fs.write_file("/config/secrets.env", secrets_content)
setup.delete()

# Use the configuration in multiple sandboxes
for i in range(5):
    worker = daytona.create(CreateSandboxParams(
        language="python",
        volumes=[{"volume_id": config_vol.id, "mount_path": "/config"}]
    ))
    # All workers have access to the same configuration
    worker.process.execute_command("cat /config/app.yaml")
Use Cases
Machine Learning Datasets
# Create a volume for the ML dataset
dataset_volume = daytona.volumes.create(name="ml-dataset")

# Load the data once
data_loader = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": dataset_volume.id, "mount_path": "/data"}]
))
data_loader.process.code_run("""
import pandas as pd
from sklearn.datasets import load_iris
data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df.to_parquet('/data/iris.parquet')
""")

# Multiple training sandboxes share the same data
for model_type in ['svm', 'random_forest', 'neural_net']:
    trainer = daytona.create(CreateSandboxParams(
        language="python",
        volumes=[{"volume_id": dataset_volume.id, "mount_path": "/data"}],
        gpu=1.0
    ))
    trainer.process.execute_command(f"python train_{model_type}.py --data /data/iris.parquet")
Multi-Tenant Applications
# Create per-tenant volumes
tenant_volumes = {}
for tenant_id in ['tenant-a', 'tenant-b', 'tenant-c']:
    tenant_volumes[tenant_id] = daytona.volumes.create(
        name=f"{tenant_id}-data"
    )

# Create an isolated sandbox for each tenant
def create_tenant_sandbox(tenant_id):
    return daytona.create(CreateSandboxParams(
        language="python",
        volumes=[{
            "volume_id": tenant_volumes[tenant_id].id,
            "mount_path": "/tenant-data"
        }],
        labels={"tenant": tenant_id}
    ))

# Each tenant's data is isolated
tenant_a_sandbox = create_tenant_sandbox('tenant-a')
tenant_b_sandbox = create_tenant_sandbox('tenant-b')
Code Repository Cache
# Create a cache volume for git repositories
repo_cache = daytona.volumes.create(name="git-cache")

# The first sandbox clones the repos
sandbox1 = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": repo_cache.id, "mount_path": "/repos"}]
))
sandbox1.process.execute_command("""
git clone https://github.com/large/repo1 /repos/repo1
git clone https://github.com/large/repo2 /repos/repo2
""")

# Subsequent sandboxes use the cached repos (faster)
sandbox2 = daytona.create(CreateSandboxParams(
    language="python",
    volumes=[{"volume_id": repo_cache.id, "mount_path": "/repos"}]
))
sandbox2.process.execute_command("cd /repos/repo1 && git pull")  # Fast update
Best Practices
Naming Convention: Use descriptive names that indicate the volume’s purpose
Organization: Create separate volumes for different types of data (code, data, config)
Cleanup: Regularly delete unused volumes to reduce storage costs
Subpaths: Use subpath mounting for multi-tenant or multi-project volumes
Backup: Keep backups of important data outside of volumes
Access Patterns: Mount volumes read-only when data should not be modified
Size Management: Monitor volume usage and clean up old data regularly
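The cleanup practice above can be sketched as a small helper that selects volumes idle longer than a retention window. The volume objects here are stand-ins exposing `last_used_at` as an ISO-8601 string, mirroring the structure shown earlier; the exact field format returned by the SDK is an assumption:

```python
from datetime import datetime, timedelta, timezone

def stale_volumes(volumes, max_idle_days=30, now=None):
    """Return the volumes whose last_used_at is older than max_idle_days.

    volumes: iterable of objects with .name and .last_used_at
    (an ISO-8601 timestamp string, or empty/None if never mounted).
    Volumes with no recorded last use are treated as stale.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for vol in volumes:
        if not vol.last_used_at:
            stale.append(vol)  # never used: candidate for cleanup
            continue
        last_used = datetime.fromisoformat(vol.last_used_at)
        if last_used < cutoff:
            stale.append(vol)
    return stale
```

After reviewing the result, each candidate could then be removed with something like `daytona.volumes.delete(volume_id=vol.id)`, keeping in mind that deletion is permanent.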
Volume vs Sandbox Filesystem
Feature      Volume                                     Sandbox Filesystem
Persistence  Survives sandbox deletion                  Lost when sandbox deleted
Sharing      Reusable across sandboxes (one at a time)  Isolated per sandbox
Performance  S3-backed (network)                        Container filesystem (fast)
Use Case     Long-term storage, shared data             Temporary work, cached data
Cost         Charged per GB stored                      Included with sandbox
Use volumes for data that needs to persist or be shared. Use the sandbox filesystem for temporary computation and caching.
Volume Access Speed
Volumes are backed by S3, which has different performance characteristics than local disk:
# For best performance, copy frequently accessed data to the sandbox filesystem
sandbox.process.execute_command("""
# Copy from the volume to local storage for faster access
cp -r /volume-mount/data /tmp/local-data

# Process from the local copy
python process.py --input /tmp/local-data

# Write results back to the volume
cp -r /tmp/results /volume-mount/results
""")
Concurrent Access
Volumes can only be mounted to one sandbox at a time:
# This works - sequential access
sandbox1 = daytona.create(CreateSandboxParams(
    volumes=[{"volume_id": vol.id, "mount_path": "/data"}]
))
sandbox1.process.execute_command("process /data")
sandbox1.delete()

# Now another sandbox can mount it
sandbox2 = daytona.create(CreateSandboxParams(
    volumes=[{"volume_id": vol.id, "mount_path": "/data"}]
))
Next Steps
Sandboxes: Learn about creating and managing sandboxes
Snapshots: Understand pre-built environments
File Operations: Work with files in sandboxes and volumes
Getting Started: Start building with volumes