Functions are the basic units of serverless execution on Modal. They let you run Python code in the cloud with automatic scaling, custom dependencies, and hardware acceleration.
## Creating functions

Create a Function by decorating a Python function with `@app.function()`:

```python
import modal

app = modal.App()

@app.function()
def hello(name: str):
    return f"Hello, {name}!"
```
## Function parameters

The `@app.function()` decorator accepts many configuration parameters:

```python
@app.function(
    image=modal.Image.debian_slim().pip_install("numpy"),
    secrets=[modal.Secret.from_name("my-secret")],
    gpu="A100",
    cpu=2.0,
    memory=4096,
    timeout=600,
)
def compute(data):
    import numpy as np

    return np.sum(data)
```
Parameters fall into three groups: container configuration, resource configuration, and scaling configuration. For example, container configuration covers the image, secrets, and volumes a Function runs with:

```python
@app.function(
    image=modal.Image.debian_slim().pip_install("requests"),
    secrets=[modal.Secret.from_name("api-key")],
    volumes={"/cache": modal.Volume.from_name("my-cache")},
)
def fetch_data():
    pass
```
## Executing functions

### Remote execution

Call `.remote()` to execute a function in the cloud:

```python
with app.run():
    result = hello.remote("World")
    print(result)  # "Hello, World!"
```

The `.remote()` call serializes the arguments, sends them to Modal's infrastructure, executes the function in a container, and returns the result.
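Because arguments and return values cross the wire, they must be serializable. Modal's actual wire format is internal; a plain `pickle` round-trip (an illustration, not Modal's API) shows the constraint:

```python
import pickle

# Arguments are serialized on the client...
payload = pickle.dumps(("World",))

# ...and deserialized inside the container before the function runs.
args = pickle.loads(payload)
print(args)  # ('World',)
```

Objects that cannot be pickled (open file handles, database connections) should be created inside the function rather than passed in.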
### Local execution

Call `.local()` to execute a function locally for testing:

```python
result = hello.local("World")
print(result)  # "Hello, World!"
```

Local execution runs the function in your current Python process without any containerization or cloud resources.
### Map and spawn

Process multiple inputs in parallel using `.map()`:

```python
names = ["Alice", "Bob", "Charlie"]

with app.run():
    results = list(hello.map(names))
    print(results)  # ["Hello, Alice!", "Hello, Bob!", "Hello, Charlie!"]
```

For fire-and-forget execution, use `.spawn()`:

```python
with app.run():
    call = hello.spawn("World")
    # Do other work...
    result = call.get()  # Wait for result when needed
```
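As the example above shows, `.map()` returns results in the same order as the inputs, like the builtin `map` run in parallel. A local analogue using only the standard library (not Modal's implementation) makes the ordering behavior concrete:

```python
from concurrent.futures import ThreadPoolExecutor

def hello_local(name: str) -> str:
    return f"Hello, {name}!"

names = ["Alice", "Bob", "Charlie"]

# pool.map yields results in input order even if individual
# calls finish out of order.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(hello_local, names))

print(results)  # ['Hello, Alice!', 'Hello, Bob!', 'Hello, Charlie!']
```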
## Async functions

Modal supports both synchronous and asynchronous functions:

```python
import asyncio

@app.function()
async def async_hello(name: str):
    await asyncio.sleep(1)
    return f"Hello, {name}!"
```

Call async functions from async code:

```python
async with app.run():
    result = await async_hello.remote.aio("World")
```
## Generator functions

Functions can yield results incrementally:

```python
@app.function()
def count_to(n: int):
    for i in range(n):
        yield i
```

Iterate over results:

```python
with app.run():
    for num in count_to.remote_gen(10):
        print(num)
```

Generator functions are useful for streaming data, processing large datasets in chunks, or providing progress updates.
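The progress-update pattern is plain Python: yield a small status record per step instead of one result at the end. A sketch (the field names are hypothetical) that works the same way inside an `@app.function()` body:

```python
def process_with_progress(items):
    total = len(items)
    for i, item in enumerate(items, start=1):
        # ... do the real per-item work here ...
        yield {"done": i, "total": total}

updates = list(process_with_progress(["a", "b", "c"]))
print(updates[-1])  # {'done': 3, 'total': 3}
```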
## Retries

Configure automatic retries for failed function calls:

```python
@app.function(retries=3)
def flaky_api_call():
    # This will retry up to 3 times on failure
    pass
```

For more control, use the `Retries` class:

```python
from modal import Retries

@app.function(
    retries=Retries(
        max_retries=5,
        initial_delay=1.0,
        backoff_coefficient=2.0,
    )
)
def resilient_function():
    pass
```

Retries are not supported for generator functions or web endpoints.
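With these settings, the nominal wait before each retry grows geometrically: delay *n* is `initial_delay * backoff_coefficient ** n`. A quick sketch of the schedule (plain Python, not the Modal API; the platform may additionally cap or jitter delays, so treat the exact values as an assumption):

```python
def retry_delays(max_retries: int, initial_delay: float, backoff_coefficient: float):
    """Nominal delay before each retry attempt under exponential backoff."""
    return [initial_delay * backoff_coefficient**n for n in range(max_retries)]

print(retry_delays(5, 1.0, 2.0))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```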
## Timeouts

Set execution timeouts to prevent functions from running indefinitely:

```python
@app.function(timeout=300)  # 5 minutes
def long_running_task():
    pass
```

You can also set a separate startup timeout:

```python
@app.function(
    timeout=600,          # Execution timeout
    startup_timeout=120,  # Startup timeout
)
def slow_startup_function():
    pass
```
## Scheduling

Run functions on a schedule:

```python
from modal import Cron, Period

@app.function(schedule=Period(days=1))
def daily_task():
    print("Running daily task")

@app.function(schedule=Cron("0 */6 * * *"))
def every_six_hours():
    print("Running every 6 hours")
```

Scheduled functions must accept no arguments.
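The `Cron` expression uses the standard five-field syntax (minute, hour, day-of-month, month, day-of-week). Expanding `"0 */6 * * *"` by hand confirms it fires at minute 0 of every sixth hour:

```python
cron = "0 */6 * * *"
minute, hour, *_ = cron.split()

step = int(hour.split("/")[1])  # "*/6" means every 6th hour
fire_hours = [h for h in range(24) if h % step == 0]
print(fire_hours)  # [0, 6, 12, 18]
```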
## Web endpoints

Expose functions as HTTP endpoints:

```python
@app.function()
@modal.web_endpoint()
def api():
    return {"message": "Hello, API!"}
```

Access the endpoint URL after deployment:

```python
with app.run():
    print(api.web_url)
```

Learn more in the Web Endpoints guide.
## Resource configuration

### CPU and memory

Request specific CPU and memory resources:

```python
@app.function(
    cpu=2.0,      # 2 CPU cores
    memory=8192,  # 8 GiB, specified in MiB
)
def cpu_intensive():
    pass
```

Specify hard limits:

```python
@app.function(
    cpu=(2.0, 4.0),       # Request 2 cores, limit to 4
    memory=(4096, 8192),  # Request 4 GiB, limit to 8 GiB
)
def bounded_resources():
    pass
```
### GPUs

Request GPU resources:

```python
@app.function(gpu="A100")
def gpu_task():
    pass
```

Specify multiple GPU options with fallback:

```python
@app.function(gpu=["H100", "A100", "T4"])
def flexible_gpu_task():
    pass
```
### Region selection

Run functions in specific regions:

```python
@app.function(region="us-east-1")
def east_coast_only():
    pass

@app.function(region=["us-east-1", "eu-west-1"])
def multi_region():
    pass
```
## Autoscaling

Control how your functions scale:

```python
@app.function(
    min_containers=2,      # Always keep 2 running
    max_containers=10,     # Never exceed 10
    buffer_containers=1,   # Keep 1 extra idle container
    scaledown_window=300,  # Wait 5 minutes before scaling down
)
def auto_scaled():
    pass
```

Allow containers to process multiple inputs simultaneously:

```python
from modal import concurrent

@app.function()
@concurrent(max_inputs=100)
def concurrent_handler(request):
    # Can process up to 100 requests per container
    pass
```
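Input concurrency pays off mainly for I/O-bound work, where one container can wait on many requests at once instead of scaling out. A local `asyncio` sketch of the idea (a simplified analogue, not Modal's scheduler):

```python
import asyncio

async def handler(request: int) -> str:
    await asyncio.sleep(0.01)  # simulated I/O wait
    return f"handled {request}"

async def main():
    # 100 inputs interleave in one process, analogous to one container
    # running with @concurrent(max_inputs=100).
    return await asyncio.gather(*(handler(i) for i in range(100)))

results = asyncio.run(main())
print(len(results))  # 100
```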
## Best practices

### Keep functions focused

Create small, single-purpose functions rather than large monolithic ones:

```python
# Good: Focused functions
@app.function()
def extract_data():
    pass

@app.function()
def transform_data():
    pass

# Avoid: Monolithic function doing everything
@app.function()
def do_everything():
    pass
```

### Use appropriate timeouts

Set realistic timeouts based on expected execution time:

```python
@app.function(timeout=30)  # Quick API calls
def api_call():
    pass

@app.function(timeout=3600)  # Long-running batch jobs
def batch_process():
    pass
```
### Leverage generator functions for streaming

Use generators for large datasets or streaming responses:

```python
@app.function()
def process_large_dataset():
    # `dataset_chunks` and `process` are placeholders for your own
    # data source and per-chunk processing logic.
    for chunk in dataset_chunks:
        result = process(chunk)
        yield result
```