Cls adds method pooling and lifecycle hook behavior to modal.Function.
Generally, you will not construct a Cls directly. Instead, use the @app.cls() decorator on the App object.
Decorator
@app.cls
@app.cls(
image: Optional[Image] = None,
env: Optional[dict[str, Optional[str]]] = None,
secrets: Optional[Collection[Secret]] = None,
gpu: GPU_T = None,
network_file_systems: dict[Union[str, PurePosixPath], NetworkFileSystem] = {},
volumes: dict[Union[str, PurePosixPath], Union[Volume, CloudBucketMount]] = {},
cpu: Optional[Union[float, tuple[float, float]]] = None,
memory: Optional[Union[int, tuple[int, int]]] = None,
ephemeral_disk: Optional[int] = None,
min_containers: Optional[int] = None,
max_containers: Optional[int] = None,
buffer_containers: Optional[int] = None,
scaledown_window: Optional[int] = None,
proxy: Optional[Proxy] = None,
retries: Optional[Union[int, Retries]] = None,
timeout: int = 300,
startup_timeout: Optional[int] = None,
cloud: Optional[str] = None,
region: Optional[Union[str, Sequence[str]]] = None,
nonpreemptible: bool = False,
enable_memory_snapshot: bool = False,
block_network: bool = False,
restrict_modal_access: bool = False,
single_use_containers: bool = False,
include_source: Optional[bool] = None,
)
Decorator to register a new Modal Cls with this App.
Accepts the same parameters as @app.function() except is_generator, schedule, and name.
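For instance, resource and scaling options can be passed directly to the decorator. A minimal sketch (the image contents, GPU type, and numeric values here are illustrative assumptions, not recommendations):

```python
import modal

app = modal.App("example-app")

# Assumption: a custom image with one extra package, for illustration only
image = modal.Image.debian_slim().pip_install("torch")

@app.cls(
    image=image,
    gpu="A100",
    timeout=600,            # seconds allowed per input
    scaledown_window=120,   # seconds before idle containers scale down
)
class Inference:
    @modal.method()
    def run(self, prompt: str) -> str:
        return prompt.upper()
```
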
Usage
Basic class
@app.cls()
class MyModel:
    @modal.method()
    def predict(self, x):
        return x * 2

# Call the method
result = MyModel().predict.remote(21)  # Returns 42
Class with parameters
Use modal.parameter() to specify class parameters:
@app.cls()
class MyModel:
    model_name: str = modal.parameter()

    @modal.enter()
    def load_model(self):
        # Load a model using self.model_name and store it on self.model
        self.model = ...

    @modal.method()
    def predict(self, x):
        return self.model.predict(x)

# Create an instance with parameters
model = MyModel(model_name="gpt-4")
result = model.predict.remote(data)
Lifecycle hooks
@app.cls()
class MyService:
    @modal.enter()
    def startup(self):
        # Called when a container starts
        print("Service starting up")

    @modal.exit()
    def shutdown(self):
        # Called when a container shuts down
        print("Service shutting down")

    @modal.method()
    def process(self, data):
        return data
Methods
Cls.from_name
modal.Cls.from_name(
app_name: str,
name: str,
*,
environment_name: Optional[str] = None,
client: Optional[Client] = None,
) -> Cls
Reference a Cls from a deployed App by its name.
This is a lazy method that defers hydrating the local object until it is actually used.
app_name: Name of the deployed App.
name: Name of the Cls within the App.
environment_name: Environment to look up the Cls in.
client: Optional Modal client to use.
Returns: The referenced Cls instance.
Example:
Model = modal.Cls.from_name("my-app", "Model")
model = Model()
result = model.predict.remote(data)
cls.with_options
cls.with_options(
*,
cpu: Optional[Union[float, tuple[float, float]]] = None,
memory: Optional[Union[int, tuple[int, int]]] = None,
gpu: GPU_T = None,
env: Optional[dict[str, Optional[str]]] = None,
secrets: Optional[Collection[Secret]] = None,
volumes: dict[Union[str, PurePosixPath], Union[Volume, CloudBucketMount]] = {},
retries: Optional[Union[int, Retries]] = None,
max_containers: Optional[int] = None,
buffer_containers: Optional[int] = None,
scaledown_window: Optional[int] = None,
timeout: Optional[int] = None,
region: Optional[Union[str, Sequence[str]]] = None,
cloud: Optional[str] = None,
) -> Cls
Override the static Cls configuration at runtime.
This returns a new Cls instance that will autoscale independently of the original. Options cannot be “unset” with this method.
Example:
Model = modal.Cls.from_name("my-app", "Model")
ModelGPU = Model.with_options(gpu="A100")
model = ModelGPU()
result = model.predict.remote(data) # Runs with A100 GPU
Options can be stacked:
Model.with_options(gpu="A100").with_options(scaledown_window=300)
cls.with_concurrency
cls.with_concurrency(
*,
max_inputs: int,
target_inputs: Optional[int] = None,
) -> Cls
Create an instance of the Cls with input concurrency enabled or overridden.
max_inputs: Maximum number of concurrent inputs per container.
target_inputs: Target number of concurrent inputs for autoscaling purposes.
Example:
Model = modal.Cls.from_name("my-app", "Model")
ModelConcurrent = Model.with_options(gpu="A100").with_concurrency(max_inputs=100)
model = ModelConcurrent()
result = model.predict.remote(42)
cls.with_batching
cls.with_batching(
*,
max_batch_size: int,
wait_ms: int,
) -> Cls
Create an instance of the Cls with dynamic batching enabled or overridden.
max_batch_size: Maximum number of inputs per batch.
wait_ms: Maximum time to wait for a batch to fill, in milliseconds.
Example:
Model = modal.Cls.from_name("my-app", "Model")
ModelBatched = Model.with_options(gpu="A100").with_batching(
    max_batch_size=100,
    wait_ms=1000,
)
model = ModelBatched()
result = model.predict.remote(42)
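Note that a method subject to dynamic batching receives its inputs together as a list and must return one output per input, in order. A pure-Python sketch of that contract (the function name and doubling logic are illustrative):

```python
def predict_batch(xs: list[int]) -> list[int]:
    # With dynamic batching, up to max_batch_size inputs collected within
    # wait_ms arrive together as one list; return exactly one output per
    # input, in the same order.
    return [x * 2 for x in xs]
```

Callers still invoke the method with a single input; the batching layer assembles and splits the lists.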
Instance methods
Once you create an instance of a Cls, you can call its methods:
instance.update_autoscaler
instance.update_autoscaler(
*,
min_containers: Optional[int] = None,
max_containers: Optional[int] = None,
scaledown_window: Optional[int] = None,
buffer_containers: Optional[int] = None,
) -> None
Override the current autoscaler behavior for this Cls instance.
min_containers: Minimum number of containers to keep running.
max_containers: Maximum number of containers allowed.
scaledown_window: Time in seconds before scaling down idle containers.
buffer_containers: Number of additional idle containers to maintain under active load.
Example:
Model = modal.Cls.from_name("my-app", "Model")
model = Model()
model.update_autoscaler(min_containers=2, buffer_containers=1)
Helper functions
modal.parameter
modal.parameter(
*,
default: Any = ...,
init: bool = True,
) -> Any
Used to specify options for Cls parameters, similar to dataclasses.field for dataclasses.
default: Default value for the parameter.
init: If False, the field is not treated as a parameter and is excluded from the constructor.
Example:
@app.cls()
class MyModel:
    model_name: str = modal.parameter()
    max_tokens: int = modal.parameter(default=100)
    internal_state: Any = modal.parameter(init=False)  # Not a constructor parameter