
Overview

Modal provides flexible trial credits: $5 per month upon sign up, which increases to $30 per month when you add a payment method. Modal is a serverless platform where you pay by compute time for any supported model.

Free Tier

$5/month (no payment method)

Enhanced Free Tier

$30/month (with payment method)

Pricing Model

Modal charges by compute time rather than tokens. You pay for the actual CPU/GPU time your code uses, making it cost-effective for batch processing and custom workloads.
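To make compute-time billing concrete, here is a back-of-the-envelope cost function. The per-second rates below are hypothetical placeholders, not Modal's actual prices; check Modal's pricing page for real numbers.

```python
# Hypothetical per-second rates -- see Modal's pricing page for actual values.
GPU_RATE_PER_SEC = 0.000306   # assumed rate for an A10G-class GPU
CPU_RATE_PER_SEC = 0.0000131  # assumed rate per physical CPU core

def job_cost(gpu_seconds: float, cpu_core_seconds: float) -> float:
    """Cost of a job billed purely by compute time, not by tokens."""
    return gpu_seconds * GPU_RATE_PER_SEC + cpu_core_seconds * CPU_RATE_PER_SEC

# A 10-minute GPU batch job that also uses 2 CPU-core-minutes:
cost = job_cost(gpu_seconds=600, cpu_core_seconds=120)
print(f"${cost:.4f}")
```

Because billing stops the moment your container scales to zero, short bursty jobs can cost far less than an always-on instance.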

Credits Structure

Tier      | Monthly Credits | Requirements
Basic     | $5/month        | Just sign up
Enhanced  | $30/month       | Add payment method

Available Models

Modal supports any model you can deploy. Unlike traditional API providers, Modal lets you:
  • Deploy any open-source model from Hugging Face
  • Run custom inference code with your own optimizations
  • Use any framework: PyTorch, JAX, TensorFlow, etc.
  • Scale automatically based on demand
Modal is ideal for developers who want full control over their model deployment and inference pipeline.

Getting Started

1. Sign Up

Visit modal.com and create a free account to receive $5/month in credits.

2. Add Payment Method (Optional)

Add a payment method to increase your free monthly credits to $30.

3. Install Modal CLI

pip install modal

4. Authenticate

modal token new

5. Deploy Your First Function

import modal

# Recent Modal releases use modal.App; modal.Stub is the deprecated older name.
app = modal.App("example-app")

@app.function(
    gpu="A10G",
    image=modal.Image.debian_slim().pip_install(
        "transformers",
        "torch",
        "accelerate",
    ),
)
def generate_text(prompt: str):
    from transformers import pipeline

    # Note: this model is gated on Hugging Face; you must accept its license
    # and provide an access token for the download to succeed.
    generator = pipeline(
        "text-generation",
        model="meta-llama/Llama-3.1-8B-Instruct",
        device="cuda",
    )

    # max_new_tokens bounds only the generated continuation,
    # unlike max_length, which also counts the prompt.
    return generator(prompt, max_new_tokens=100)[0]["generated_text"]

@app.local_entrypoint()
def main():
    result = generate_text.remote("What is the capital of France?")
    print(result)

6. Run Your Function

modal run app.py

This builds the container image (on first run), executes main() in Modal's cloud, and streams logs back to your terminal.

Key Features

Serverless Infrastructure

  • Automatic scaling: Scale to zero when idle
  • GPU access: Use A10G, A100, or other GPUs
  • Fast cold starts: Optimized container loading

Flexible Deployment

  • Any Model: Deploy any model from Hugging Face or custom weights
  • Custom Code: Full control over the inference pipeline
  • Multiple Frameworks: PyTorch, JAX, TensorFlow, ONNX, etc.
  • Auto Scaling: Scale from zero to thousands of GPUs

Use Cases

  • Custom Models: Deploy proprietary or fine-tuned models
  • Batch Processing: Process large datasets efficiently
  • Research: Experiment with different model architectures
  • API Services: Build production-grade inference APIs
  • Data Processing: Run GPU-accelerated data pipelines

Cost Optimization

Since Modal charges by compute time:
  • Use smaller GPUs for testing
  • Implement caching to avoid redundant computation
  • Batch requests when possible
  • Scale to zero when idle (automatic)
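As a sketch of the caching point above: an in-process cache (here Python's functools.lru_cache; sharing a cache across containers would need external storage, which this sketch doesn't cover) avoids paying GPU time twice for identical inputs. The expensive_generate function is a hypothetical stand-in for a real remote inference call.

```python
import functools

calls = []  # records every time the "expensive" compute actually runs

def expensive_generate(prompt: str) -> str:
    # Stand-in for a real GPU inference call (e.g. generate_text.remote(prompt)).
    calls.append(prompt)
    return f"response to: {prompt}"

@functools.lru_cache(maxsize=1024)
def cached_generate(prompt: str) -> str:
    # Identical prompts hit the cache and never reach the compute path again.
    return expensive_generate(prompt)

cached_generate("hello")
cached_generate("hello")  # cache hit: no second compute charge
print(len(calls))         # -> 1
```

Since billing is by compute time, every cache hit is billed time you did not spend.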

Resources

  • Modal Platform: access the platform
  • Documentation: view documentation
  • Examples: browse example apps
  • Pricing: view pricing details

Free credits are provided monthly. They reset each month and do not roll over.
