What are Cloud Models?
Cloud models are a new type of model in Ollama that:

- Run remotely on Ollama’s cloud infrastructure
- Require no GPU on your local machine
- Use the same API as local models
- Work with existing tools and workflows
- Enable access to larger models that won’t fit on personal computers
Getting Started
Sign In to Ollama
Cloud models require an account on ollama.com. Sign in or create an account to get started.

Browse Cloud Models
Explore available cloud models at ollama.com/search?c=cloud.

Running Cloud Models
Run a cloud model directly from the terminal, or call it from Python, JavaScript, or cURL. Cloud models work exactly like local models.
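For instance, a cloud model can be called through a locally running Ollama exactly like a local one. This standard-library sketch assumes the default endpoint on localhost:11434 and uses the gpt-oss:120b-cloud model name; only the request is built here, since sending it requires a running, signed-in Ollama:

```python
import json
from urllib import request

OLLAMA_LOCAL = "http://localhost:11434"  # default local Ollama endpoint

def chat_request(model: str, prompt: str) -> request.Request:
    """Build a non-streaming request against the standard /api/chat endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    return request.Request(
        OLLAMA_LOCAL + "/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )

# A cloud model is addressed by its tag, exactly like a local model:
req = chat_request("gpt-oss:120b-cloud", "Why is the sky blue?")
# resp = request.urlopen(req)                  # needs a running, signed-in Ollama
# print(json.load(resp)["message"]["content"])
```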
Direct Cloud API Access
Cloud models can also be accessed directly via ollama.com’s API without a local Ollama installation. In this mode, ollama.com acts as a remote Ollama host.

Authentication
Create API Key
Generate an API key at ollama.com/settings/keys.
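Requests to the cloud API then authenticate with this key. As a minimal sketch (assuming the common Bearer scheme in the Authorization header):

```python
from urllib import request

API_KEY = "<your-api-key>"  # generated at ollama.com/settings/keys

def authed_request(path: str) -> request.Request:
    """Build a GET request against ollama.com, authenticated with the API key.
    The Bearer scheme here is an assumption; check your key settings page."""
    return request.Request(
        "https://ollama.com" + path,
        headers={"Authorization": f"Bearer {API_KEY}"},
    )

req = authed_request("/api/tags")  # standard Ollama endpoint for listing models
# request.urlopen(req) would perform the live call with a valid key
```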
List Available Models
Retrieve the models available via the cloud API.

Generate Responses
Install the client library for Python or JavaScript, or call the HTTP API directly with cURL, then connect to the cloud API.
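A dependency-free sketch of a chat call against the cloud host, using only the standard library (the official client libraries wrap the same endpoint; the Bearer header scheme and the OLLAMA_API_KEY variable name are assumptions):

```python
import json
import os
from urllib import request

CLOUD_HOST = "https://ollama.com"  # ollama.com acting as a remote Ollama host

def cloud_chat_request(model: str, prompt: str, api_key: str) -> request.Request:
    """Build a non-streaming chat request against the standard /api/chat endpoint."""
    body = json.dumps({
        "model": model,  # no -cloud suffix when talking to ollama.com directly
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    return request.Request(
        CLOUD_HOST + "/api/chat",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + os.environ.get("OLLAMA_API_KEY", api_key),
        },
    )

req = cloud_chat_request("gpt-oss:120b", "Hello!", "<your-api-key>")
# resp = request.urlopen(req)                  # live call; needs a valid key
# print(json.load(resp)["message"]["content"])
```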
When using the direct cloud API, use model names without the -cloud suffix (e.g., gpt-oss:120b instead of gpt-oss:120b-cloud).

Privacy and Data Handling
Ollama is designed with privacy in mind.

Local Models
- Run entirely on your machine
- No data sent to Ollama servers
- Complete privacy and control
Cloud Models
- Prompts and responses are processed to provide the service
- No storage or logging of prompt/response content
- No training on your data
- Basic account info and usage metadata collected (not including content)
- Data is never sold
- You can delete your account anytime
Disabling Cloud Features
If you prefer to run Ollama in local-only mode, you can disable cloud features.

Using Configuration File
Edit ~/.ollama/server.json.
Using Environment Variable
Alternatively, set the corresponding environment variable. With cloud features disabled:
- Cloud models will not be accessible
- All inference happens locally
- No connections to ollama.com services
Benefits of Cloud Models
- No GPU Required: run large models without expensive hardware
- Access Larger Models: use 120B+ parameter models that won’t fit locally
- Same API: works with existing Ollama tools and workflows
- Flexible Deployment: mix local and cloud models based on your needs
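The mix-and-match pattern can be sketched as a simple router that keeps privacy-sensitive requests on a small local model and sends heavy requests to a cloud model. The model names and the routing flags here are illustrative choices, not part of Ollama itself:

```python
# Sketch of a local/cloud router. Both models are served through the same
# local Ollama endpoint and API; only the model tag differs.
LOCAL_HOST = "http://localhost:11434"
LOCAL_MODEL = "llama3.2"            # illustrative small on-device model
CLOUD_MODEL = "gpt-oss:120b-cloud"  # illustrative large cloud model

def pick_route(sensitive: bool, heavy: bool) -> tuple[str, str]:
    """Return (host, model): privacy-sensitive work stays on the local model;
    heavy, non-sensitive work goes to the cloud model."""
    if sensitive or not heavy:
        return LOCAL_HOST, LOCAL_MODEL
    return LOCAL_HOST, CLOUD_MODEL  # same endpoint, cloud-backed model

print(pick_route(sensitive=True, heavy=True))   # stays local
print(pick_route(sensitive=False, heavy=True))  # routed to the cloud model
```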
Pricing
For current pricing information and usage limits, visit ollama.com/pricing.

Use Cases
Development and Testing
Quickly prototype with large models before deploying locally.

Hybrid Workflows
Use local models for privacy-sensitive tasks and cloud models for heavy lifting.

Resource-Constrained Environments
Run on devices without GPUs.

FAQ
Do cloud models support all Ollama features?
Yes, cloud models support the full Ollama API including streaming, embeddings, and multimodal inputs where applicable.
Can I switch between local and cloud models?
Absolutely. You can have both local and cloud models installed and switch between them at any time using the same commands.
What happens if I lose internet connectivity?
Cloud models require an internet connection. If connectivity is lost, local models will continue to work normally.
Are cloud models faster than local models?
It depends on your hardware. Cloud models have network latency but run on powerful infrastructure. Local models on high-end GPUs may be faster for small batches.
Can I use cloud models in production?
Yes, cloud models are designed for production use. Check the pricing page for rate limits and terms of service.
Next Steps
- Model Library: browse all available models
- API Reference: complete API documentation
- Local Installation: set up Ollama locally
- Python SDK: Python integration guide