Overview
Cohere provides enterprise-grade language models specialized for business applications, including powerful chat models, best-in-class embeddings, and reranking capabilities. Access Cohere through Portkey for production-ready NLP. Base URL:https://api.cohere.ai
Supported Features
- ✅ Chat Completions (v2 API)
- ✅ Streaming
- ✅ Embeddings
- ✅ Rerank (via Cohere API)
- ✅ Tool Use (Function Calling)
- ✅ Document Mode (RAG)
- ✅ Citation Mode
- ✅ Batch Embeddings
- ❌ Image Generation
- ❌ Vision
Quick Start
Chat Completions
Streaming
Available Models
Chat Models
| Model | Context | Description | Best For |
|---|---|---|---|
command-r-plus-08-2024 | 128K | Most capable | Complex tasks, RAG |
command-r-08-2024 | 128K | Efficient | General purpose |
command-r-plus | 128K | Previous generation | Legacy apps |
command-r | 128K | Previous generation | Legacy apps |
command | 4K | Legacy model | Simple tasks |
command-light | 4K | Lightweight | Fast responses |
Embedding Models
| Model | Dimensions | Description |
|---|---|---|
embed-english-v3.0 | 1024 | English embeddings |
embed-multilingual-v3.0 | 1024 | 100+ languages |
embed-english-light-v3.0 | 384 | Compact English |
embed-multilingual-light-v3.0 | 384 | Compact multilingual |
embed-english-v2.0 | 4096 | Legacy |
Cohere excels at:
- Enterprise deployments with strong support
- RAG applications with citation support
- Multilingual tasks (100+ languages)
- Semantic search with best-in-class embeddings
- Document grounding for factual responses
Configuration Options
Headers
| Header | Description | Required |
|---|---|---|
Authorization | Cohere API key (Bearer token) | Yes |
Advanced Features
Tool Use (Function Calling)
RAG with Document Grounding
Cohere excels at RAG with built-in citation support:Embeddings
Embedding Input Types
Optimize embeddings for your use case:| Input Type | Use Case |
|---|---|
search_document | Indexing documents for search |
search_query | Search queries |
classification | Text classification |
clustering | Document clustering |
Legacy Completions API
For older command models:Fallback Configuration
Fallback to GPT-4 if Cohere fails:Load Balancing
Balance between Command R+ and Command R:Error Handling
Best Practices
- Use RAG mode - Leverage document grounding for factual accuracy
- Enable citations - Track sources for enterprise use
- Choose right embedding type - Use appropriate input_type for embeddings
- Use Command R+ - For complex tasks requiring reasoning
- Use Command R - For cost-effective general purpose tasks
- Batch embeddings - More efficient than individual requests
- Implement streaming - Better UX for long responses
- Handle tool calls - Multi-step reasoning with function calling
Enterprise Features
- Data privacy: Cohere doesn’t train on customer data
- Regional deployment: Available in multiple regions
- SOC 2 Type II: Enterprise compliance
- Custom deployments: Private cloud options
- SLA support: Enterprise support plans
- Fine-tuning: Custom model training
Pricing
Cohere offers competitive pricing with a free trial:Cohere Pricing
View detailed pricing for all Cohere models
Related Resources
Embeddings Guide
Working with embeddings
RAG Guide
Building RAG applications
Function Calling
Tool use and function calling
Fallbacks
Fallback configurations