AI Model Configurations
AI coding assistants use various large language models (LLMs) with specific configurations to balance performance, cost, and capabilities. This page documents the models and their configurations across different tools.

Primary Models Used
Claude Sonnet 4
Provider: Anthropic
Used By: Cursor, Augment, Claude Code, Amp
Capabilities:
- Advanced reasoning
- Long context (200k+ tokens)
- Tool use and function calling
- Multi-modal (text + images)
GPT-4.1
Provider: OpenAI
Used By: Cursor Agent
Capabilities:
- High-quality code generation
- Structured output
- Function calling
- Vision capabilities
GPT-5
Provider: OpenAI
Used By: Amp (experimental)
Capabilities:
- Enhanced reasoning
- Better context utilization
- Improved code understanding
- Faster inference
o3
Provider: OpenAI
Used By: Amp Oracle
Capabilities:
- Deep reasoning model
- Code reviews
- Architecture planning
- Complex debugging
Model Configurations
Amp Configuration (Claude Sonnet 4)
- Multi-section system prompt
- Ephemeral caching for environment data
- Strict conciseness requirements
Amp Configuration (GPT-5)
- Emphasis on minimal reasoning
- Guardrails for safe operations
- Encrypted reasoning content
- Non-persistent storage
Cursor Agent Configuration
- Explicit knowledge cutoff date
- Multi-modal input support
- Autonomous agent mode
Claude Code Configuration
- Specific model version tracking
- Clear knowledge cutoff
- Minimal configuration overhead
Augment Code Configuration
- Explicit base model attribution
- Dynamic date injection
- Brand identity emphasis
Token Budget Management
Cursor Approach
- Large token budget for complex tasks
- Allows extensive context gathering
- Supports parallel tool operations
Claude Code Approach
- Aggressive token conservation
- Minimal explanations
- Direct, concise responses
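The two budget philosophies above boil down to how much gathered context a tool tolerates before truncating. A minimal sketch, assuming a rough 4-characters-per-token heuristic (real tools use model-specific tokenizers, and the budget values here are illustrative, not any tool's actual limits):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text and code.
    # Real tokenizers (e.g. tiktoken for GPT models) give exact counts.
    return max(1, len(text) // 4)

def within_budget(context: list[str], budget: int) -> bool:
    """Check whether the gathered context still fits the token budget."""
    return sum(estimate_tokens(chunk) for chunk in context) <= budget

# Illustrative budgets: a Cursor-style large budget tolerates extensive
# context gathering; a Claude Code-style conservative budget forces
# earlier truncation and terser responses.
LARGE_BUDGET = 100_000
CONSERVATIVE_BUDGET = 8_000
```

The same helper serves both approaches; only the budget constant changes.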
Model Selection by Task
Common task categories to match to models:
- Code Generation
- Code Review & Planning
- Semantic Search
- Fast Responses

Code Generation
Best Models: GPT-4.1, Claude Sonnet 4
Why:
- Strong code completion
- Pattern recognition
- Syntax accuracy
- Multi-language support
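Task-based model selection can be as simple as a lookup table. The mapping below is a hypothetical example based on the pairings described on this page (reasoning model for reviews and planning, fast model for quick replies); it is not a configuration any tool actually ships with:

```python
# Hypothetical task → model routing table; model IDs are illustrative
# shorthand for the models named above, not exact API identifiers.
MODEL_BY_TASK = {
    "code_generation": "claude-sonnet-4",  # or "gpt-4.1"
    "code_review": "o3",                   # deep reasoning model
    "planning": "o3",
    "fast_response": "gpt-4.1",
}

def pick_model(task: str, default: str = "claude-sonnet-4") -> str:
    """Route a task to a model, falling back to a general-purpose default."""
    return MODEL_BY_TASK.get(task, default)
```

Real tools often layer heuristics (context size, user preference, cost caps) on top of a static table like this.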
Temperature and Sampling
Most tools use default or near-default temperature settings:

| Tool | Temperature | Top P | Max Tokens | Notes |
|---|---|---|---|---|
| Cursor | Default | Default | 4096 | Standard settings |
| Claude Code | Default | Default | Variable | Optimized for conciseness |
| Amp | Default | Default | Variable | Context-dependent |
| Augment | Default | Default | 8192 | Longer responses allowed |
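These parameters are set per request. A hedged sketch of what the payloads might look like; the field names follow the public OpenAI and Anthropic chat APIs, but the values are illustrative, not the tools' actual settings:

```python
# Illustrative request payloads. Omitting temperature/top_p means the
# provider's defaults apply, matching the "Default" cells in the table.
openai_style = {
    "model": "gpt-4.1",
    "max_tokens": 4096,  # Cursor-style cap from the table above
    "messages": [{"role": "user", "content": "Refactor this function."}],
}

anthropic_style = {
    "model": "claude-sonnet-4",
    "max_tokens": 8192,  # Augment-style longer responses
    "messages": [{"role": "user", "content": "Refactor this function."}],
}
```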
Function Calling & Tool Use
Anthropic Format (Claude)
- Parallel tool execution
- Namespaced functions
- Structured parameters
OpenAI Format (GPT)
- Unique call IDs
- JSON string arguments
- Sequential by default
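The key wire-format difference is that Anthropic delivers tool inputs as an already-parsed object, while OpenAI delivers them as a JSON string that must be decoded. A sketch of both shapes and a normalizer (IDs and tool names are illustrative):

```python
import json

# Anthropic-style tool use block: structured `input` object,
# and multiple blocks may appear in one response (parallel execution).
anthropic_tool_call = {
    "type": "tool_use",
    "id": "toolu_01",  # illustrative ID
    "name": "read_file",
    "input": {"path": "src/main.py"},  # already-parsed JSON object
}

# OpenAI-style tool call: unique call ID, arguments as a JSON *string*.
openai_tool_call = {
    "id": "call_01",  # illustrative ID
    "type": "function",
    "function": {
        "name": "read_file",
        "arguments": '{"path": "src/main.py"}',  # must be json.loads-ed
    },
}

def normalize(call: dict) -> tuple[str, dict]:
    """Reduce either format to a (tool_name, parsed_args) pair."""
    if call.get("type") == "tool_use":  # Anthropic format
        return call["name"], call["input"]
    fn = call["function"]               # OpenAI format
    return fn["name"], json.loads(fn["arguments"])
```

Normalizing early like this lets one tool-dispatch loop serve both providers.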
Context Window Optimization
Caching Strategies
Ephemeral Caching (Claude)
Cached content:
- Environment information
- File directory listings
- Repository context
- Static documentation

Benefits:
- Reduced token costs
- Faster response times
- Consistent context
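With the Anthropic API, stable prefix content is marked cacheable via `cache_control: {"type": "ephemeral"}`. A sketch of such a request; the field name matches the public Anthropic prompt-caching API, but the model ID and content are illustrative:

```python
# Sketch of an Anthropic-style request marking stable context as cacheable.
# Only the marked prefix is cached; the user turn below it stays dynamic.
request = {
    "model": "claude-sonnet-4",  # illustrative model ID
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "Environment: linux\nRepo layout:\nsrc/\n  main.py\n",
            "cache_control": {"type": "ephemeral"},  # cached across requests
        }
    ],
    "messages": [{"role": "user", "content": "What does main.py do?"}],
}
```

Subsequent requests that reuse the identical prefix hit the cache, which is where the token-cost and latency savings come from.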
Conversation Pruning
In long sessions, older turns and bulky tool outputs are dropped or summarized so the conversation stays within the model's context window.
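A minimal pruning sketch that keeps the most recent turns within a token budget, assuming a rough 4-characters-per-token heuristic (real tools use exact tokenizers and often summarize rather than drop):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic; real tools use model-specific tokenizers.
    return max(1, len(text) // 4)

def prune(messages: list[dict], budget: int) -> list[dict]:
    """Keep the newest messages that fit within `budget` tokens."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest → oldest
        cost = estimate_tokens(msg["content"])
        if total + cost > budget:
            break  # everything older is dropped
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

Production systems usually preserve the system prompt and summarize the dropped prefix instead of discarding it outright.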
Streaming and Real-Time Updates
Most tools support streaming responses:
- Perceived faster responses
- Progressive rendering
- Early tool execution
- Better user experience
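Consuming a stream is typically a loop over incremental deltas. A provider-agnostic sketch using a simulated token stream in place of a real SDK iterator:

```python
from typing import Iterator

def fake_stream() -> Iterator[str]:
    # Stand-in for an SDK's streaming iterator of response chunks.
    yield from ["def ", "add(a, b):", "\n    ", "return a + b"]

def consume(stream: Iterator[str]) -> str:
    """Accumulate deltas; the full text is only known once the stream ends."""
    parts = []
    for delta in stream:
        parts.append(delta)
        # A real UI would render `delta` here immediately, which is what
        # produces the perceived speed-up and progressive rendering.
    return "".join(parts)
```

Early tool execution works the same way: the client starts acting on a tool call as soon as its delta sequence completes, before the rest of the response arrives.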
Cost Optimization Patterns
- Minimize reasoning
- Parallel execution
- Tool result filtering
- Aggressive conciseness
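Tool result filtering is the most mechanical of these patterns: cap what a tool's output can contribute back to the context. A sketch with an illustrative character limit:

```python
def filter_tool_result(output: str, max_chars: int = 2_000) -> str:
    """Truncate oversized tool output before it re-enters the context."""
    if len(output) <= max_chars:
        return output
    omitted = len(output) - max_chars
    # Tell the model something was cut, so it can re-query if needed.
    return f"{output[:max_chars]}\n[... {omitted} characters truncated ...]"
```

Without a cap like this, a single `cat` of a large file or a verbose test run can consume most of the context window in one turn.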
Future Trends
Extended Context Windows
Models are moving toward:
- 1M+ token context windows
- Better long-range coherence
- Reduced need for pruning
- Full repository context
Specialized Models
There is a trend toward role-specific models:
- Fast models: Quick responses, simple tasks
- Reasoning models: Complex planning, reviews
- Code models: Optimized for programming
- Multimodal models: Code + diagrams + UI
Model configurations change frequently; check your tool's documentation for the latest supported models and parameters.