Model Selection
Automatic selection (default): Scira analyzes your query and selects the optimal model based on:
- Task complexity (reasoning, generation, analysis)
- Response time requirements
- Cost efficiency
- Specialized capabilities needed
Manual selection: Choose a specific model when you need:
- Specific model capabilities (vision, function calling, extended context)
- Consistency across sessions
- Cost optimization
- Provider preferences
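The selection criteria above can be sketched as a simple routing heuristic. This is illustrative only; the feature flags, priority order, and model ids are assumptions, not Scira's actual selection logic:

```typescript
// Illustrative routing sketch: pick a model from coarse query features.
// The flags, priority order, and model ids are assumptions, not Scira's
// real selection logic.
type QueryFeatures = {
  needsVision: boolean;      // image input present
  needsReasoning: boolean;   // multi-step logic, math, deep analysis
  latencySensitive: boolean; // chat-style interaction, fast response needed
};

function selectModel(q: QueryFeatures): string {
  if (q.needsVision) return "gemini-2.5-pro";      // native multimodal
  if (q.needsReasoning) return "gpt-5.2-thinking"; // reasoning variant
  if (q.latencySensitive) return "grok-4-fast";    // low-latency variant
  return "claude-sonnet-4.5";                      // balanced default
}
```

Manual selection simply bypasses a router like this and pins one model id for the session.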
Supported Providers
xAI
Models from X’s AI team with deep X (Twitter) integration
Grok Models
- Grok 3: High-performance reasoning model
- Grok 3 Mini: Faster, cost-effective variant
- Grok 4: Latest flagship with enhanced capabilities
- Grok 4 Fast: Optimized for speed with Grok 4 quality
- Grok 4.1 Fast: Incremental improvements to Grok 4 Fast
- Grok Code: Specialized for code generation and analysis
Key Features
- Native X (Twitter) search integration
- Real-time access to X posts and trends
- xSearch tool support
- XQL query language (Pro)
Best For
- Social media research and analysis
- Real-time information needs
- Code generation with Grok Code
- General reasoning with Grok 4
OpenAI
Industry-leading models from OpenAI
GPT 4 Series
GPT 4.1
- GPT 4.1 Nano: Smallest variant for simple tasks
- GPT 4.1 Mini: Balanced performance and cost
- GPT 4.1 Standard: Full capabilities
Best For
- General-purpose research
- Content generation
- Analysis and summarization
GPT 5 Series
GPT 5
- GPT 5 Nano: Entry-level GPT 5
- GPT 5 Mini: Cost-effective with GPT 5 quality
- GPT 5 Medium: Enhanced capabilities
- GPT 5 Standard: Full GPT 5 performance
- GPT 5.1 Instant: Fastest responses
- GPT 5.1 Thinking: Reasoning-focused variant
- GPT 5.1 Codex: Code generation specialist
- GPT 5.2 Instant: Latest fast variant
- GPT 5.2 Thinking: Enhanced reasoning
- GPT 5.2 Codex: Latest code specialist
Best For
- Advanced reasoning tasks
- Code generation (Codex variants)
- Quick responses (Instant variants)
- Deep analysis (Thinking variants)
o-series Models
- o3: Advanced reasoning model
- o4 mini: Compact reasoning variant
Best For
- Mathematical reasoning
- Complex problem-solving
- Multi-step logic
GPT OSS
- GPT OSS 20B: Open-source 20 billion parameter model
- GPT OSS 120B: Open-source 120 billion parameter model
Best For
- Self-hosted deployments
- Custom fine-tuning
- Research applications
Anthropic
Claude models known for safety and helpfulness
Claude 4.5 Series
- Claude Haiku 4.5: Fast and cost-effective
- Claude Sonnet 4.5: Balanced performance
- Claude Opus 4.5: Most capable Claude 4.5
Best For
- Long-form content generation
- Analysis and research
- Creative writing
- Safe and helpful responses
Claude 4.6
- Claude Opus 4.6: Latest flagship model
Best For
- Advanced reasoning
- Extended context handling
- Complex analysis
Google
Gemini models from Google
Gemini 2.5 Series
- Gemini 2.5 Flash Lite: Fastest variant
- Gemini 2.5 Flash Standard: Balanced Flash
- Gemini 2.5 Pro: Full capabilities
Best For
- Fast responses (Flash variants)
- Multimodal tasks
- General research
Gemini 3 Series
- Gemini 3 Flash: Latest fast model
- Gemini 3 Pro: Latest pro model
Best For
- Cutting-edge performance
- Advanced multimodal
- Long context windows
Alibaba (Qwen)
Qwen models from Alibaba Cloud
Qwen 3 Series
Base Models
- Qwen 3 4B: 4 billion parameters
- Qwen 3 32B: 32 billion parameters
- Qwen 3 235B: 235 billion parameters
- Qwen 3 VL: Vision-language model
- Qwen 3 Max: Maximum performance
- Qwen 3 Next 80B: Next-generation 80B
Coder Models
- Qwen 3 Coder Small: Entry-level code model
- Qwen 3 Coder Standard: Balanced code generation
- Qwen 3 Coder Plus: Enhanced capabilities
- Qwen 3 Coder Next: Latest code model
Best For
- Code generation (Coder variants)
- Multilingual support
- Vision tasks (VL variant)
- Cost-effective deployment (smaller models)
Mistral
Open and commercial models from Mistral AI
Ministral Series
- Ministral 3 3B: 3 billion parameter compact model
- Ministral 3 8B: 8 billion parameter model
- Ministral 3 14B: 14 billion parameter model
Best For
- Efficient local deployment
- Cost-sensitive applications
- Edge computing
Flagship Models
- Mistral Large 3: Latest large model
- Mistral Medium: Balanced performance
- Magistral Small: Compact variant
- Magistral Medium: Medium variant
Best For
- General research
- European language support
- Function calling
Devstral Series
- Devstral 2 Small: Entry-level code model
- Devstral 2 Standard: Standard code generation
Best For
- Code generation
- Developer tools
- Technical documentation
DeepSeek
Advanced models from DeepSeek AI
DeepSeek v3 Series
- DeepSeek v3: Base flagship model
- DeepSeek v3.1 Terminus: Enhanced v3.1
- DeepSeek v3.2: Latest iteration
Best For
- Deep reasoning
- Research tasks
- Technical analysis
DeepSeek R1 Series
- DeepSeek R1: Reasoning-focused model
- DeepSeek R1 0528: May 2025 release
Best For
- Mathematical reasoning
- Multi-step problems
- Logical analysis
Zhipu (GLM)
GLM models from Zhipu AI
GLM Series
- GLM 4.5: Base model
- GLM 4.5 Air: Lightweight variant
- GLM 4.6: Enhanced capabilities
- GLM 4.6V: Vision-enabled
- GLM 4.7: Latest flagship
- GLM 4.7 Flash: Fast variant
Best For
- Chinese language tasks
- Multilingual support
- Vision tasks (V variants)
- Fast responses (Flash/Air)
Cohere
Enterprise-focused models from Cohere
Command Series
- Command A: Advanced reasoning
- Command A Thinking: Reasoning-focused
Best For
- Enterprise applications
- RAG (Retrieval-Augmented Generation)
- Embedding and reranking (used in file query tool)
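Embedding-based reranking (as used by the file query tool) scores each candidate chunk against the query embedding and reorders by similarity. A minimal cosine-similarity sketch with toy vectors; the vectors here are made up, as real embeddings come from the provider's embedding API:

```typescript
// Rerank candidate chunks by cosine similarity to a query embedding.
// The toy 3-d vectors below stand in for real provider embeddings.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function rerank(
  query: number[],
  chunks: { id: string; embedding: number[] }[]
) {
  // Sort a copy, highest similarity first.
  return [...chunks].sort(
    (x, y) => cosine(query, y.embedding) - cosine(query, x.embedding)
  );
}
```

Production systems typically use a dedicated reranker model (such as Cohere's) instead of raw cosine similarity, but the score-then-sort shape is the same.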
Additional Providers
MoonShot (Kimi)
- Kimi K2: Base model with extended context
- Kimi K2.5: Enhanced K2
Best For
- Long context handling (200K+ tokens)
- Document analysis
- Extended conversations
Minimax
- M1 80K: 80K context window
- M2: Next generation
- M2.1: Incremental update
- M2.1 Lightning: Fast variant
Best For
- Long documents
- Fast processing (Lightning)
ByteDance (Seed)
- Seed 1.6: Base model
- Seed 1.6 Flash: Fast variant
- Seed 1.8: Latest version
Best For
- General tasks
- Cost-effective deployment
Arcee
- Trinity Mini: Compact model
- Trinity Large: Large variant
Best For
- Domain-adapted models
- Specialized tasks
Others
- Vercel v0 1.0: Code generation
- Vercel v0 1.5: Enhanced code gen
- Amazon Nova 2 Lite: AWS model
- Xiaomi Mimo V2 Flash: Fast model
- StepFun Step 3.5 Flash: Reasoning model
- Kwaipilot KAT-Coder-Pro V1: Code specialist
Model Capabilities
Function Calling
Most modern models support function calling (tool use), enabling:
- Automatic tool selection
- Structured output generation
- Multi-step workflows
- Parallel function execution
Supported by:
- OpenAI GPT 4.1+, GPT 5+
- Anthropic Claude 4.5+
- xAI Grok 3+
- Google Gemini 2.5+
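The function-calling workflow above follows a common loop: the model returns a structured tool call, the runtime executes it, and the result is fed back to the model. A minimal sketch with a mocked model; the tool names and call shapes are illustrative, not any provider's real API:

```typescript
// Minimal function-calling loop with a mocked model.
// Tool names and call shapes are illustrative, not a provider API.
type ToolCall = { name: string; args: Record<string, unknown> };

const tools: Record<string, (args: any) => string> = {
  get_time: () => new Date().toISOString(),
  add: ({ a, b }: { a: number; b: number }) => String(a + b),
};

// Stand-in for a model that decides which tool to call.
function mockModel(prompt: string): ToolCall {
  return prompt.includes("sum")
    ? { name: "add", args: { a: 2, b: 3 } }
    : { name: "get_time", args: {} };
}

function runTurn(prompt: string): string {
  const call = mockModel(prompt);             // model emits a structured tool call
  const result = tools[call.name](call.args); // runtime executes the tool
  return result;                              // result is fed back for the final answer
}
```

In the AI SDK this loop is handled for you by passing tool definitions to the generation call; the sketch only shows the control flow.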
Vision Capabilities
Multimodal models that process images:
- Qwen 3 VL: Vision-language model
- GLM 4.6V: Vision-enabled GLM
- Google Gemini series: Native multimodal
- OpenAI GPT-4 vision variants: Image understanding
Use Cases
- Image analysis and description
- OCR and text extraction
- Visual question answering
- Chart and diagram interpretation
Extended Context
Models with long context windows (100K+ tokens):
- Kimi K2/K2.5: 200K+ tokens
- Minimax M1 80K: 80K tokens
- Claude 4.5/4.6: 200K tokens
- GPT-4 variants: 128K tokens
- Gemini Pro: 1M+ tokens
Use Cases
- Long document analysis
- Book summarization
- Extended conversations
- Large codebase understanding
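When targeting models with different context windows, a rough token-budget check helps decide whether a document fits or must be chunked. The 4-characters-per-token ratio below is a common approximation, not an exact tokenizer, and the window sizes mirror the list above:

```typescript
// Rough context-window fit check. The chars-per-token ratio is an
// approximation; real tokenizers vary by model and language.
const CONTEXT_WINDOWS: Record<string, number> = {
  "minimax-m1": 80_000,
  "gpt-4-variant": 128_000,
  "claude-4.5": 200_000,
  "gemini-pro": 1_000_000,
};

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // ~4 chars/token for English text
}

function fitsContext(
  model: string,
  text: string,
  reserveForOutput = 4_000 // leave room for the model's response
): boolean {
  const window = CONTEXT_WINDOWS[model] ?? 0;
  return estimateTokens(text) + reserveForOutput <= window;
}
```

A document that fails this check for one model can either be routed to a longer-context model or split into chunks.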
Reasoning Models
Specialized for complex reasoning:
- OpenAI o3, o4 mini: Advanced reasoning
- GPT 5.1/5.2 Thinking: Reasoning variants
- DeepSeek R1: Reasoning-focused
- Cohere Command A Thinking: Enterprise reasoning
- StepFun Step 3.5: Multi-step reasoning
Use Cases
- Mathematical proofs
- Multi-step problems
- Logic puzzles
- Complex analysis
Code Specialists
Optimized for code generation and analysis:
- xAI Grok Code: X-integrated code model
- OpenAI Codex variants: GPT 5.1/5.2 Codex
- Qwen 3 Coder series: All Coder variants
- Mistral Devstral 2: Code generation
- Kwaipilot KAT-Coder-Pro: Code specialist
- Vercel v0: UI code generation
Use Cases
- Code generation
- Bug fixing
- Code review
- Documentation generation
Model Selection by Use Case
General Research
Recommended: GPT 5 Standard, Claude Sonnet 4.5, Grok 4
- Balanced performance and cost
- Strong reasoning capabilities
- Good citation generation
Code Generation
Recommended: Grok Code, GPT 5.2 Codex, Qwen 3 Coder Plus
- Specialized for code tasks
- Best practices awareness
- Multi-language support
Fast Responses
Recommended: GPT 5.1 Instant, Gemini 2.5 Flash, Grok 4 Fast
- Optimized for speed
- Lower latency
- Good for chat interactions
Deep Analysis
Recommended: GPT 5.2 Thinking, Claude Opus 4.6, DeepSeek R1
- Advanced reasoning
- Multi-step analysis
- Complex problem solving
Long Documents
Recommended: Kimi K2.5, Claude Opus 4.6, Gemini 3 Pro
- Extended context windows
- Document understanding
- Cross-reference capabilities
Cost Optimization
Recommended: Ministral 3 8B, Qwen 3 32B, Seed 1.6 Flash
- Open-source or cost-effective
- Self-hosting options
- Good performance/cost ratio
Social Media Research
Recommended: Grok 4, Grok 4 Fast
- Native X integration
- Real-time data access
- Social context understanding
Provider Configuration
Scira uses the Vercel AI SDK for model integration. Supported providers are configured through the AI SDK's provider packages.
Performance Considerations
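A typical AI SDK setup registers each provider package and resolves models by a "provider:model-id" string. This is a sketch assuming the public `@ai-sdk/*` packages; Scira's actual configuration and the exact export names may differ by SDK version:

```typescript
// Sketch of AI SDK provider wiring. Package names follow the public
// @ai-sdk/* packages; exact APIs may vary by SDK version, and Scira's
// real configuration may differ.
import { createProviderRegistry } from "ai";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";
import { xai } from "@ai-sdk/xai";
import { google } from "@ai-sdk/google";

export const registry = createProviderRegistry({
  openai,
  anthropic,
  xai,
  google,
});

// Models are then addressed as "provider:model-id", e.g.:
// const model = registry.languageModel("xai:grok-4-fast");
```

Each provider package reads its API key from the conventional environment variable (for example `OPENAI_API_KEY`), so adding a provider is usually a one-line registry change plus a key.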
Latency: Flash/Instant/Fast variants offer lower latency
- Gemini Flash: ~200-500ms
- GPT 5.1 Instant: ~300-600ms
- Grok 4 Fast: ~400-700ms
Throughput: smaller models support higher request volume
- Ministral 3B-14B: High throughput
- Qwen 3 4B-32B: Scalable
- Standard variants: Moderate throughput
Cost: pricing varies widely across providers and tiers
- Open models (Qwen, Mistral): Self-host cost only
- Mini/Small variants: Lower API costs
- Flagship models: Premium pricing
Next Steps
Search Modes
Learn how different modes use different models
Tools
Explore tools that leverage these models
