Why Multi-Model?
Not all tasks require frontier model capabilities. By routing intelligently, you can:
- Reduce costs by roughly 50% while maintaining quality where it matters
- Increase speed by using fast models for mechanical work
- Cross-validate important decisions across multiple models
- Leverage strengths of different model architectures
Recommended Models
Frontier Models (Deep Reasoning)
| Model | Strengths | Best Used For |
|---|---|---|
| Claude Opus 4.6 | Deep reasoning, code quality, nuanced analysis | Coding, architecture, verification |
| Gemini 3.1 Pro | Broad knowledge, fast synthesis, strong planning | General work, research, planning |
| GPT-5.3 | Alternative perspective, creative tasks | Trilateral tiebreaker, creative work |
Fast Models (Mechanical Work)
| Model | Strengths | Best Used For |
|---|---|---|
| Gemini 3 Flash | Speed, low cost | Session management (/start, /end), quick lookups |
Cost Considerations
Athena is free and open source. You only pay for your AI subscription.
Recommended plans:
- Claude Pro / Google AI Pro: ~$20/mo (full access to frontier models)
- Claude Max / Google AI Ultra: $200–250/mo (extended limits for power users)
Why Invest in Frontier Models?
Athena's protocols (governance, reasoning depth, structured workflows) are designed for models that can follow complex multi-step instructions. Smaller/free models may struggle to follow them consistently. This is a long-term investment, not a cost. Frontier models dramatically increase your output quality and consistency.
The Routing Table
Route tasks based on complexity and risk:
| Task Type | Recommended Tier | Why |
|---|---|---|
| Session Management (/start, /end, /save) | ⚡ Fast (Gemini Flash) | Mechanical execution, low reasoning needed |
| Coding & Implementation | 🔥 Frontier (Claude Opus, Gemini Pro) | Code quality scales directly with model capability |
| Planning & Architecture | 🔥 Frontier | Design decisions compound, so invest your best reasoning here |
| General Chat & Q&A | 🧠 Strong (Gemini Pro) | Good enough for most queries |
| Research & Deep Analysis | 🔥 Frontier | Synthesis quality degrades with weaker models |
| Creative & Brainstorming | 🧠 Strong or 🔥 Frontier | Use Strong for volume, Frontier for refinement |
| Verification & Code Review | 🔥 Frontier (different model) | Use a different model than the author for a fresh perspective |
| Quick Lookups & Formatting | ⚡ Fast | Don't waste Frontier tokens on simple tasks |
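The routing table above can be expressed as a simple lookup. The following is a minimal sketch, not part of Athena itself: the `Tier` enum, the task-type keys, and the `route` function are hypothetical names chosen for illustration, and the fallback to Strong follows the "Default to Strong" guidance later on this page.

```python
from enum import Enum


class Tier(Enum):
    FAST = "fast"
    STRONG = "strong"
    FRONTIER = "frontier"


# Task types and tiers copied from the routing table.
# The dictionary keys are hypothetical identifiers for this sketch.
ROUTING_TABLE = {
    "session_management": Tier.FAST,
    "coding": Tier.FRONTIER,
    "planning": Tier.FRONTIER,
    "general_chat": Tier.STRONG,
    "research": Tier.FRONTIER,
    "creative": Tier.STRONG,        # escalate to Frontier for refinement
    "verification": Tier.FRONTIER,  # ideally a different model than the author
    "quick_lookup": Tier.FAST,
}


def route(task_type: str, default: Tier = Tier.STRONG) -> Tier:
    """Return the recommended tier; unknown task types fall back to `default`."""
    return ROUTING_TABLE.get(task_type, default)
```

For example, `route("coding")` returns `Tier.FRONTIER`, while a task type that is not in the table falls back to `Tier.STRONG`.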
The Trilateral Feedback Loop
When two models disagree on a significant decision, bring in a third as a tiebreaker.
When to Trigger
- Architecture decisions: Choices with long-term consequences that are expensive to reverse
- Risk assessments: When models disagree on severity or probability
- Strategy choices: Both options seem equally valid but lead to different outcomes
- High-stakes decisions: Any decision where the cost of being wrong is high
When NOT to Trigger
- Style preferences (just pick one)
- Low-stakes choices (not worth the tokens)
- When one model's answer is clearly more grounded
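The trigger and skip criteria above can be combined into a single check. This is a rough sketch under stated assumptions: the `Disagreement` dataclass, the category labels, and `should_trigger_trilateral` are hypothetical names, and judging whether an answer is "clearly more grounded" is left to the caller.

```python
from dataclasses import dataclass

# Decision categories from "When to Trigger" (labels are hypothetical).
TRIGGER_KINDS = {"architecture", "risk_assessment", "strategy", "high_stakes"}
# Categories from "When NOT to Trigger".
SKIP_KINDS = {"style", "low_stakes"}


@dataclass
class Disagreement:
    kind: str                                  # one of the labels above
    models_disagree: bool                      # did the first two models differ?
    one_answer_clearly_grounded: bool = False  # caller's judgment


def should_trigger_trilateral(d: Disagreement) -> bool:
    """Return True only for significant, unresolved disagreements."""
    if not d.models_disagree:
        return False
    if d.kind in SKIP_KINDS or d.one_answer_clearly_grounded:
        return False
    return d.kind in TRIGGER_KINDS
```

Keeping the skip conditions ahead of the trigger check means a third model is only consulted when the first two genuinely disagree and neither answer stands out.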
Cost Optimization Strategy
Key insight: Most of your session is NOT frontier-level work.
| Session Phase | % of Tokens | Model Tier | Cost Impact |
|---|---|---|---|
| /start boot | ~5% | ⚡ Fast | Minimal |
| Exploration & chat | ~40% | 🧠 Strong | Moderate |
| Core reasoning & coding | ~40% | 🔥 Frontier | Highest |
| /end shutdown | ~5% | ⚡ Fast | Minimal |
| Verification | ~10% | 🔥 Frontier (alt) | Moderate |
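To see how the phases add up per tier, the percentages in the table can be summed. This is an illustrative sketch: the `PHASES` list restates the approximate figures from the table, and `share_by_tier` is a hypothetical helper.

```python
from collections import defaultdict

# (phase, approximate % of session tokens, tier), taken from the table.
PHASES = [
    ("/start boot", 5, "Fast"),
    ("Exploration & chat", 40, "Strong"),
    ("Core reasoning & coding", 40, "Frontier"),
    ("/end shutdown", 5, "Fast"),
    ("Verification", 10, "Frontier"),
]


def share_by_tier(phases):
    """Sum the token percentages for each model tier."""
    totals = defaultdict(int)
    for _phase, percent, tier in phases:
        totals[tier] += percent
    return dict(totals)
```

With the figures above, `share_by_tier(PHASES)` gives 10% Fast, 40% Strong, and 50% Frontier, so about half of a session's tokens can run on cheaper tiers.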
Model Switching in Practice
In Multi-Model IDEs
- Start with a Fast model: Use Gemini Flash or similar for /start boot scripts and session initialization.
- Switch to Frontier for complex work: When you hit coding, architecture, or deep analysis tasks, switch to Claude Opus or Gemini 3.1 Pro.
- Drop back to Strong/Fast for routine tasks: For formatting, file operations, and simple Q&A, use lower-tier models.
Cross-IDE Validation
For the trilateral loop, use different IDEs entirely.
Anti-Patterns
| ❌ Don't | ✅ Do Instead |
|---|---|
| Use Frontier for /start and /end | Use Fast, since it's mechanical work |
| Use Fast for architecture decisions | Use Frontier, because design compounds |
| Use one model for everything | Route by task type |
| Skip verification entirely | Use a different model to review critical code |
| Run trilateral loop on every question | Reserve it for high-stakes disagreements |
Quick Reference Card
Example Session Workflow
Session Start (Fast Model)
Planning Phase (Frontier Model)
Platform-Specific Tips
Antigravity / Multi-Model IDEs
Most modern agentic IDEs let you switch models mid-session via a dropdown or command.
Claude Code
Use .clauderc to define model presets.
ChatGPT / OpenAI
Switch models via the --model flag or the web interface.
Cost Savings Example
Scenario: 8-hour work day, 5 sessions
Without routing (all Frontier):
- 5 sessions × 200K tokens avg = 1M tokens/day
- Cost: ~$30/day (estimate)
With routing:
- /start + /end: 10 runs (5 sessions × 2) × 3K tokens = 30K (Fast)
- Routine work: 400K tokens (Strong)
- Core work: 400K tokens (Frontier)
- Verification: 100K tokens (Frontier alt)
- Total Frontier: 500K tokens/day
- Cost savings: ~50% while maintaining quality
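The arithmetic behind the scenario can be checked with a short script. This is a sketch under explicit assumptions: the Frontier rate of $30 per 1M tokens is derived from the ~$30/day estimate above, while the relative prices of the Strong and Fast tiers (`strong_ratio`, `fast_ratio`) are placeholders that do not come from this page and should be replaced with your own figures.

```python
# Token volumes from the scenario.
BASELINE_TOKENS = 5 * 200_000            # all Frontier
ROUTED_TOKENS = {
    "fast": 10 * 3_000,                  # /start + /end runs
    "strong": 400_000,                   # routine work
    "frontier": 400_000 + 100_000,       # core work + verification
}

# ~$30/day for 1M tokens implies roughly $30 per 1M Frontier tokens.
FRONTIER_RATE = 30.0  # USD per 1M tokens (derived from the estimate)


def daily_cost(tokens_by_tier, rate_by_tier):
    """Total daily cost in USD given per-tier token counts and rates."""
    return sum(
        tokens / 1_000_000 * rate_by_tier[tier]
        for tier, tokens in tokens_by_tier.items()
    )


def savings(strong_ratio, fast_ratio):
    """Fractional savings versus the all-Frontier baseline.

    strong_ratio and fast_ratio are ASSUMED prices relative to the
    Frontier rate (0.25 means a quarter of the Frontier price).
    """
    rates = {
        "frontier": FRONTIER_RATE,
        "strong": FRONTIER_RATE * strong_ratio,
        "fast": FRONTIER_RATE * fast_ratio,
    }
    baseline = BASELINE_TOKENS / 1_000_000 * FRONTIER_RATE
    return 1 - daily_cost(ROUTED_TOKENS, rates) / baseline
```

Because Frontier still handles 500K of the tokens, the savings in this model approach 50% only as the cheaper tiers approach zero cost. With a Strong tier priced at a quarter of Frontier, for instance, the result lands closer to 40%. The ~50% figure is therefore best read as an upper bound for this scenario.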
Best Practices
- Default to Strong: Use Strong models (Gemini Pro) for general work. Only escalate to Frontier when needed.
- Never Fast for architecture: Design decisions compound. Always use Frontier models for planning and architecture.
- Fresh eyes for review: Have a different model review code than the one that wrote it.
- Track your patterns: Monitor which tasks genuinely benefit from Frontier vs Strong. Adjust routing over time.
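The "fresh eyes" rule is easy to enforce mechanically. A minimal sketch, assuming you keep a list of available model names; `pick_reviewer` is a hypothetical helper, not an Athena function.

```python
def pick_reviewer(author_model: str, candidates: list[str]) -> str:
    """Return the first candidate that is not the model that wrote the code."""
    for model in candidates:
        if model != author_model:
            return model
    raise ValueError("no alternative model available for review")


# Model names taken from the Recommended Models table.
FRONTIER_MODELS = ["Claude Opus 4.6", "Gemini 3.1 Pro", "GPT-5.3"]
reviewer = pick_reviewer("Claude Opus 4.6", FRONTIER_MODELS)
```

Here the code written by Claude Opus 4.6 would be reviewed by Gemini 3.1 Pro, the next model in the list.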
Next Steps
- Semantic Search: Learn how Athena finds context across your workspace
- Best Practices: Operational discipline for running Athena sustainably