Overview
Gemini powers the “brain” of Agentic AI, analyzing user intent to determine if spoken requests are actionable commands or casual conversation. This enables the system to intelligently route commands to ClawdBot for execution.
Role in the System
Gemini has two uses in Agentic AI:
Intent Analysis
Analyzes transcripts to determine if user wants to DO something
Model: gemini-3-flash-preview
Location: src/agenticai/core/conversation_brain.py:76
Alternative Voice API
Can replace OpenAI Realtime for voice conversations (optional)
Model: gemini-2.5-flash-native-audio-latest
Location: src/agenticai/gemini/realtime_handler.py:13
Primary Use: Intent Analysis
The Conversation Brain uses Gemini to classify user intent.
How It Works
Location: src/agenticai/core/conversation_brain.py:310
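The classification step can be pictured as a prompt that asks for a one-word YES or NO answer, plus a small parser for the reply. The following is a minimal sketch, not the code in conversation_brain.py: the function names `build_intent_prompt` and `parse_intent_response` and the prompt wording are assumptions made for illustration.

```python
from typing import Sequence


def build_intent_prompt(transcript: str, context: Sequence[str] = ()) -> str:
    """Build a YES/NO classification prompt (wording is hypothetical)."""
    lines = [
        "Decide whether the user wants the assistant to DO something.",
        "Answer with exactly one word: YES or NO.",
        "",
    ]
    if context:
        # Optional recent turns help with ambiguous requests such as "play it".
        lines.append("Recent conversation:")
        lines.extend(f"- {turn}" for turn in context)
        lines.append("")
    lines.append(f'User said: "{transcript}"')
    return "\n".join(lines)


def parse_intent_response(response_text: str) -> bool:
    """Return True for an actionable command (YES), False otherwise."""
    return response_text.strip().upper().startswith("YES")


prompt = build_intent_prompt("Open YouTube")
print(prompt)
print(parse_intent_response("YES"))  # prints True
```

Treating anything other than a leading YES as conversation is a conservative choice here: an unexpected reply falls back to not executing a command.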
Getting a Gemini API Key
Go to Google AI Studio
Create API key
Click Create API key or Get API key
The key will be shown immediately (starts with AIza...)
Configuration
Add your Gemini API key to .env:
.env
config.yaml:
config.yaml
Intent Classification Examples
Here are examples of how Gemini classifies different requests:
Actionable Commands (YES)
| User Says | Intent | Reason |
|---|---|---|
| “Open YouTube” | Action | Command to open app |
| “Play Shape of You on Spotify” | Action | Command to play music |
| “Send hi to John on WhatsApp” | Action | Command to send message |
| “Check my emails” | Action | Command to check emails |
| “Search for nearby restaurants” | Action | Command to search web |
| “What’s the weather today?” | Action | Query requiring external data |
| “Set a timer for 5 minutes” | Action | Command to set timer |
Conversational (NO)
| User Says | Intent | Reason |
|---|---|---|
| “Hello” | Conversation | Greeting |
| “Thanks” | Conversation | Acknowledgment |
| “How are you?” | Conversation | Small talk |
| “That’s great” | Conversation | Reaction |
| “Tell me a joke” | Conversation | Casual request (no external action) |
| “What can you do?” | Conversation | Question about capabilities |
Optimization: Quick Heuristics
The brain uses heuristics to skip Gemini calls for obvious cases.
Location: src/agenticai/core/conversation_brain.py:325
- Saves 300-800ms per request
- Reduces Gemini API costs
- Improves response time for common commands
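As an illustration of the idea, a heuristic pre-check can return an answer for obvious inputs and defer everything else to Gemini. This is a sketch under assumptions: the name `quick_intent` and both word lists are hypothetical, drawn from the example tables on this page rather than from the project's actual keyword list.

```python
import re
from typing import Optional

# Hypothetical lists based on the examples above; the real ones may differ.
ACTION_KEYWORDS = ("open", "play", "send", "check", "search", "set")
CONVERSATIONAL_PHRASES = ("hello", "thanks", "how are you", "that's great")


def quick_intent(transcript: str) -> Optional[bool]:
    """Return True (action), False (conversation), or None (ask Gemini)."""
    text = transcript.lower().replace("’", "'")
    text = re.sub(r"[^\w\s']", "", text).strip()  # drop punctuation
    if not text:
        return False  # empty transcript: nothing to route
    if text in CONVERSATIONAL_PHRASES:
        return False  # exact phrase match
    if text.split()[0] in ACTION_KEYWORDS:
        return True  # leading action verb
    return None  # ambiguous: fall through to the Gemini call


for phrase in ("Open YouTube", "How are you?", "Tell me a joke"):
    print(phrase, "->", quick_intent(phrase))
```

Returning `None` for anything that is not an obvious match keeps the heuristic conservative: it only short-circuits when the answer is cheap and unambiguous.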
API Reference
ConversationBrain
Location: src/agenticai/core/conversation_brain.py:76
Usage Example
Alternative: Gemini Live API
Gemini also offers a Live API for voice conversations (alternative to OpenAI Realtime):
Pros
- Free tier available - More generous than OpenAI
- Native audio - No separate transcription needed
- Multiple voices - Including “Zephyr”, “Puck”, etc.
Cons
- Less accurate transcription - Especially for proper nouns
- Higher latency - ~500-1000ms vs OpenAI’s 200-500ms
- Limited voice options - Fewer voices than OpenAI
Configuration
To use Gemini Live instead of OpenAI Realtime:
config.yaml
Implementation Reference
Location: src/agenticai/gemini/realtime_handler.py:13
Conversation Memory
The brain maintains conversation context.
Location: src/agenticai/core/conversation_brain.py:29
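One common way to keep such context is a bounded buffer of recent turns that can be rendered into the classification prompt. The sketch below assumes this design; `Turn`, `ConversationMemory`, and their methods are hypothetical names, not the project's API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Turn:
    role: str  # e.g. "user" or "assistant"
    text: str


class ConversationMemory:
    """Keeps only the most recent turns (hypothetical sketch)."""

    def __init__(self, max_turns: int = 10) -> None:
        # deque(maxlen=...) silently drops the oldest turn when full.
        self._turns: Deque[Turn] = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self._turns.append(Turn(role, text))

    def recent(self, n: int = 3) -> List[Turn]:
        if n <= 0:
            return []
        return list(self._turns)[-n:]

    def as_prompt_context(self, n: int = 3) -> str:
        return "\n".join(f"{t.role}: {t.text}" for t in self.recent(n))


memory = ConversationMemory(max_turns=10)
memory.add("user", "I love that Ed Sheeran song")
memory.add("assistant", "Shape of You is a great one")
# A follow-up like "Play it" is ambiguous alone but clear with this context.
print(memory.as_prompt_context())
```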
Context in Intent Analysis
Recent conversation context helps Gemini classify ambiguous requests.
Cost and Pricing
Gemini API Pricing
Text Models (for intent analysis):
| Model | Input | Output |
|---|---|---|
| gemini-3-flash-preview | Free (15 RPM) | Free (15 RPM) |
| gemini-2.0-flash | $0.075 / 1M tokens | $0.30 / 1M tokens |
| gemini-1.5-flash | $0.075 / 1M tokens | $0.30 / 1M tokens |
Voice Models (for voice conversations):
| Model | Input Audio | Output Audio |
|---|---|---|
| gemini-2.5-flash-native-audio | Free (15 RPM) | Free (15 RPM) |
Cost Comparison
Intent Analysis (per 1000 calls):
- Gemini: ~$0.01 (free tier: $0)
- Prompt size: ~100 tokens per call
- Response size: ~5 tokens (“YES” or “NO”)
- Gemini Live: Free (15 RPM limit)
- OpenAI Realtime: $0.30/min
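The intent-analysis figure can be checked with simple arithmetic from the token sizes and the per-token rates listed in the pricing table. The sketch below uses the $0.075 and $0.30 per 1M token rates from that table as inputs; the function name is hypothetical, and actual pricing may differ from what the table shows.

```python
def intent_cost_usd(
    calls: int,
    prompt_tokens: int,
    response_tokens: int,
    input_usd_per_million: float,
    output_usd_per_million: float,
) -> float:
    """Estimate total cost for a batch of intent-analysis calls."""
    input_cost = calls * prompt_tokens * input_usd_per_million / 1_000_000
    output_cost = calls * response_tokens * output_usd_per_million / 1_000_000
    return input_cost + output_cost


# 1000 calls, ~100 prompt tokens and ~5 response tokens each.
cost = intent_cost_usd(1000, 100, 5, 0.075, 0.30)
print(f"${cost:.4f}")  # prints $0.0090
```

At these rates, input accounts for $0.0075 and output for $0.0015, so 1000 calls cost just under one cent.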
Troubleshooting
API key invalid
Verify key format
Gemini API keys start with AIza.
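A quick local check of the prefix can catch a mis-pasted key before any API call is made. This is a minimal sketch: the helper name and the `GEMINI_API_KEY` environment variable name are assumptions, and passing the check does not prove that Google accepts the key.

```python
import os


def looks_like_gemini_key(key: str) -> bool:
    """Check only the documented prefix; this does not validate the key."""
    key = key.strip()
    return key.startswith("AIza") and len(key) > len("AIza") and " " not in key


# Assumed variable name; use whatever name your .env actually defines.
api_key = os.environ.get("GEMINI_API_KEY", "")
if looks_like_gemini_key(api_key):
    print("Key format looks plausible")
else:
    print("Key is missing or does not start with AIza")
```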
Test authentication
Intent analysis too slow
Check heuristics
Verify quick heuristics are working:
Should see:
Use faster model
Switch to the fastest Gemini model:
config.yaml
Commands not detected
Check Gemini response
Add more action keywords
Expand keyword list for instant detection:
Location: src/agenticai/core/conversation_brain.py:339
Rate limit exceeded
Check free tier limits
Gemini free tier:
- 15 requests per minute (RPM)
- 1500 requests per day (RPD)
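To stay under the per-minute limit, calls can be gated by a client-side limiter. The following is a sketch of a sliding-window limiter, offered as one possible approach rather than something the project is known to use; the class name and the injectable `clock` parameter are assumptions, and it does not track the daily limit.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any period_s window."""

    def __init__(
        self,
        max_calls: int = 15,
        period_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_calls = max_calls
        self._period_s = period_s
        self._clock = clock
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window."""
        now = self._clock()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._period_s:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_calls=15, period_s=60.0)
if limiter.try_acquire():
    print("OK to call Gemini")
else:
    print("Rate limit reached; fall back to heuristics or wait")
```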
Upgrade to paid tier
For higher limits, enable billing:
- Go to console.cloud.google.com
- Enable billing for your project
- Limits increase to 1000 RPM
Performance
Latency Breakdown
| Operation | Typical Latency |
|---|---|
| Quick heuristic (keyword match) | < 1ms |
| Quick heuristic (phrase match) | < 1ms |
| Gemini API call | 300-800ms |
| Average (with heuristics) | ~100ms |
Optimization Results
With heuristics enabled:
- 60-70% of requests skip Gemini entirely
- ~500ms saved on average for each request that skips Gemini
- Lower API costs (fewer Gemini calls)
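The average latency follows from a weighted mix of the fast heuristic path and the Gemini path. The sketch below works through that arithmetic using the figures quoted on this page (60-70% skip rate, 300-800ms per Gemini call, about 1ms per heuristic); the function name is hypothetical, and real averages depend on the actual traffic mix.

```python
def expected_latency_ms(
    skip_fraction: float, gemini_ms: float, heuristic_ms: float = 1.0
) -> float:
    """Weighted average latency across heuristic hits and Gemini calls."""
    if not 0.0 <= skip_fraction <= 1.0:
        raise ValueError("skip_fraction must be between 0 and 1")
    return skip_fraction * heuristic_ms + (1.0 - skip_fraction) * gemini_ms


best = expected_latency_ms(0.70, 300.0)   # most requests skip, fast Gemini
worst = expected_latency_ms(0.60, 800.0)  # fewer skips, slow Gemini
print(f"{best:.1f} ms to {worst:.1f} ms")  # prints 90.7 ms to 320.6 ms
```

Under these assumptions the ~100ms average in the table corresponds to the fast end of the range.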
Next Steps
Conversation Brain
Deep dive into intent analysis
OpenClaw Gateway
Learn how commands are executed
OpenAI Realtime
Compare voice API options
Architecture
Understand the full system flow