The ConversationBrain class uses Gemini to understand user intent, extract actionable commands, and maintain conversation context throughout a call.
Overview
ConversationBrain provides:
- Real-time intent analysis from conversation transcripts
- Command detection and parameter extraction
- Conversation memory and context management
- Integration with ClawdBot agent for command execution
- Intelligent buffering of word-by-word transcripts into complete utterances
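The buffering behavior in the last bullet can be modeled as a simple fragment accumulator. This is an illustrative sketch of the idea, not the class's internal implementation:

```python
class TranscriptBuffer:
    """Accumulates word-by-word fragments until a turn is flushed."""

    def __init__(self):
        self._fragments: list[str] = []

    def add(self, fragment: str) -> None:
        # Fragments arrive incrementally and are concatenated as-is.
        self._fragments.append(fragment)

    def flush(self) -> str:
        # Join the buffered fragments into one complete utterance
        # and reset the buffer for the next turn.
        utterance = "".join(self._fragments).strip()
        self._fragments.clear()
        return utterance

buf = TranscriptBuffer()
buf.add("Please send ")
buf.add("an email ")
buf.add("to John")
print(buf.flush())  # Please send an email to John
```

ConversationBrain keeps one such buffer per speaker, which is why the `add_*_transcript` and `flush_*_turn` methods below come in user/assistant pairs.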
Constructor
```python
from agenticai.core.conversation_brain import ConversationBrain

brain = ConversationBrain(
    api_key="your-gemini-api-key",
    model="gemini-3-flash-preview",
    telegram_chat_id="123456789",
    call_id="call-123"
)
```
api_key
str
Gemini API key for intent understanding and analysis.

model
str
default:"gemini-3-flash-preview"
Gemini model to use for intent analysis.

telegram_client
TelegramDirectClient
default:"None"
Legacy parameter; not used in the current implementation.

telegram_chat_id
str
Telegram chat ID for sending executable commands to the ClawdBot agent.

call_id
str
Unique identifier for the call session.
Methods
set_callbacks
```python
def set_callbacks(
    self,
    on_command: Callable[[str, dict], Awaitable[None]] | None = None,
    on_clawdbot_response: Callable[[str], Awaitable[None]] | None = None,
)
```
Registers event callbacks for command detection and ClawdBot responses.
on_command
Async callback invoked when a command is detected. Receives (action: str, command: dict).

on_clawdbot_response
Async callback invoked with ClawdBot's response text so it can be spoken back to the user.
Example:
```python
async def handle_command(action: str, command: dict):
    print(f"Command detected: {action}")
    print(f"Parameters: {command}")

async def speak_response(text: str):
    print(f"ClawdBot says: {text}")
    # Send to voice synthesis

brain.set_callbacks(
    on_command=handle_command,
    on_clawdbot_response=speak_response
)
```
add_assistant_transcript
```python
def add_assistant_transcript(self, text: str)
```
Adds an assistant transcript fragment to the buffer. Gemini sends incremental fragments, which are concatenated until the turn is flushed.
Transcript fragment from the assistant.
Example:
```python
brain.add_assistant_transcript("Hello, ")
brain.add_assistant_transcript("how can I help you?")
```
add_user_transcript
```python
def add_user_transcript(self, text: str)
```
Adds a user transcript fragment to the buffer. Fragments are accumulated until a complete turn is detected.
Transcript fragment from the user.
Example:
```python
brain.add_user_transcript("Please send ")
brain.add_user_transcript("an email ")
brain.add_user_transcript("to John")
```
flush_assistant_turn
```python
async def flush_assistant_turn(self)
```
Flushes the buffered assistant transcript as a complete conversation turn. Should be called when the assistant finishes speaking.
Example:
```python
# After assistant finishes speaking
await brain.flush_assistant_turn()
```
flush_user_turn
```python
async def flush_user_turn(self)
```
Flushes the buffered user transcript as a complete turn and analyzes intent. If the request is actionable, it’s sent to ClawdBot for execution.
Example:
```python
# After user finishes speaking
await brain.flush_user_turn()
```
get_memory_summary
```python
def get_memory_summary(self) -> str
```
Returns a formatted summary of the conversation memory.
Multi-line string containing call ID, turn count, extracted information, and recent conversation history.
Example:
```python
summary = brain.get_memory_summary()
print(summary)
# Output:
# Call ID: call-123
# Total turns: 8
# Recent conversation:
# User: Can you send an email to John?
# Assistant: I'll send that email to John now.
# ...
```
get_extracted_info
```python
def get_extracted_info(self) -> dict
```
Returns information extracted from the conversation (e.g., names, preferences, context).
Dictionary of extracted information from the conversation.
Example:
```python
info = brain.get_extracted_info()
print(info)
# Output: {"recipient": "John", "subject": "Meeting tomorrow"}
```
send_call_summary
```python
def send_call_summary(self, duration: float)
```
Logs a summary of the call including duration and detected commands.
Call duration in seconds.
Example:
```python
brain.send_call_summary(duration=125.5)
# Logs: Call ended (126s) - Commands: ['send_message: Hello → John', 'search_web: restaurants']
```
Data Classes
ConversationTurn
Represents a complete conversation turn:
```python
@dataclass
class ConversationTurn:
    speaker: str          # "user" or "assistant"
    text: str             # Complete utterance
    timestamp: datetime   # When the turn occurred
    intent: str | None    # Detected intent (e.g., "send_message", "search_web")
    command: dict | None  # Parsed command with parameters
```
ConversationMemory
Manages conversation history:
```python
@dataclass
class ConversationMemory:
    call_id: str                   # Call identifier
    turns: list[ConversationTurn]  # All conversation turns
    context: dict                  # Persistent context
    extracted_info: dict           # Extracted information

    def add_turn(self, speaker: str, text: str,
                 intent: str | None = None,
                 command: dict | None = None) -> ConversationTurn
    def get_recent_context(self, max_turns: int = 10) -> str
    def to_summary(self) -> str
```
Intent Analysis
The brain uses intelligent heuristics and LLM-based analysis to determine if user requests are actionable:
Quick Heuristics
Non-actionable phrases (skips LLM call for efficiency):
- Greetings: “hi”, “hello”, “hey”
- Acknowledgments: “okay”, “thanks”, “yes”, “no”
- Short responses: fewer than 3 characters
Action keywords (immediately actionable):
- Command verbs: “open”, “play”, “search”, “send”, “call”
- Service names: “email”, “youtube”, “spotify”, “google”
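The two heuristic passes above can be sketched as a small pre-classifier. The function name and word lists here are assumptions for illustration; the library's actual lists may differ:

```python
# Illustrative word lists, mirroring the categories described above.
NON_ACTIONABLE = {"hi", "hello", "hey", "okay", "thanks", "yes", "no"}
ACTION_KEYWORDS = {"open", "play", "search", "send", "call",
                   "email", "youtube", "spotify", "google"}

def quick_classify(text: str) -> str:
    """Return 'skip', 'actionable', or 'ambiguous' (needs LLM analysis)."""
    normalized = text.strip().lower()
    # Greetings, acknowledgments, and very short responses skip the LLM call.
    if len(normalized) < 3 or normalized in NON_ACTIONABLE:
        return "skip"
    # Command verbs and service names are immediately actionable.
    if set(normalized.split()) & ACTION_KEYWORDS:
        return "actionable"
    # Everything else falls through to Gemini for intent analysis.
    return "ambiguous"

print(quick_classify("thanks"))              # skip
print(quick_classify("play some jazz"))      # actionable
print(quick_classify("what do you think?"))  # ambiguous
```

Only the "ambiguous" case incurs an LLM round trip, which keeps latency low for the common filler turns in a voice call.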
LLM Analysis
For ambiguous requests, the brain uses Gemini to classify intent:
```python
# Returns: (intent, command_dict, is_actionable)
intent, command, actionable = await brain._analyze_intent(
    "Can you play some music?"
)
# intent = "action"
# command = {"original_request": "Can you play some music?"}
# actionable = True
```
Complete Example
```python
import asyncio

from agenticai.core.conversation_brain import ConversationBrain

async def main():
    # Initialize brain
    brain = ConversationBrain(
        api_key="your-api-key",
        telegram_chat_id="123456789",
        call_id="demo-call"
    )

    # Set up callbacks
    async def on_command(action: str, cmd: dict):
        print(f"Executing: {action}")

    async def on_response(text: str):
        print(f"Speaking: {text}")

    brain.set_callbacks(
        on_command=on_command,
        on_clawdbot_response=on_response
    )

    # Simulate conversation
    brain.add_user_transcript("Send ")
    brain.add_user_transcript("an email ")
    brain.add_user_transcript("to Sarah")

    # Flush when user stops speaking;
    # analyzes intent and sends to ClawdBot if actionable
    await brain.flush_user_turn()

    # Get conversation summary
    summary = brain.get_memory_summary()
    print(summary)

asyncio.run(main())
```