System architecture
Viber’s architecture consists of four main components that work together to enable voice-driven development:
Component flow
- User Device: Captures voice input and displays the real-time UI with live preview
- ElevenLabs Voice Agent: Manages the conversation loop, handles tool calls, and provides narrated status updates
- Gemini Code Agent: Performs architecture analysis and code implementation with context-aware generation
- Daytona Sandbox: Executes the generated Vite application and serves preview URLs
Data flow
Voice agent processing
The ElevenLabs voice agent interprets the user's intent and decides whether to:
- Call the vibe_build tool to generate/edit code
- Call the navigate_ui tool to change the UI panel
- Respond conversationally
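The routing decision above can be sketched as a small client-side dispatcher. This is a minimal illustration, not Viber's actual code: the ToolCall shape, the handler signatures, and the status strings are assumptions, and only the tool names vibe_build and navigate_ui come from this page.

```typescript
// Hypothetical shape of a tool call emitted by the voice agent (assumption).
type ToolCall =
  | { name: "vibe_build"; args: { prompt: string } }
  | { name: "navigate_ui"; args: { panel: string } };

// Handlers run in the browser and return a short status string that the
// agent can narrate. Long-running work would be started in the background.
interface ToolHandlers {
  vibe_build(args: { prompt: string }): string;
  navigate_ui(args: { panel: string }): string;
}

function dispatchToolCall(call: ToolCall, handlers: ToolHandlers): string {
  switch (call.name) {
    case "vibe_build":
      return handlers.vibe_build(call.args);
    case "navigate_ui":
      return handlers.navigate_ui(call.args);
    default: {
      // Exhaustiveness check: fails to compile if a new tool is added
      // to ToolCall without a matching case.
      const unreachable: never = call;
      throw new Error("Unknown tool: " + JSON.stringify(unreachable));
    }
  }
}

// Example handlers that only record what was requested.
const demoHandlers: ToolHandlers = {
  vibe_build: ({ prompt }) => "Building: " + prompt,
  navigate_ui: ({ panel }) => "Showing panel: " + panel,
};
```

Because the handlers live on the client, a navigate_ui call can update the interface directly, and the returned string gives the agent something to say while a build is in progress.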
Code generation
When vibe_build is triggered, the Gemini code agent:
- Analyzes the request (create vs. edit)
- Loads relevant file context for edits
- Generates code using structured prompts
- Streams results back as XML-tagged files
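To make the last step concrete, here is a minimal sketch of an incremental parser for XML-tagged files. The exact tag format is not specified on this page, so the `<file path="...">...</file>` wrapper is an assumption, as are the FileStreamParser and GeneratedFile names.

```typescript
interface GeneratedFile {
  path: string;
  content: string;
}

// Incremental parser: feed it text chunks as they stream in, and each call
// returns the files whose closing tag has arrived since the previous call.
class FileStreamParser {
  private buffer = "";

  push(chunk: string): GeneratedFile[] {
    this.buffer += chunk;
    const completed: GeneratedFile[] = [];
    // Assumed wire format: <file path="relative/path">contents</file>
    const pattern = /<file path="([^"]+)">([\s\S]*?)<\/file>/g;
    let consumed = 0;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(this.buffer)) !== null) {
      const [, path = "", content = ""] = match;
      completed.push({ path, content });
      consumed = pattern.lastIndex;
    }
    // Keep only the unparsed tail (a possibly incomplete file) for later.
    this.buffer = this.buffer.slice(consumed);
    return completed;
  }
}
```

Each completed file can be written to the sandbox as soon as it is returned, so the preview can start updating before the model has finished the whole response.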
Sandbox application
Generated files are written to the Daytona sandbox:
- Files are uploaded via the Daytona SDK
- Vite dev server hot-reloads the changes
- Preview URL updates in real time
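As a rough sketch of the step before upload, the function below resolves generated file paths against a project root inside the sandbox and rejects paths that would escape it. The SandboxFile and PlannedUpload shapes, the planUploads helper, and the /workspace/app root used in the example are all hypothetical. The actual Daytona SDK upload call is left out because its signature is not shown on this page.

```typescript
// A generated file as produced by the code agent (shape is an assumption).
interface SandboxFile {
  path: string; // project-relative, e.g. "src/components/Hero.tsx"
  content: string;
}

interface PlannedUpload {
  target: string; // absolute path inside the sandbox
  content: string;
}

// Resolve project-relative paths against the sandbox project root and
// reject any path that would escape it.
function planUploads(files: SandboxFile[], root: string): PlannedUpload[] {
  const base = root.replace(/\/+$/, ""); // drop trailing slashes
  return files.map((file) => {
    // Ignore empty segments and "." so "./src//App.tsx" normalizes cleanly.
    const segments = file.path.split("/").filter((s) => s !== "" && s !== ".");
    if (segments.length === 0 || segments.indexOf("..") !== -1) {
      throw new Error("Refusing to write outside the project root: " + file.path);
    }
    return { target: base + "/" + segments.join("/"), content: file.content };
  });
}

// Hypothetical project root; the real location depends on the sandbox setup.
const planned = planUploads(
  [{ path: "src/components/Hero.tsx", content: "export function Hero() {}" }],
  "/workspace/app/"
);
```

Each planned entry would then be handed to the SDK's file upload, after which the Vite dev server picks up the change through hot reload.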
Key architectural patterns
Client-side tool calling
The voice agent uses client-side tools to trigger actions without server round-trips.
Component-based code generation
Viber enforces a component-based architecture for generated apps:
- Landing pages are broken into section components (Header, Hero, Features, Footer)
- App.tsx imports and composes these sections
- Edits target specific component files rather than regenerating everything
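The idea of targeting specific component files can be illustrated with a deliberately simplified stand-in. Viber selects files with an LLM-based intent analyzer. The sketch below instead matches component names by keyword, which is only meant to show the input and output of that step. The selectTargetFiles function and the fallback to App.tsx are assumptions.

```typescript
// Section components named in the list above.
const SECTION_COMPONENTS = ["Header", "Hero", "Features", "Footer"];

// Keyword-based stand-in for the intent analyzer: return the component
// files whose names are mentioned in the edit request.
function selectTargetFiles(request: string, components: string[]): string[] {
  const text = request.toLowerCase();
  const hits = components.filter(
    (name) => text.indexOf(name.toLowerCase()) !== -1
  );
  // Assumption: when no section is named, fall back to App.tsx,
  // the file that composes all sections.
  if (hits.length === 0) {
    return ["App.tsx"];
  }
  return hits.map((name) => name + ".tsx");
}

const heroEdit = selectTargetFiles("Update the hero section", SECTION_COMPONENTS);
```

Only the files returned here would be loaded as context and regenerated. A real intent analyzer would also need to handle indirect requests such as "make the top banner darker", which keyword matching cannot resolve.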
This architectural pattern enables surgical edits: when a user says “update the hero section”, only Hero.tsx is regenerated, preserving all other code.
Streaming code generation
Code is generated and applied incrementally.
Context-aware file selection
For edit operations, Viber uses an LLM-based intent analyzer to select relevant files.
Technology stack
Voice
- ElevenLabs Conversational AI
- WebSocket connections
- Client-side tool calling
Code generation
- Google Gemini (gemini-3-pro)
- Vercel AI SDK
- XML-based file parsing
Sandbox
- Daytona SDK
- Vite dev server
- HMR over WebSocket
Frontend
- React 18 with TypeScript
- TanStack Start (full-stack framework)
- Tailwind CSS v4
Next steps
Voice agent
Learn about the ElevenLabs voice agent and tool calling
Code agent
Explore Gemini code generation and prompts
Sandbox
Understand Daytona sandbox management