Overview
Tabby is built as a hybrid desktop application that combines Electron, Next.js, and FastAPI to create a system-wide AI keyboard layer. The architecture is designed to be unobtrusive yet powerful, running in the system tray while intercepting keyboard input globally.
Core Components
Tabby consists of four main layers that work together:
Desktop Layer
Electron 38 application managing system-level features, global shortcuts, and window orchestration
Frontend Layer
Next.js 15 + React 19 UI with multiple specialized windows and real-time AI streaming
Backend API
Next.js API routes handling AI completions, chat, search, and MCP integration
Memory Service
FastAPI Python service managing persistent memory with Mem0, Supabase, and Neo4j
Architecture Layers
1. Electron Main Process
The Electron main process (frontend/electron/src/main.ts) orchestrates the entire application:
Window Management
Creates and manages specialized windows:
- Main Window: Hidden window that runs background tasks
- Action Menu: Quick AI popup triggered by Ctrl+\
- Brain Panel: Memory dashboard (Ctrl+Shift+B)
- Suggestion Window: Inline AI suggestions
- Settings Window: Configuration and onboarding
Global Shortcuts
Registers system-wide keyboard hooks:
- Ctrl+\ - Open action menu
- Ctrl+Space - AI suggestions
- Ctrl+Shift+B - Brain panel
- Alt+X - Interview copilot
- Custom shortcuts for all features
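In the Electron main process, bindings like these are typically registered with globalShortcut.register() on app ready. A minimal sketch of the accelerator-to-action mapping (the action names and lookup helper are illustrative, not Tabby's actual code):

```typescript
// Sketch: mapping global accelerators to handler actions, as an Electron
// main process might do. Action names here are illustrative assumptions.
type ShortcutAction = "action-menu" | "suggest" | "brain-panel" | "interview-copilot";

const shortcuts: Record<string, ShortcutAction> = {
  "CommandOrControl+\\": "action-menu",
  "CommandOrControl+Space": "suggest",
  "CommandOrControl+Shift+B": "brain-panel",
  "Alt+X": "interview-copilot",
};

// In Electron, each entry would be wired up once on the app "ready" event:
//   globalShortcut.register(accelerator, () => dispatch(action));
function lookupAction(accelerator: string): ShortcutAction | undefined {
  return shortcuts[accelerator];
}
```

Keeping the mapping in one table makes it straightforward to support the custom shortcuts mentioned above: user overrides just replace entries before registration.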
Desktop Automation
- Clipboard monitoring and manipulation
- Character-by-character typewriter mode that mimics human typing
- Screenshot capture and encoding
- Window focus and positioning
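The typewriter mode above can be sketched as a scheduler that emits one key event per character with a randomized delay, so insertion pacing resembles human typing (the delay bounds are illustrative assumptions, not Tabby's actual timing):

```typescript
// Sketch: turn a string into per-character key events with jittered
// delays. The main process would replay these events one at a time.
interface KeyEvent {
  char: string;
  delayMs: number;
}

function typewriterSchedule(text: string, minMs = 20, maxMs = 80): KeyEvent[] {
  return [...text].map((char) => ({
    char,
    // uniform jitter in [minMs, maxMs]
    delayMs: minMs + Math.floor(Math.random() * (maxMs - minMs + 1)),
  }));
}
```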
IPC Communication
Bidirectional communication between main and renderer processes for:
- Text insertion requests
- Context capture triggers
- Window show/hide commands
- Settings synchronization
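One way to keep these channels consistent across the main and renderer processes is a shared typed contract. The channel names and payload shapes below are hypothetical, not Tabby's actual schema; in Electron they would be serviced via ipcMain.handle() and invoked via ipcRenderer.invoke():

```typescript
// Sketch: a typed contract for the IPC channels listed above.
// All names and payload fields are illustrative assumptions.
interface IpcChannels {
  "insert-text": { text: string };
  "capture-context": { includeScreenshot: boolean };
  "window-visibility": { window: string; visible: boolean };
  "sync-settings": { settings: Record<string, unknown> };
}

const channelNames = ["insert-text", "capture-context", "window-visibility", "sync-settings"];

// Type guard: lets handlers reject unknown channel names at runtime
// while narrowing to keyof IpcChannels at compile time.
function isKnownChannel(name: string): name is keyof IpcChannels {
  return channelNames.includes(name);
}
```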
2. Frontend UI Layer
The frontend is a Next.js 15 application with React 19, running inside Electron’s renderer process.
Real-time AI Streaming
Uses the Vercel AI SDK’s useChat and useCompletion hooks for streaming responses, with proper token handling and error recovery.
Context Management
Automatically captures screen context, selected text, and retrieves relevant memories to build conversation context.
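That assembly step can be sketched as a pure function that merges whatever context is available into a single prompt block (the field names and section labels are illustrative assumptions, not Tabby's actual schema):

```typescript
// Sketch: merge captured screen text, current selection, and retrieved
// memories into one prompt context. Missing pieces are simply omitted.
interface CapturedContext {
  screenText?: string;
  selectedText?: string;
  memories: string[];
}

function buildContextPrompt(ctx: CapturedContext): string {
  const parts: string[] = [];
  if (ctx.screenText) parts.push(`Screen:\n${ctx.screenText}`);
  if (ctx.selectedText) parts.push(`Selection:\n${ctx.selectedText}`);
  if (ctx.memories.length > 0) {
    parts.push(`Relevant memories:\n- ${ctx.memories.join("\n- ")}`);
  }
  return parts.join("\n\n");
}
```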
3. Next.js Backend API
The shared API backend (nextjs-backend/) provides AI and integration services:
API Routes:
| Route | Purpose |
|---|---|
| /api/chat | Streaming chat completions with context |
| /api/completion | Single-turn text completions |
| /api/suggest | Context-aware suggestions |
| /api/interview-copilot | Coding interview assistance |
| /api/search | Web search with Tavily |
| /api/voice-agent | Voice-to-text and text-to-voice |
| /api/transcribe | Speech recognition |
| /api/auth | Supabase authentication |
4. Memory Backend Service
The FastAPI memory service (backend/main.py) manages persistent memory using Mem0:
Mem0 memory categories:
- LONG_TERM: Permanent preferences, identity, habits
- SHORT_TERM: Temporary states, current activities
- EPISODIC: Past events with specific time context
- SEMANTIC: General knowledge and facts
- PROCEDURAL: How-to knowledge and instructions
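On the client side, the five categories can be modeled as a union type with a guard for validating values coming back from the memory API (a sketch; Tabby's actual types may differ):

```typescript
// Sketch: client-side model of the Mem0 categories listed above.
// The MemoryRecord shape is an illustrative assumption.
type MemoryCategory = "LONG_TERM" | "SHORT_TERM" | "EPISODIC" | "SEMANTIC" | "PROCEDURAL";

interface MemoryRecord {
  category: MemoryCategory;
  content: string;
  createdAt: string; // ISO 8601 timestamp
}

const categories = ["LONG_TERM", "SHORT_TERM", "EPISODIC", "SEMANTIC", "PROCEDURAL"];

// Runtime guard for untyped API responses.
function isMemoryCategory(value: string): value is MemoryCategory {
  return categories.includes(value);
}
```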
Data Flow
User Interaction Flow
All AI responses stream in real-time using Server-Sent Events (SSE), providing immediate feedback to users.
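An SSE stream frames each payload as a `data:` line. A minimal parser sketch for a complete chunk (real clients such as the Vercel AI SDK also handle partial chunks, event fields, and reconnection):

```typescript
// Sketch: extract the data payloads from a raw text/event-stream chunk.
// "[DONE]" is the conventional OpenAI-style end-of-stream sentinel.
function parseSseChunk(chunk: string): string[] {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data: "))
    .map((line) => line.slice("data: ".length))
    .filter((data) => data !== "[DONE]");
}
```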
Database Architecture
Local Supabase (Docker)
Tabby runs a local Supabase instance via Docker instead of using cloud hosting:
- PostgreSQL database (port 54322)
- PostgREST API (port 54321)
- Supabase Studio (port 54323)
- GoTrue auth server
- Realtime server
- Storage API
- Vector extension (pgvector)
Storage buckets:
- context-captures: Screenshot images from interview copilot
- project-assets: User-uploaded files and images
Neo4j Knowledge Graph (Optional)
When configured, Neo4j creates a knowledge graph of memories showing relationships between:
- User preferences and habits
- Projects and technologies
- People and conversations
- Topics and contexts
The graph is visualized with @neo4j-nvl/react.
Process Communication
IPC (Inter-Process Communication)
Electron uses IPC to communicate between the main and renderer processes.
API Communication
All API calls use fetch with proper error handling and streaming support. The memory service is reached at http://localhost:8000.
Performance Considerations
Fast Startup
- Electron main process starts in under 1 second
- Next.js preloads in background
- Action menu appears instantly on shortcut
Low Memory Usage
- Single Electron instance
- Windows created on-demand
- React components lazy-loaded
Network Efficiency
- Streaming responses (no waiting for full completion)
- Local Supabase (no cloud latency)
- Memory search cached via LRU
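The memory-search cache mentioned above can be sketched as a small LRU built on Map's insertion-order guarantee (the capacity and key format are assumptions, not Tabby's actual values):

```typescript
// Sketch: an LRU cache for memory-search results. A Map iterates keys in
// insertion order, so the first key is always the least recently used.
class LruCache<V> {
  private map = new Map<string, V>();

  constructor(private capacity: number) {}

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark this entry as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.capacity) {
      // Evict the least recently used entry (first in insertion order).
      this.map.delete(this.map.keys().next().value as string);
    }
    this.map.set(key, value);
  }
}
```

A query string (plus user_id) would be a natural cache key, so repeated searches skip the round trip to the memory service.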
Responsive UI
- Non-blocking typewriter mode
- Optimistic UI updates
- Background context capture
Security Model
Security Measures:
- API keys stored in local environment files (never committed)
- Supabase runs locally (no data sent to cloud)
- Neo4j connection encrypted with TLS
- User data isolated by user_id
- No telemetry or external tracking
Deployment
Desktop App
Electron app built with electron-builder.
Backend Services
- Memory API: Deployed to Azure Container Apps
- Next.js API: Deployed to Vercel
- Frontend: Packaged with Electron (standalone)
The packaged Electron app can run completely offline if using local AI models via MCP.
Next Steps
Technology Stack
Deep dive into frameworks and libraries used
Memory System
How Mem0, Supabase, and Neo4j work together