Overview

SeanceAI is built with a modern Python backend and a vanilla JavaScript frontend, leveraging AI models through the OpenRouter API for authentic historical figure conversations.

Backend Architecture
Core Framework
Flask 3.0.0 - Lightweight Python web framework providing:

- RESTful API endpoints for figure data and chat interactions
- Server-Sent Events (SSE) for real-time streaming responses
- Health check endpoints for deployment monitoring
- Error handling and request validation
AI Integration
OpenRouter API
SeanceAI uses OpenRouter to access multiple AI models through a single API:

- Primary Model: google/gemma-3-12b-it:free
- Model Categories:
  - Swift Tier (Free): Gemma 3 models, Llama 3.3 70B, Llama 3.1 405B
  - Balanced Tier: GPT-4o Mini, Claude 3.5 Haiku, DeepSeek V3
  - Advanced Tier: Claude Sonnet 4, GPT-4o, Gemini 2.5 Pro, Claude Opus 4
Model Compatibility
Some models don't support the standard system role. SeanceAI handles this automatically.
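A minimal sketch of how such a shim might work, assuming a known set of models that reject the system role; the model set, function name, and message shapes here are illustrative, not SeanceAI's actual code:

```python
# Models assumed (for illustration) to reject the "system" role; for these,
# the system prompt is merged into the first user message instead.
NO_SYSTEM_ROLE_MODELS = {"google/gemma-3-12b-it:free"}

def adapt_messages(model: str, system_prompt: str, messages: list) -> list:
    """Return a message list compatible with the given model."""
    if model not in NO_SYSTEM_ROLE_MODELS:
        return [{"role": "system", "content": system_prompt}] + messages
    # Copy messages, then prepend the system prompt to the first user turn.
    adapted = [dict(m) for m in messages]
    for m in adapted:
        if m["role"] == "user":
            m["content"] = f"{system_prompt}\n\n{m['content']}"
            break
    return adapted
```

Either way, the figure's persona reaches the model; only the transport differs.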
Intelligent Model Fallback
SeanceAI implements retry logic to keep the service reliable.

Rate Limit Handling

When a model hits rate limits, the system automatically:

- Retries the same model with exponential backoff (2s, 5s, 10s delays)
- Falls back to alternative models if retries fail
- Provides user-friendly error messages
Fallback Flow
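The flow can be sketched roughly as follows. The delays mirror the values above; the model list, exception type, and function names are illustrative assumptions, and `send` stands in for the actual OpenRouter call:

```python
import time

# Illustrative backoff delays and fallback order; the real SeanceAI
# configuration may differ.
RETRY_DELAYS = [2, 5, 10]  # seconds between attempts on the same model
FALLBACK_MODELS = [
    "google/gemma-3-12b-it:free",
    "meta-llama/llama-3.3-70b-instruct:free",
]

class RateLimited(Exception):
    """Raised when the upstream API reports a rate limit (HTTP 429)."""

def chat_with_fallback(send, messages):
    """Try each model in order, retrying on rate limits before falling back.

    `send(model, messages)` is a placeholder for the actual API call.
    """
    for model in FALLBACK_MODELS:
        for delay in RETRY_DELAYS:
            try:
                return send(model, messages)
            except RateLimited:
                time.sleep(delay)
        # All retries exhausted for this model; fall through to the next.
    raise RuntimeError("All models are currently rate-limited. Please try again shortly.")
```

Only when every model in the chain is exhausted does the user see an error, which keeps transient rate limits invisible in the common case.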
Server-Sent Events (SSE) Streaming
Real-time streaming provides immediate feedback to users:

- Content chunks: data: {"content": "text"}
- Completion: data: {"done": true}
- Errors: data: {"error": "message", "rate_limited": true}
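A minimal Flask sketch of this wire format. The hard-coded chunks stand in for real model output, and the route body is an assumption, not SeanceAI's exact implementation:

```python
import json
from flask import Flask, Response

app = Flask(__name__)

def format_sse(payload: dict) -> str:
    """Encode one event in the `data: ...` wire format shown above."""
    return f"data: {json.dumps(payload)}\n\n"

@app.route("/api/chat/stream", methods=["POST"])
def chat_stream():
    def generate():
        for chunk in ["Good ", "evening."]:  # stand-in for streamed model output
            yield format_sse({"content": chunk})
        yield format_sse({"done": True})
    return Response(generate(), mimetype="text/event-stream")
```

On the frontend, the EventSource-style consumer reads each `data:` line, appending content chunks until it sees `{"done": true}`.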
Conversation Management
History Limiting

To optimize API costs and response times:

- Maximum of 20 messages kept in the context window
- Older messages are automatically pruned
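A sketch of this pruning, assuming the system prompt (when present) is always preserved so the figure's persona survives long conversations:

```python
MAX_HISTORY = 20  # messages kept in the context window

def prune_history(messages: list) -> list:
    """Keep the system prompt (if any) plus the most recent messages.

    A minimal sketch of the history-limiting behavior; the real
    implementation may differ in detail.
    """
    if messages and messages[0]["role"] == "system":
        return [messages[0]] + messages[1:][-MAX_HISTORY:]
    return messages[-MAX_HISTORY:]
```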
Frontend Architecture
Technology Stack
- Vanilla JavaScript - No framework dependencies, pure ES6+
- Modern CSS - Custom properties, Grid, Flexbox
- LocalStorage - Client-side conversation persistence
- Responsive Design - Mobile-first approach
Key Features
- Real-time Streaming - SSE EventSource API for live responses
- Conversation Management - Save, resume, export conversations
- AI-Generated Suggestions - Contextual follow-up questions
- Dinner Party Mode - Multi-figure conversations with dynamic parsing
Historical Figure System
Figure Prompt Template
Each historical figure uses a structured system prompt.

Figure Data Structure

Each figure in the HISTORICAL_FIGURES dictionary contains:

- id - Unique identifier
- name - Full name
- title - Role/profession
- birth_year / death_year - Lifespan for era-appropriate knowledge
- era - Historical period
- personality - Speaking style and character traits
- beliefs - Core values and documented opinions
- tagline - Brief descriptor
- starter_questions - Conversation prompts
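A hypothetical entry showing how these fields fit together; all values here are invented for illustration and are not SeanceAI's actual data:

```python
# Illustrative figure entry; field values are made up for this example.
HISTORICAL_FIGURES = {
    "einstein": {
        "id": "einstein",
        "name": "Albert Einstein",
        "title": "Theoretical Physicist",
        "birth_year": 1879,
        "death_year": 1955,
        "era": "20th century",
        "personality": "Playful, curious, fond of thought experiments",
        "beliefs": "Pacifism, scientific humanism, determinism",
        "tagline": "The mind behind relativity",
        "starter_questions": [
            "What inspired the theory of relativity?",
            "How did you feel about the atomic bomb?",
        ],
    },
}
```

The birth/death years let the prompt template constrain the figure to era-appropriate knowledge.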
API Endpoints
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Serve main HTML page |
| GET | /api/figures | Return list of all historical figures |
| GET | /api/figures/<id> | Return a single figure's data |
| GET | /api/models | List available AI models |
| GET | /api/health | Health check endpoint |
| POST | /api/chat | Send message, receive AI response |
| POST | /api/chat/stream | Streaming chat endpoint (SSE) |
| POST | /api/dinner-party/chat | Multi-figure conversation |
| POST | /api/suggestions | Get contextual follow-up questions |
Example Request
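A sketch of what a chat request might look like, using only the Python standard library. The request fields (figure_id, message, model) are assumptions inferred from the architecture above, not confirmed against the actual API:

```python
# Hypothetical request body for POST /api/chat; field names are assumed,
# not taken from the real API schema.
import json
from urllib import request

payload = {
    "figure_id": "einstein",
    "message": "What inspired the theory of relativity?",
    "model": "google/gemma-3-12b-it:free",
}
req = request.Request(
    "http://localhost:5000/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = request.urlopen(req)  # requires the server to be running locally
```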
Deployment
Production Server
Gunicorn with gevent workers for async support.

Environment Configuration

Required environment variables:

- OPENROUTER_API_KEY - API key from OpenRouter
- PORT - Server port (default: 5000)
- FLASK_DEBUG - Enable debug mode (default: false)
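For reference, a typical Gunicorn invocation matching the setup above might look like this; the `app:app` module path assumes the Flask instance is named `app` in app.py:

```shell
# Illustrative start command; adjust the module path to the actual entrypoint.
gunicorn --worker-class gevent --workers 2 --bind 0.0.0.0:${PORT:-5000} app:app
```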
Platform Support
- Railway.app (Recommended) - Auto-detected Flask app
- Fly.io - Configuration included
- Heroku - Standard Python buildpack
- Render - Gunicorn detected automatically
- AWS/GCP - Container or serverless deployment
Performance Optimizations
- Streaming Responses - Reduces perceived latency
- Model Fallback - Ensures service availability
- History Limiting - Controls API costs and response times
- Fast Suggestion Model - Uses lightweight model for quick suggestions
- Client-side Caching - LocalStorage for conversation history
Security Considerations
- API key stored in environment variables
- No client-side API key exposure
- Input validation on all endpoints
- Rate limiting handled by OpenRouter
- CORS headers configured for production