Services
Service classes for AI chat, text-to-speech, speech-to-text, translation, summarization, rewriting, and content generation.
AIService
Multi-provider AI service with a unified interface for Chrome AI, OpenAI, and Ollama.
Methods
- config: Configuration object
- config.provider: 'chrome-ai' | 'openai' | 'ollama'
- config.chromeAi: Chrome AI settings (temperature, topK, enableImageSupport, enableAudioSupport)
- config.openai: OpenAI settings (apiKey, model, temperature, maxTokens)
- config.ollama: Ollama settings (endpoint, model, temperature, maxTokens)
- tabId: Tab ID (extension mode only)

- messages: Array of message objects with role and content
- options: { systemPrompt?, temperature?, maxTokens?, images?, audios? }
- tabId: Tab ID (extension mode only)
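The configuration fields listed above can be sketched as TypeScript types. This is only an illustration: the type names, the helper `activeSettings`, and the choice of which fields are required are assumptions, and only the field names come from the parameter list.

```typescript
// Hypothetical type names; only the field names are taken from the parameter list.
type Provider = "chrome-ai" | "openai" | "ollama";

interface ChromeAiSettings {
  temperature?: number;
  topK?: number;
  enableImageSupport?: boolean;
  enableAudioSupport?: boolean;
}

interface OpenAiSettings {
  apiKey: string;
  model: string;
  temperature?: number;
  maxTokens?: number;
}

interface OllamaSettings {
  endpoint: string;
  model: string;
  temperature?: number;
  maxTokens?: number;
}

interface AIServiceConfig {
  provider: Provider;
  chromeAi?: ChromeAiSettings;
  openai?: OpenAiSettings;
  ollama?: OllamaSettings;
}

interface ChatMessage {
  role: string;
  content: string;
}

// Hypothetical helper: returns the settings block that matches config.provider,
// or throws if that block was not supplied.
function activeSettings(
  config: AIServiceConfig
): ChromeAiSettings | OpenAiSettings | OllamaSettings {
  const byProvider = {
    "chrome-ai": config.chromeAi,
    openai: config.openai,
    ollama: config.ollama,
  };
  const settings = byProvider[config.provider];
  if (!settings) {
    throw new Error(`Missing settings for provider "${config.provider}"`);
  }
  return settings;
}

const exampleConfig: AIServiceConfig = {
  provider: "ollama",
  ollama: { endpoint: "http://localhost:11434", model: "mistral", temperature: 0.7 },
};

const exampleMessages: ChatMessage[] = [{ role: "user", content: "Hello" }];

console.log(activeSettings(exampleConfig), exampleMessages.length);
```

Keeping one settings block per provider lets a caller switch `provider` without discarding the other providers' settings.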
Supported Providers
Chrome AI (Gemini Nano)
- On-device inference (no API key required)
- Multi-modal: Image and audio support
- Requirements: Chrome 138+ with flags enabled
- Model: Gemini Nano (2B-4B parameters)
OpenAI
- Cloud-based (API key required)
- Models: GPT-4, GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo
- Multi-modal: Image support (GPT-4 Vision)
- Audio: Via Whisper transcription
Ollama
- Self-hosted (local server)
- Models: Llama 2, Mistral, Mixtral, etc.
- Multi-modal: Depends on model (LLaVA for images)
- Endpoint: Defaults to http://localhost:11434
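As a rough illustration of how the Ollama settings could map onto a request, the sketch below builds (but does not send) a call to Ollama's `/api/chat` endpoint. The function and type names are hypothetical, and whether AIService talks to Ollama in exactly this way is an assumption. The `num_predict` option is Ollama's name for the maximum number of generated tokens.

```typescript
const DEFAULT_OLLAMA_ENDPOINT = "http://localhost:11434";

// Hypothetical shapes mirroring config.ollama (endpoint, model, temperature, maxTokens).
interface OllamaChatSettings {
  endpoint?: string;
  model: string;
  temperature?: number;
  maxTokens?: number;
}

interface OllamaChatMessage {
  role: string;
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch options for a non-streaming chat request.
function buildOllamaChatRequest(
  settings: OllamaChatSettings,
  messages: OllamaChatMessage[]
): PreparedRequest {
  // Fall back to the default endpoint and drop any trailing slashes.
  const base = (settings.endpoint ?? DEFAULT_OLLAMA_ENDPOINT).replace(/\/+$/, "");
  const options: Record<string, number> = {};
  if (settings.temperature !== undefined) options["temperature"] = settings.temperature;
  if (settings.maxTokens !== undefined) options["num_predict"] = settings.maxTokens;
  return {
    url: `${base}/api/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: settings.model, messages, stream: false, options }),
    },
  };
}

const prepared = buildOllamaChatRequest(
  { model: "mistral", temperature: 0.7, maxTokens: 256 },
  [{ role: "user", content: "Hello" }]
);
console.log(prepared.url);
```

With a local Ollama server running, the result could be passed to `fetch(prepared.url, prepared.init)`.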
TTSService
Multi-provider Text-to-Speech service with streaming generation and audio queue management.
Methods
- config.provider: 'kokoro' | 'openai' | 'openai-compatible'
- config.kokoro: Kokoro settings (modelId, voice, speed, device)
- config.openai: OpenAI TTS settings (apiKey, model, voice, speed)
- config['openai-compatible']: Generic TTS API settings
- tabId: Tab ID (extension mode only)

- text: Text to synthesize
- voice: Voice ID (overrides config)
- chunkSize: Target chunk size in characters (default: 500)
- minChunkSize: Minimum chunk size (default: 100)
- sessionId: Session identifier for tracking
- tabId: Tab ID (extension mode only)

- audioBlobUrls: Array of blob URLs to play
- sessionId: Session identifier
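To make the `chunkSize` and `minChunkSize` parameters more concrete, here is one possible way to split text for streaming synthesis. It is a sketch under stated assumptions, not the service's actual algorithm: the function name `chunkText`, the sentence-based splitting, and the rule of merging a short final chunk into the previous one are all hypothetical.

```typescript
// Splits text into chunks of roughly chunkSize characters along sentence boundaries.
// A final chunk shorter than minChunkSize is merged into the chunk before it.
function chunkText(text: string, chunkSize = 500, minChunkSize = 100): string[] {
  // Naive sentence split: a run of non-terminators followed by any terminators.
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (current && candidate.length > chunkSize) {
      // Adding this sentence would overshoot the target, so close the chunk.
      chunks.push(current);
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);

  // Avoid a very short trailing chunk by folding it into the previous one.
  if (chunks.length > 1) {
    const last = chunks[chunks.length - 1] ?? "";
    if (last.length < minChunkSize) {
      chunks.pop();
      const previous = chunks.pop() ?? "";
      chunks.push(`${previous} ${last}`);
    }
  }
  return chunks;
}

console.log(chunkText("First sentence. Second sentence! Third?", 20, 5));
```

Because `chunkSize` is a target rather than a hard limit in this sketch, a chunk can exceed it when a single sentence is longer than the target or when a short tail is merged.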
Supported Providers
Kokoro TTS (Local)
- On-device synthesis using ONNX runtime
- Voices: 30+ high-quality neural voices
- Backends: WebGPU (fast) or WASM (compatible)
- Languages: English, Japanese, Chinese, Korean, French, Spanish
- Lip Sync: Automatic BVMD generation for 3D character
OpenAI TTS
- Cloud-based (API key required)
- Models: tts-1 (fast), tts-1-hd (high quality)
- Voices: alloy, echo, fable, onyx, nova, shimmer
- Speed: Adjustable 0.25x to 4.0x
OpenAI-Compatible TTS
- Generic TTS API (Cartesia, ElevenLabs, etc.)
- Custom endpoint and voice configuration
- OpenAI SDK compatibility
STTService
Multi-provider Speech-to-Text service supporting one-shot transcription and continuous streaming.
Methods
- config.provider: 'chrome-ai-multimodal' | 'openai' | 'openai-compatible'
- config['chrome-ai-multimodal']: Chrome AI settings
- config.openai: OpenAI Whisper settings
- config['openai-compatible']: Generic STT API settings
TranslatorService
Multi-provider translation service with support for Chrome AI Translator API and LLM-based translation.
Methods
- config.provider: 'chrome-ai' | 'openai' | 'ollama'

- text: Text to translate
- sourceLang: Source language code (e.g., 'en')
- targetLang: Target language code (e.g., 'es')
- tabId: Tab ID (extension mode only)
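For the LLM-based path (the 'openai' and 'ollama' providers), translation presumably comes down to prompting a chat model. The sketch below shows one way the `text`, `sourceLang`, and `targetLang` parameters could be turned into chat messages. The function name and the prompt wording are assumptions made for illustration and are not taken from the service.

```typescript
interface TranslationMessage {
  role: "system" | "user";
  content: string;
}

// Hypothetical prompt builder for LLM-based translation.
function buildTranslationMessages(
  text: string,
  sourceLang: string,
  targetLang: string
): TranslationMessage[] {
  return [
    {
      role: "system",
      content:
        `You are a translator. Translate the user's text from "${sourceLang}" ` +
        `to "${targetLang}". Reply with the translation only.`,
    },
    { role: "user", content: text },
  ];
}

console.log(buildTranslationMessages("Good morning", "en", "es"));
```

Keeping the text to translate in its own user message, separate from the instructions, makes it less likely that the model treats the input as part of the prompt.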
SummarizerService
Multi-provider text summarization service supporting Chrome AI Summarizer API and LLM-based summarization.
Methods
- text: Text to summarize
- options.type: 'tldr' | 'headline' | 'key-points' | 'teaser'
- options.format: 'plain-text' | 'markdown'
- options.length: 'short' | 'medium' | 'long'
- tabId: Tab ID (extension mode only)
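Since each summarizer option accepts only a fixed set of values, a caller may want to validate them before invoking the service. The following is a minimal sketch of such a check. The helper names are hypothetical, and the default values are placeholders chosen for the example, not the service's documented defaults.

```typescript
const SUMMARY_TYPES = ["tldr", "headline", "key-points", "teaser"] as const;
const SUMMARY_FORMATS = ["plain-text", "markdown"] as const;
const SUMMARY_LENGTHS = ["short", "medium", "long"] as const;

interface SummarizerOptions {
  type: (typeof SUMMARY_TYPES)[number];
  format: (typeof SUMMARY_FORMATS)[number];
  length: (typeof SUMMARY_LENGTHS)[number];
}

// Placeholder defaults for illustration only; the real defaults are not documented here.
const ASSUMED_DEFAULTS: SummarizerOptions = {
  type: "tldr",
  format: "plain-text",
  length: "medium",
};

// Returns value if it is one of the allowed strings, fallback if it is undefined,
// and throws for anything else.
function pick<T extends string>(allowed: readonly T[], value: unknown, fallback: T): T {
  if (value === undefined) return fallback;
  if (typeof value === "string" && (allowed as readonly string[]).indexOf(value) !== -1) {
    return value as T;
  }
  throw new Error(`Invalid option "${String(value)}"; expected one of: ${allowed.join(", ")}`);
}

function resolveSummarizerOptions(input: Record<string, unknown> = {}): SummarizerOptions {
  return {
    type: pick(SUMMARY_TYPES, input["type"], ASSUMED_DEFAULTS.type),
    format: pick(SUMMARY_FORMATS, input["format"], ASSUMED_DEFAULTS.format),
    length: pick(SUMMARY_LENGTHS, input["length"], ASSUMED_DEFAULTS.length),
  };
}

console.log(resolveSummarizerOptions({ type: "key-points", format: "markdown" }));
```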
RewriterService
Multi-provider text rewriting service with tone, format, and length adjustments.
Methods
- text: Text to rewrite
- options.tone: 'as-is' | 'more-formal' | 'more-casual' | 'professional' | 'friendly'
- options.format: 'as-is' | 'plain-text' | 'markdown'
- options.length: 'as-is' | 'shorter' | 'longer'
- options.context: Additional context for rewriting
- tabId: Tab ID (extension mode only)
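One way to read the 'as-is' values is "leave this aspect unchanged". The sketch below turns the rewrite options into an instruction string for an LLM-based provider and simply skips any option set to 'as-is'. The function name and the instruction wording are hypothetical. How the service actually builds its prompts is not specified here.

```typescript
type RewriteTone = "as-is" | "more-formal" | "more-casual" | "professional" | "friendly";
type RewriteFormat = "as-is" | "plain-text" | "markdown";
type RewriteLength = "as-is" | "shorter" | "longer";

interface RewriteOptions {
  tone?: RewriteTone;
  format?: RewriteFormat;
  length?: RewriteLength;
  context?: string;
}

// Hypothetical helper: builds an instruction string, omitting 'as-is' options.
function buildRewriteInstructions(options: RewriteOptions = {}): string {
  const parts: string[] = ["Rewrite the user's text."];
  if (options.tone && options.tone !== "as-is") {
    // 'more-formal' becomes 'more formal', and so on.
    parts.push(`Use a ${options.tone.replace("-", " ")} tone.`);
  }
  if (options.format === "plain-text") parts.push("Return plain text.");
  if (options.format === "markdown") parts.push("Return Markdown.");
  if (options.length === "shorter") parts.push("Make it shorter.");
  if (options.length === "longer") parts.push("Make it longer.");
  if (options.context) parts.push(`Context: ${options.context}`);
  return parts.join(" ");
}

console.log(buildRewriteInstructions({ tone: "more-formal", length: "shorter" }));
```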
WriterService
Multi-provider content generation service for creating new text from prompts.
Methods
- prompt: Writing prompt
- options.tone: 'neutral' | 'formal' | 'casual' | 'professional' | 'friendly'
- options.format: 'plain-text' | 'markdown'
- options.length: 'short' | 'medium' | 'long'
- options.context: Additional context
- tabId: Tab ID (extension mode only)
ChatHistoryService
Service for managing persistent chat history with message trees and media storage.
Methods
- chatData.chatId: Chat ID (auto-generated if not provided)
- chatData.chatService: ChatService instance (for tree structure)
- chatData.messages: Flat message array (backward compatibility)
- chatData.title: Chat title (auto-generated if not provided)
- chatData.isTemp: Skip saving if true
- chatData.metadata: Additional metadata

- offset: Skip first N chats (default: 0)
- limit: Max chats to return (default: 20)
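The save and paging behaviour described by these parameters can be illustrated with a small in-memory stand-in. This is a sketch only: the class and method names are hypothetical, real storage would be persistent, and the message tree (`chatService`) and media storage are left out. The ID format, the title rule (first user message, truncated), and the newest-first ordering are assumptions.

```typescript
interface StoredMessage {
  role: string;
  content: string;
}

interface ChatData {
  chatId?: string;
  messages: StoredMessage[];
  title?: string;
  isTemp?: boolean;
  metadata?: Record<string, unknown>;
}

interface StoredChat {
  chatId: string;
  title: string;
  messages: StoredMessage[];
  metadata: Record<string, unknown>;
  savedOrder: number;
}

// Hypothetical in-memory stand-in for the persistent history store.
class InMemoryChatHistory {
  private chats = new Map<string, StoredChat>();
  private sequence = 0;

  // Returns the chat ID, or null when the chat is temporary and was not saved.
  save(chatData: ChatData): string | null {
    if (chatData.isTemp) return null;
    const savedOrder = ++this.sequence;
    // Assumed ID scheme: timestamp plus a per-instance counter.
    const chatId = chatData.chatId ?? `chat-${Date.now()}-${savedOrder}`;
    const firstUser = chatData.messages.find((m) => m.role === "user");
    // Assumed title rule: first 40 characters of the first user message.
    const title =
      chatData.title ?? (firstUser ? firstUser.content.slice(0, 40) : "Untitled chat");
    this.chats.set(chatId, {
      chatId,
      title,
      messages: chatData.messages.slice(),
      metadata: chatData.metadata ?? {},
      savedOrder,
    });
    return chatId;
  }

  // Returns up to `limit` chats, newest first, after skipping `offset` chats.
  list(offset = 0, limit = 20): StoredChat[] {
    return Array.from(this.chats.values())
      .sort((a, b) => b.savedOrder - a.savedOrder)
      .slice(offset, offset + limit);
  }
}

const history = new InMemoryChatHistory();
history.save({ messages: [{ role: "user", content: "What is WebGPU?" }] });
console.log(history.list().map((chat) => chat.title));
```

Calling `list(20)` would return the second page, since the first 20 chats are skipped and the default limit applies.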
Usage Examples
AI Chat with Streaming
Text-to-Speech with Kokoro
Speech-to-Text Recording
Translation with Streaming