Type Definitions
RuntimeMode
type RuntimeMode = "online" | "local";
Defines the execution mode for speech-to-text and AI models.
online: Uses cloud-based API services for processing
local: Uses locally installed models (Ollama, local STT)
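Because `RuntimeMode` is a string-literal union, values read from untyped sources (a config file, a query parameter) need narrowing before use. The following is a minimal sketch of one way to do that; `isRuntimeMode` and `parseRuntimeMode` are hypothetical helper names and are not part of the documented API.

```typescript
type RuntimeMode = "online" | "local";

// Hypothetical type guard: narrows an unknown value to RuntimeMode.
function isRuntimeMode(value: unknown): value is RuntimeMode {
  return value === "online" || value === "local";
}

// Hypothetical helper: returns the caller-supplied fallback for unrecognized input.
function parseRuntimeMode(value: unknown, fallback: RuntimeMode): RuntimeMode {
  return isRuntimeMode(value) ? value : fallback;
}

const fromConfig = parseRuntimeMode("local", "online");
const fromGarbage = parseRuntimeMode(42, "online");
```

Here `fromConfig` keeps the supplied mode, while `fromGarbage` falls back to `"online"` because `42` is not a valid mode.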
CaptureMode
type CaptureMode = "single-tap" | "push-to-talk";
Defines how audio input is captured from the user.
single-tap: Recording starts and stops automatically
push-to-talk: Hold the key/button to record, release to stop
TtsEngine
type TtsEngine = "piper" | "coqui";
Specifies which text-to-speech engine to use.
piper: Fast, lightweight TTS engine (default)
coqui: High-quality TTS with voice cloning (requires Python)
StyleProfile
type StyleProfile = "adaptive" | "professional" | "casual" | "concise" | "developer";
Controls the response style of the AI assistant.
adaptive: Adjusts tone based on context
professional: Formal, business-appropriate responses
casual: Friendly, conversational tone
concise: Brief, to-the-point answers
developer: Technical, code-focused responses
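One way to keep per-style data in sync with the union is a `Record<StyleProfile, ...>` lookup, which fails to compile if a profile is missing. The sketch below reuses the descriptions from this reference as values; `STYLE_DESCRIPTIONS` and `describeStyle` are hypothetical names chosen for illustration.

```typescript
type StyleProfile = "adaptive" | "professional" | "casual" | "concise" | "developer";

// Record<StyleProfile, string> forces one entry per union member at compile time.
const STYLE_DESCRIPTIONS: Record<StyleProfile, string> = {
  adaptive: "Adjusts tone based on context",
  professional: "Formal, business-appropriate responses",
  casual: "Friendly, conversational tone",
  concise: "Brief, to-the-point answers",
  developer: "Technical, code-focused responses",
};

// Hypothetical helper: looks up the description for a given profile.
function describeStyle(profile: StyleProfile): string {
  return STYLE_DESCRIPTIONS[profile];
}
```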
ThemeMode
type ThemeMode = "system" | "light" | "dark";
Determines the application’s visual theme.
DictationLanguageMode
type DictationLanguageMode = "single" | "multiple";
Controls language support for dictation.
multiple: Supports multiple languages from the allow list
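As a rough illustration of how the mode might interact with the allow list, the sketch below picks the candidate languages for a dictation session. It assumes that `"single"` mode uses only the primary language and that an empty allow list falls back to it; neither behavior is confirmed by this reference. `DictationLanguageSettings` and `candidateLanguages` are hypothetical names.

```typescript
type DictationLanguageMode = "single" | "multiple";

// Hypothetical subset of the settings fields relevant to language selection.
interface DictationLanguageSettings {
  dictationLanguage: string;
  dictationLanguageMode: DictationLanguageMode;
  dictationLanguageAllowList: string[];
}

// Assumption: "single" mode (or an empty allow list) uses only the primary language.
function candidateLanguages(s: DictationLanguageSettings): string[] {
  if (s.dictationLanguageMode === "single" || s.dictationLanguageAllowList.length === 0) {
    return [s.dictationLanguage];
  }
  // Return a copy so callers cannot mutate the stored allow list.
  return s.dictationLanguageAllowList.slice();
}
```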
Stage
type Stage = "idle" | "recording" | "processing" | "speaking" | "error";
Represents the current state of the assistant pipeline.
processing: Transcribing and generating response
error: An error occurred in the pipeline
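Code that reacts to the pipeline state can use an exhaustive `switch` so that adding a new `Stage` member produces a compile error until it is handled. This is a sketch only: `describeStage` is a hypothetical helper, and the status strings for `idle`, `recording`, and `speaking` are illustrative rather than taken from the application.

```typescript
type Stage = "idle" | "recording" | "processing" | "speaking" | "error";

// Hypothetical helper: maps each stage to a status string.
function describeStage(stage: Stage): string {
  switch (stage) {
    case "idle":
      return "Waiting for input"; // illustrative text
    case "recording":
      return "Recording audio"; // illustrative text
    case "processing":
      return "Transcribing and generating response";
    case "speaking":
      return "Speaking response"; // illustrative text
    case "error":
      return "An error occurred in the pipeline";
    default: {
      // Compile-time exhaustiveness check: type-checking fails if a Stage is unhandled.
      const unreachable: never = stage;
      throw new Error(`Unhandled stage: ${String(unreachable)}`);
    }
  }
}
```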
Interfaces
HotkeySpec
interface HotkeySpec {
  ctrl: boolean;
  shift: boolean;
  alt: boolean;
  meta: boolean;
  key: string;
  label: string;
}
Defines a keyboard shortcut specification.
ctrl: Whether the Ctrl/Control key is pressed
shift: Whether the Shift key is pressed
alt: Whether the Alt/Option key is pressed
meta: Whether the Meta/Command/Windows key is pressed
key: The primary key (e.g., “Space”, “A”)
label: Human-readable representation (e.g., “Ctrl+Space”)
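The `label` field can be derived from the other fields. The sketch below builds a label such as “Ctrl+Space” from the modifier flags and the primary key; `buildLabel` is a hypothetical helper, and the modifier order and the name “Meta” are assumptions, since the reference does not say how the application formats labels.

```typescript
interface HotkeySpec {
  ctrl: boolean;
  shift: boolean;
  alt: boolean;
  meta: boolean;
  key: string;
  label: string;
}

// Hypothetical helper: joins active modifiers and the primary key with "+".
// Assumed order: Ctrl, Shift, Alt, Meta, then the key.
function buildLabel(spec: Omit<HotkeySpec, "label">): string {
  const parts: string[] = [];
  if (spec.ctrl) parts.push("Ctrl");
  if (spec.shift) parts.push("Shift");
  if (spec.alt) parts.push("Alt");
  if (spec.meta) parts.push("Meta");
  parts.push(spec.key);
  return parts.join("+");
}

const base = { ctrl: true, shift: false, alt: false, meta: false, key: "Space" };
const pushToTalk: HotkeySpec = { ...base, label: buildLabel(base) };
```

Deriving the label in one place avoids a spec whose flags and display text disagree.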
PersistedSettings
interface PersistedSettings {
  // API Configuration
  apiKey: string;
  apiBaseUrl: string;
  sttModelName: string;
  aiModelName: string;
  rememberApiKey: boolean;
  // Runtime Modes
  runtimeMode: RuntimeMode;
  sttRuntimeMode: RuntimeMode;
  aiRuntimeMode: RuntimeMode;
  // Local Configuration
  localOllamaBaseUrl: string;
  localOllamaModel: string;
  localSttModel: string;
  // Input & Capture
  captureMode: CaptureMode;
  microphoneDeviceId: string;
  pushToTalkHotkey: string;
  commandHotkey: string;
  // Dictation
  dictationLanguage: string;
  dictationLanguageMode: DictationLanguageMode;
  dictationLanguageAllowList: string[];
  autoPasteDictation: boolean;
  dictationSoundEffects: boolean;
  muteMusicWhileDictating: boolean;
  // Assistant Behavior
  styleProfile: StyleProfile;
  systemPrompt: string;
  temperature: number;
  maxTokens: number;
  commandMode: boolean;
  wakeWordEnabled: boolean;
  assistantName: string;
  contextAwareness: boolean;
  copyToClipboard: boolean;
  incognitoMode: boolean;
  // Text Processing
  backtrackCorrection: boolean;
  removeFillers: boolean;
  autoPunctuation: boolean;
  numberedLists: boolean;
  // TTS Configuration (Piper)
  ttsEngine: TtsEngine;
  piperPath: string;
  piperSpeed: number;
  piperQuality: PiperQuality;
  piperEmotion: PiperEmotion;
  // TTS Configuration (Coqui)
  coquiPythonPath: string;
  coquiModelName: string;
  coquiLanguage: string;
  coquiVoiceId: string;
  coquiSpeed: number;
  coquiQuality: CoquiQuality;
  coquiEmotion: CoquiEmotion;
  coquiUseGpu: boolean;
  coquiSplitSentences: boolean;
  // UI & Display
  launchAtLogin: boolean;
  showFlowBar: boolean;
  showAppInDock: boolean;
  themeMode: ThemeMode;
}
Complete application settings stored in local storage.
apiKey: API key for cloud services
apiBaseUrl: Base URL for API endpoints
sttModelName: Speech-to-text model identifier
aiModelName: AI language model identifier
rememberApiKey: Whether to persist the API key between sessions
runtimeMode: Overall runtime mode (online/local)
sttRuntimeMode: Speech-to-text runtime mode
localSttModel: Local STT model identifier
microphoneDeviceId: Selected microphone device ID
pushToTalkHotkey: Hotkey for push-to-talk activation
dictationLanguage: Primary dictation language code
dictationLanguageMode: Single or multiple language support
dictationLanguageAllowList: Allowed language codes for multi-language mode
autoPasteDictation: Automatically paste transcribed text
dictationSoundEffects: Enable audio feedback during dictation
muteMusicWhileDictating: Pause music playback during dictation
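systemPrompt: Custom system prompt for the AI
temperature: AI temperature parameter (0-2)
maxTokens: Maximum tokens in the AI response
commandMode: Enable command mode functionality
wakeWordEnabled: Enable wake word detection
contextAwareness: Use screen/selection context in responses
copyToClipboard: Copy responses to the clipboard
incognitoMode: Disable conversation history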
backtrackCorrection: Enable voice backtrack correction
removeFillers: Remove filler words (um, uh, etc.)
autoPunctuation: Automatically add punctuation
numberedLists: Convert spoken numbers to formatted lists
piperSpeed: Playback speed multiplier
piperQuality: Audio quality (fast/balanced/high)
coquiPythonPath: Path to Python with Coqui TTS
coquiSpeed: Playback speed multiplier
coquiQuality: Audio quality (fast/balanced/high)
coquiSplitSentences: Process text sentence-by-sentence
launchAtLogin: Start the app when the user logs in
showFlowBar: Display the floating status bar
showAppInDock: Show the app icon in the dock/taskbar
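Since the settings are stored in local storage, loading them typically means parsing a JSON string and overlaying it on defaults. The following is a minimal sketch under stated assumptions: it uses a three-field subset of `PersistedSettings`, the default values are invented for illustration, and `loadSettings` and `clampTemperature` are hypothetical helpers. It takes the raw string as an argument (for example the result of `localStorage.getItem`) so that it also runs outside a browser.

```typescript
// Illustrative subset of PersistedSettings; the full interface has many more fields.
interface SettingsSubset {
  temperature: number;
  maxTokens: number;
  themeMode: "system" | "light" | "dark";
}

// Hypothetical defaults; the application's real defaults are not given in this reference.
const DEFAULTS: SettingsSubset = { temperature: 0.7, maxTokens: 1024, themeMode: "system" };

// Keeps temperature inside the documented 0-2 range; NaN falls back to the default.
function clampTemperature(value: number): number {
  if (Number.isNaN(value)) return DEFAULTS.temperature;
  return Math.min(2, Math.max(0, value));
}

// Parses a raw JSON string and overlays any stored fields on the defaults.
function loadSettings(raw: string | null): SettingsSubset {
  let stored: Partial<SettingsSubset> = {};
  if (raw !== null) {
    try {
      const value: unknown = JSON.parse(raw);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        stored = value as Partial<SettingsSubset>;
      }
    } catch {
      // Corrupt JSON: keep the defaults rather than throwing.
    }
  }
  const merged: SettingsSubset = { ...DEFAULTS, ...stored };
  merged.temperature = clampTemperature(Number(merged.temperature));
  return merged;
}
```

Overlaying stored values on defaults means fields added in a later version still get a value when older stored data is loaded. A production version would likely validate every field, not only `temperature`.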