Core Modules (klaus/)
config.py (527 lines)
Purpose: Configuration management via TOML + environment variables
Responsibilities:
- Load and parse
~/.klaus/config.toml - Fall back to
.envfor backward compatibility - Define constants for models, voice settings, system prompt
- Manage query router thresholds and feature flags
- Provide runtime setters for camera/mic device changes
- Dynamic system prompt assembly with user background
klaus/config.py:_build_system_prompt()):
- Base prompt defines Klaus’s role and response style
- Appends
user_backgroundfrom config if present - Injected into every Claude call
ENABLE_QUERY_ROUTER: toggle query routing (defaultTrue)ROUTER_LOCAL_CONFIDENCE_THRESHOLD: min confidence for local route (default 0.75)ROUTER_TIMEOUT_MS: LLM router timeout (default 350ms)
main.py (941 lines)
Purpose: Application entry point and component wiring
Responsibilities:
- Initialize PyQt6 application
- Gate first-run setup wizard
- Wire all components: Brain, Camera, Audio, TTS, Memory, Notes
- Manage hotkey listeners (pynput + Qt)
- Handle Qt signal bridge for thread-safe UI updates
- Implement live device switching with rollback on failure
- Reload API clients after settings changes
- Qt key events:
MainWindow.keyPressEvent/keyReleaseEvent(focus required) - pynput global hotkeys: Works system-wide but requires macOS Accessibility permission
- Both backends run in parallel; pynput starts gracefully with fallback on failure
klaus/main.py:_safe_slot):
abort() when exceptions escape signal handlers.
brain.py (440 lines)
Purpose: Claude vision + tool-use orchestration
Responsibilities:
- Manage conversation history
- Assemble context based on route decision
- Stream Claude responses sentence-by-sentence
- Execute tool calls (web search, notes)
- Enforce sentence caps for definitions
- Strip old images from history to save tokens
klaus/brain.py:115):
- Build user content: image block (optional) + text block
- Build system prompt: base + memory + notes + turn instruction + sentence cap
- Build message history: full or windowed based on route policy
klaus/brain.py:227):
klaus/brain.py:154):
- Max 5 rounds of tool use
- Each round: stream response → execute tools → append to messages → continue
- Tools:
web_search,set_notes_file,save_note
klaus/brain.py:388):
- Keep image in most recent user message only
- Strip from older messages to reduce token usage
query_router.py (458 lines)
Purpose: Hybrid local + LLM query classification
Responsibilities:
- Classify questions into route modes: standalone definition, page-grounded definition, general contextual
- Use fast local heuristics for most questions
- Fall back to LLM router with strict timeout for ambiguous cases
- Map route mode to context policy (image, history, memory, notes, sentence cap)
klaus/query_router.py:191):
- Pattern matching:
definition,doc_ref,deictic,spatial,concision,general - Weighted scoring for each route mode
- Confidence computed from top score vs. second score
- If confidence ≥ 0.75 and margin ≥ 0.20, use local decision
klaus/query_router.py:248):
- Model:
claude-haiku-4-5 - Timeout: 350ms (configurable)
- Max tokens: 80
- Returns JSON:
{mode, confidence, reason}
klaus/query_router.py:353):
audio.py (486 lines)
Purpose: Audio input and output
Responsibilities:
- Push-to-talk recording
- Voice-activated recording with WebRTC VAD
- Quality gates for VAD (duration, voiced ratio, RMS loudness, voiced run)
- Stream suspension/resumption for TTS
- Audio playback
klaus/audio.py:332):
min_duration: 0.5s (reject clicks/coughs)min_voiced_frames: 8 (reject 240ms of voiced content)min_voiced_ratio: 0.28 (reject mostly silence)min_voiced_run_frames: 6 (reject stuttered noise)min_rms_dbfs: -45.0 (reject very quiet utterances)
camera.py (164 lines)
Purpose: Background camera capture
Responsibilities:
- OpenCV background thread
- Frame capture with thread-safe access
- Auto-rotation for portrait frames
- Base64 JPEG encoding for Claude
- Thumbnail generation for chat UI
klaus/camera.py:45):
- Config:
camera_rotation=auto,none,90,180,270 auto: detects portrait (h > w) and rotates 90° CW- Fixed angles use OpenCV
cv2.rotate()
klaus/camera.py:120):
tts.py (248 lines)
Purpose: Text-to-speech with streaming playback
Responsibilities:
- OpenAI TTS API calls
- Sentence-level batching (max 4000 chars per API call)
- Persistent
sd.OutputStreamto avoid macOS crackling - Parallel synthesis + playback via queues
- Responsive stop via small write blocks
- Sentence queue: Main thread puts sentences, synthesis worker pulls
- Audio queue: Synthesis worker puts WAV bytes, playback loop pulls
- Persistent stream: One
sd.OutputStreamper session (avoids repeated create/destroy)
klaus/tts.py:233):
stt.py (103 lines)
Purpose: Local speech-to-text via Moonshine
Responsibilities:
- Load Moonshine model from cache
- Transcribe WAV bytes to text
- Model and language configurable
- First use triggers download to
~/.cache/moonshine/ - Setup wizard includes model download step with progress bar
memory.py (254 lines)
Purpose: SQLite persistence
Responsibilities:
- Create and migrate database schema
- Save/load sessions and exchanges
- Knowledge profile (unused, future feature)
- Generate session titles from first question
notes.py (100 lines)
Purpose: Obsidian vault integration
Responsibilities:
- Set active note file
- Append content to notes
- Expose
set_notes_fileandsave_notetools to Claude
klaus/notes.py:50):
search.py (50 lines)
Purpose: Tavily web search
Responsibilities:
- Execute web searches via Tavily API
- Format results for Claude
- Expose
web_searchtool
device_catalog.py (221 lines)
Purpose: Shared camera/mic enumeration
Responsibilities:
- Enumerate cameras with OpenCV
- Get native camera names via AVFoundation on macOS
- Enumerate microphones with sounddevice
- Disambiguate duplicate mic labels
- Mark system default devices
UI Modules (klaus/ui/)
theme.py (586 lines)
Purpose: Centralized styling
Responsibilities:
- Define palette tokens (colors, fonts, dimensions)
- Single
application_stylesheet()QSS string - Apply dark title bar on Windows via DWM API
- Load bundled Inter font files
main_window.py (204 lines)
Purpose: Top-level window layout
Responsibilities:
- Splitter layout: session panel | chat widget
- Header with settings button
- Qt key event forwarding for in-app hotkeys
chat_widget.py (260 lines)
Purpose: Scrollable chat feed
Responsibilities:
- Display message cards (user + assistant)
- Show page thumbnails when image context used
- Replay button for re-speaking responses
- Auto-scroll to bottom on new messages
session_panel.py (190 lines)
Purpose: Session list sidebar
Responsibilities:
- Display all sessions with titles
- Context menu: rename, delete
- Switch active session on click
- Relative timestamps (“5 minutes ago”)
setup_wizard.py (904 lines)
Purpose: First-run setup wizard
Responsibilities:
- 7-step wizard: welcome, API keys, camera, mic, model download, user background, done
- Validate API keys by format (prefix + length)
- Test camera/mic devices
- Download Moonshine model with progress bar
- Persist settings to
config.toml
settings_dialog.py (443 lines)
Purpose: Settings UI
Responsibilities:
- Tabbed dialog: API Keys, Devices, Profile
- Immediate apply for camera/mic (no Save button)
- Emit signals for device changes
- Reload API clients on Save
status_widget.py (120 lines)
Purpose: Status bar at bottom of window
Responsibilities:
- Display status: Idle, Listening, Thinking, Speaking
- Mode toggle button (PTT ↔ VAD)
- Stop button to interrupt TTS
- Color-coded status indicator
camera_widget.py (71 lines)
Purpose: Live camera preview
Responsibilities:
- Display camera feed at ~30 fps
- Convert OpenCV BGR → Qt QPixmap
- Scale to fit widget
Next Steps
- Query Routing — Deep dive into routing logic and policies
- Data Flow — End-to-end data flow walkthrough