Get your voice-powered research assistant up and running quickly. This guide walks you through installation, setup, and your first question.

Installation

Klaus is available via Homebrew for macOS and pipx for Windows. Both methods handle all dependencies automatically.
brew tap bgigurtsis/klaus
brew install klaus
macOS 26 + Python 3.14: Global hotkeys can crash on this combination. Klaus automatically disables global hotkeys and keeps in-app hotkeys active. Use Python 3.13 for stable global hotkeys.
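The fallback described in the note above can be pictured as a simple version check. This is a minimal sketch, not Klaus's actual code: the function names `global_hotkeys_enabled` and `current_macos_major` are hypothetical, and the check matches only the exact macOS 26 + Python 3.14 combination the note mentions.

```python
import platform
import sys


def global_hotkeys_enabled(system, macos_major, python_version):
    """Return False only for the macOS 26 + Python 3.14 combination.

    Hypothetical helper: Klaus's real check may differ.
    """
    risky = (
        system == "Darwin"
        and macos_major == 26
        and tuple(python_version[:2]) == (3, 14)
    )
    return not risky


def current_macos_major():
    """Major macOS version as an int, or None when not running on macOS."""
    release = platform.mac_ver()[0]  # empty string on non-macOS systems
    return int(release.split(".")[0]) if release else None


if __name__ == "__main__":
    enabled = global_hotkeys_enabled(
        platform.system(), current_macos_major(), sys.version_info
    )
    print("global hotkeys enabled:", enabled)
```

Running the script under Python 3.13 on any system prints `global hotkeys enabled: True`, which matches the recommendation to use Python 3.13 for stable global hotkeys.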

First Launch

Launch Klaus from your terminal:
klaus
On first launch, you’ll see a 7-step setup wizard that configures everything you need.
1. Welcome

Read the introduction to Klaus and click Next to begin setup.
2. API Keys

Enter your API keys from three providers.
macOS: Keys are stored securely in Apple Keychain. Any existing plaintext keys in ~/.klaus/config.toml are automatically migrated.
Windows: Keys are stored in ~/.klaus/config.toml.
3. Camera Selection

Choose your camera device. Klaus needs a camera to see what you’re reading.
A USB document camera (visualiser) or phone on a gooseneck mount works best. Klaus auto-detects portrait orientation and rotates the image.
4. Microphone Test

Select your microphone and test the audio level. Speak into the mic and verify the level indicator responds.
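To give a sense of what a level indicator typically measures, here is a minimal sketch that computes the RMS level of a chunk of audio in dBFS (decibels relative to full scale). It assumes 16-bit little-endian mono PCM input; the function name `level_dbfs` is hypothetical, and the wizard's actual meter may work differently.

```python
import math
import struct


def level_dbfs(pcm_bytes):
    """RMS level of 16-bit little-endian mono PCM, in dB relative to full scale.

    Returns -inf for empty or completely silent input.
    """
    count = len(pcm_bytes) // 2
    if count == 0:
        return float("-inf")
    samples = struct.unpack(f"<{count}h", pcm_bytes[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / 32768)


if __name__ == "__main__":
    # A half-amplitude square wave sits about 6 dB below full scale.
    chunk = struct.pack("<4h", 16384, -16384, 16384, -16384)
    print(round(level_dbfs(chunk), 2))
```

If the indicator does not respond when you speak, the readings are likely staying near silence, which usually points to the wrong input device or a muted microphone.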
5. Voice Model Download

Klaus downloads Moonshine Medium (~245 MB), a local speech-to-text model. The download happens on first use; after that, the model runs entirely on your device with no API cost.
6. About You (Optional)

Add your background, research focus, or reading interests. Klaus uses this context to tailor responses.
You can also configure your Obsidian vault path here for hands-free note-taking.
7. Setup Complete

Click “Start Klaus” to launch the main application.
macOS input monitoring: macOS may prompt you to grant your terminal Accessibility (input monitoring) permission. This is needed for global hotkeys to work when Klaus is not focused. You can deny the prompt and use the in-app UI buttons instead.

Ask Your First Question

Once Klaus is running:
  1. Place a document under your camera so Klaus can see the page
  2. Ask a question out loud, or press the push-to-talk (PTT) key (default: F2)
  3. Wait 2-4 seconds for Klaus to read the page and respond
Klaus captures the page image, sends it to Claude’s vision API along with your transcript, reasons about the question, and responds aloud through text-to-speech.
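The flow described above can be sketched as a single function that wires the four stages together. This is an illustration under stated assumptions, not Klaus's implementation: `answer_question`, `Turn`, and the four callables are hypothetical names, and each stage is injected so the sketch runs without a camera, microphone, or API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    transcript: str
    answer: str


def answer_question(
    transcribe: Callable[[], str],
    capture_page: Callable[[], bytes],
    ask_vision_model: Callable[[bytes, str], str],
    speak: Callable[[str], None],
) -> Turn:
    """Run one question/answer turn: transcript + page image -> spoken answer."""
    transcript = transcribe()                     # speech-to-text result
    image = capture_page()                        # current camera frame
    answer = ask_vision_model(image, transcript)  # vision model sees both
    speak(answer)                                 # text-to-speech playback
    return Turn(transcript, answer)


if __name__ == "__main__":
    # Stub stages standing in for the real microphone, camera, API, and TTS.
    turn = answer_question(
        transcribe=lambda: "What does this paragraph claim?",
        capture_page=lambda: b"<jpeg bytes>",
        ask_vision_model=lambda img, q: f"(answer to: {q})",
        speak=print,
    )
```

Keeping the stages as separate callables reflects the description: the page image and the transcript are gathered independently and only meet when the request is sent to the vision model.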

Input Modes

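Klaus supports two input modes: voice activation and push-to-talk (PTT).
In voice-activation mode, simply start speaking. Klaus detects your voice automatically via WebRTC VAD (voice activity detection). After a brief silence (~1.5 s), it finalizes your question and starts processing.
In push-to-talk mode, press the PTT key (default: F2) to ask your question.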
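The silence-based cutoff can be sketched as a counter over per-frame speech decisions. This is a simplified illustration, not Klaus's code: it assumes the VAD has already labelled each frame as speech or non-speech, uses 30 ms frames (one of the frame sizes WebRTC VAD accepts), and the name `find_endpoint` is hypothetical.

```python
import math


def find_endpoint(frames, frame_ms=30, silence_ms=1500):
    """Return the index of the frame where the utterance is finalized.

    frames: iterable of bools, True when the VAD flagged the frame as speech.
    The utterance ends once `silence_ms` of continuous non-speech follows
    at least one speech frame. Returns None if that never happens.
    """
    needed = math.ceil(silence_ms / frame_ms)  # 50 frames at the defaults
    heard_speech = False
    silent_run = 0
    for index, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0  # any speech resets the silence timer
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return index
    return None


if __name__ == "__main__":
    # 0.3 s of leading silence, 0.6 s of speech, then 1.8 s of silence.
    frames = [False] * 10 + [True] * 20 + [False] * 60
    print(find_endpoint(frames))
```

Leading silence is ignored because the counter only starts after speech has been heard, and a short pause mid-sentence does not end the question because any speech frame resets the counter.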

Toggle key: § (macOS) or F3 (Windows)
Toggle between modes anytime:
  • Press the toggle key (§ on macOS, F3 on Windows)
  • Click the mode button in the UI
On macOS, F-keys trigger system actions by default (F3 = Mission Control). Press Fn + F3 to use the toggle, enable “Use standard function keys” in System Settings, or configure a different key in ~/.klaus/config.toml.

Updating Klaus

Keep Klaus up to date with the latest features and fixes:
brew upgrade klaus

Next Steps

Camera Setup

Configure document cameras, phone mounts, and Continuity Camera

Configuration

Customize hotkeys, TTS voice, camera settings, and more

Obsidian Integration

Set up hands-free note-taking to your vault

Usage Tips

Learn advanced features and optimize your workflow
