Launch the application
With your virtual environment activated, start ChatbotAI-Free.
On first launch, Whisper downloads the base model (~140 MB). This happens automatically and only needs to be done once.
The app also scans the voices/ folder and detects available voices. If Kokoro files are present, you’ll see 54 English and Spanish voices available immediately.
Your first conversation
Select your model and voice
At the top of the window, you’ll see two dropdowns:
- Model selector (left): Choose your Ollama LLM (e.g., llama3.1:8b)
- Voice selector (right): Choose a voice (e.g., af_bella for English female)
The app automatically detects all available Ollama models. If you don’t see your model, make sure you’ve pulled it with ollama pull <model-name>.
Start with text or voice
You can interact in two ways:
Text input:
- Type your message in the text box at the bottom
- Press Enter to send (or Shift+Enter for a new line)
Voice input:
- Click the 🎤 microphone button to start recording
- Speak your message
- Click the button again to stop and send
You can enable auto-send in Settings to automatically send when you stop speaking (using Voice Activity Detection).
The app will:
- Transcribe your speech (if using voice)
- Show your message in a user bubble (right-aligned)
- Stream the AI’s response in real-time
- Generate and play TTS audio simultaneously
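Streaming text and speaking it at the same time usually means cutting the token stream into sentences, so that each sentence can go to TTS while the rest is still being generated. The sketch below shows one way to do that. The function name and the sentence-boundary rule are assumptions made for illustration, not ChatbotAI-Free's actual code.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: ., ! or ? followed by whitespace ends a sentence.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it in the buffer.
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    streamed = ["Hello", " there. How", " are you? I am", " fine"]
    for sentence in sentences_from_stream(streamed):
        print(sentence)  # each sentence could be handed to a TTS queue here
```

Splitting on sentences rather than on fixed-size chunks keeps the audio natural, at the cost of a short delay before the first sentence is complete.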
Understand the UI elements
Chat area:
- Your messages appear in gray bubbles on the right
- AI responses appear on the left with a ✨ avatar and markdown rendering
- Code blocks, tables, and formatting are rendered automatically
Status indicator:
- “Ready” - Waiting for input
- “Transcribing…” - Converting speech to text
- “Thinking…” - LLM is generating response
- “Speaking…” - TTS is playing audio
Context donut:
- Shows conversation token usage as a colored ring
- Green below 50%, yellow from 50% to 80%, red above 80%
- Click to see detailed context window stats
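The color thresholds of the context ring can be expressed as a small function. This is only an illustrative sketch: the function name is hypothetical, and treating exactly 80% as yellow is an assumption, since the thresholds above do not say which color applies at the boundary.

```python
def ring_color(used_tokens: int, context_window: int) -> str:
    """Map context usage to a ring color using the thresholds listed above."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    usage = used_tokens / context_window
    if usage < 0.5:
        return "green"
    if usage <= 0.8:  # assumption: exactly 80% still counts as yellow
        return "yellow"
    return "red"


if __name__ == "__main__":
    # 8192 is just an example context window size.
    for used in (1000, 5000, 8000):
        print(used, ring_color(used, 8192))
```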
Try interrupting the AI
While the AI is speaking, click the ⏹ Stop button to interrupt mid-response. This is useful when:
- The AI goes off-topic
- You want to ask a follow-up immediately
- The response is taking too long
In Live Mode (hands-free), you can naturally interrupt the AI by speaking—just like a real conversation. The app detects your voice and stops playback immediately.
Save and manage conversations
Conversations are auto-saved as Markdown files in the chat_history/ folder.
Sidebar navigation:
- Click the ☰ hamburger button to toggle the chat history sidebar
- Click ➕ New Chat to start a fresh conversation
- Click any saved chat to resume it
- Right-click a chat to rename or delete it
Chat titles:
- Generated automatically by the lightest available Ollama model
- Based on the first user message
- Can be renamed manually by right-clicking
Each chat maintains its own conversation history and context. Switching chats loads the full history and updates the context donut.
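To make "lightest available Ollama model" concrete, here is a minimal sketch that lists local models through Ollama's /api/tags endpoint and picks the one with the smallest size on disk. Interpreting "lightest" as smallest file size is an assumption, and the function names are hypothetical. The app may use a different rule.

```python
import json
from typing import Dict, List, Optional
from urllib.request import urlopen

# Ollama's default local address; change it if your server runs elsewhere.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def fetch_models(url: str = OLLAMA_TAGS_URL) -> List[Dict]:
    """Return the list of locally pulled models reported by Ollama."""
    with urlopen(url, timeout=5) as response:
        return json.load(response).get("models", [])


def lightest_model(models: List[Dict]) -> Optional[str]:
    """Pick the model with the smallest size in bytes (assumed meaning of 'lightest')."""
    if not models:
        return None
    return min(models, key=lambda m: m.get("size", float("inf")))["name"]


if __name__ == "__main__":
    try:
        print(lightest_model(fetch_models()))
    except OSError as error:
        print(f"Could not reach Ollama: {error}")
```

An empty list from fetch_models usually means no model has been pulled yet, which is the same situation as an empty model selector.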
Explore conversation modes
Classic chat mode (default)
Turn-by-turn conversations with full control:
- Type or speak your message
- See markdown-rendered responses in real-time
- Interrupt with the Stop button
- Perfect for detailed discussions, coding help, or document analysis
Live mode (hands-free)
Click the ✨ Live button for continuous conversation:
- Completely hands-free interaction
- Voice Activity Detection (VAD) automatically detects when you stop speaking
- Natural barge-in: interrupt the AI mid-sentence by speaking
- The app mutes itself when you click the Live button again
Barge-in detection monitors your voice in real-time. When you start speaking while the AI is talking, playback stops immediately and the app starts listening to you.
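The idea behind barge-in can be sketched with a simple energy check: while the AI is speaking, the microphone is read in short frames, and playback stops once several consecutive frames are loud enough to count as speech. This is a deliberately simplified stand-in, not the app's actual detector. The threshold, the frame count, and the function names are all assumptions.

```python
import math
from typing import Iterable, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in the range -1.0 to 1.0)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def should_barge_in(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.05,      # assumed energy level that counts as speech
    min_voiced_frames: int = 3,   # assumed number of consecutive loud frames required
) -> bool:
    """Return True once enough consecutive frames exceed the energy threshold."""
    voiced = 0
    for frame in frames:
        if rms(frame) >= threshold:
            voiced += 1
            if voiced >= min_voiced_frames:
                return True
        else:
            voiced = 0  # a quiet frame resets the streak
    return False


if __name__ == "__main__":
    quiet = [0.001] * 160
    loud = [0.2] * 160
    print(should_barge_in([quiet, loud, loud, loud]))  # sustained speech
    print(should_barge_in([quiet, loud, quiet, loud]))  # isolated spikes only
```

Requiring several consecutive frames avoids stopping playback on a single click or cough. A real detector also has to cope with the speaker output leaking into the microphone, which this sketch ignores.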
Advanced features
Attach PDF documents
- Click the 📎 Attach button
- Select a PDF file from your computer
- Review the confirmation dialog showing:
  - Total tokens in the document
  - Current context usage
  - Whether it fits in the model’s context window
- Click Inject to add it to the conversation
PDF text is extracted with PyMuPDF and injected directly into the conversation history. No vector database or RAG pipeline required.
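A rough sketch of this flow: extract the text with PyMuPDF, estimate its token count, and check whether it fits next to the tokens already used. The four-characters-per-token estimate is a common rule of thumb used here as an assumption, since the app's real tokenizer is not documented in this guide. The function names are hypothetical.

```python
import math


def extract_pdf_text(path: str) -> str:
    """Extract plain text from every page of a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so the helpers below work without it

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token (an assumption)."""
    return math.ceil(len(text) / 4)


def fits_in_context(doc_tokens: int, used_tokens: int, context_window: int) -> bool:
    """True if the document fits alongside the tokens already in the conversation."""
    return used_tokens + doc_tokens <= context_window


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        tokens = estimate_tokens(extract_pdf_text(sys.argv[1]))
        # 2000 used tokens and an 8192-token window are example values.
        print(tokens, fits_in_context(tokens, used_tokens=2000, context_window=8192))
```

Because the whole document is injected as text, a long PDF can fill most of the context window on its own, which is why the confirmation dialog shows the totals before you click Inject.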
Reading practice mode
Click the 📖 Practice button to enter shadowing coach mode:
- Paste or type text you want to practice reading
- Click Start
- Read aloud while the app listens
- Watch each word change color:
  - Green = pronounced correctly
  - Red = mispronounced
  - Gray = not yet read
- Click any word to hear its pronunciation via TTS
- When finished, see your grade (A+ → F) and accuracy stats
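One plausible way to score a finished reading is to align the target text with the transcript and mark matched words green and the rest red. The sketch below does this with Python's difflib. The alignment method, the grade cut-offs, and all function names are assumptions for illustration. The app's real scoring rules are not described here, and the gray "not yet read" state is left out because the sketch only handles a completed reading.

```python
import re
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical grade cut-offs, from highest to lowest.
GRADE_THRESHOLDS = [(0.97, "A+"), (0.90, "A"), (0.80, "B"), (0.70, "C"), (0.60, "D")]


def words(text: str) -> List[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())


def mark_words(target: str, spoken: str) -> List[Tuple[str, str]]:
    """Pair each target word with 'green' (matched in the transcript) or 'red'."""
    expected, heard = words(target), words(spoken)
    status = ["red"] * len(expected)
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    for block in matcher.get_matching_blocks():
        for index in range(block.a, block.a + block.size):
            status[index] = "green"
    return list(zip(expected, status))


def accuracy(marked: List[Tuple[str, str]]) -> float:
    """Fraction of target words that were matched."""
    if not marked:
        return 0.0
    return sum(1 for _, color in marked if color == "green") / len(marked)


def grade(score: float) -> str:
    """Convert an accuracy between 0 and 1 into a letter grade."""
    for threshold, letter in GRADE_THRESHOLDS:
        if score >= threshold:
            return letter
    return "F"


if __name__ == "__main__":
    marked = mark_words("The quick brown fox", "the quick brown box")
    print(marked, accuracy(marked), grade(accuracy(marked)))
```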
Adjust settings
Click the ⚙️ Settings button to customize:
- Language: Switch between English and Spanish (affects STT and TTS)
- Voice speed: 0.5× to 2.0× playback speed
- Font size: Adjust chat text size
- Audio devices: Select input/output devices
- Recording mode: Auto-send or manual control
- Whisper model: Choose between base, small, medium, or large-v3
Keyboard shortcuts
| Key | Action |
|---|---|
| Enter | Send message |
| Shift+Enter | New line in text input |
| Esc | (In Live Mode) Mute/unmute |
Understanding the reasoning panel
When using thinking-capable models (like those with extended reasoning), you’ll see a ✨ Show Reasoning button below AI responses. Click to expand and see the model’s internal thought process:
- Appears automatically when the model uses <think> tags
- Shows reasoning in italic gray text
- Collapsed by default to keep the UI clean
The reasoning panel only appears if your model supports structured thinking. Models without this capability will work normally without showing the panel.
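Separating the reasoning from the visible answer comes down to pulling out whatever sits between <think> and </think>. The sketch below shows one way to do it with a regular expression. The function name is hypothetical, and it assumes the tags are well-formed and complete, which may not hold while a response is still streaming.

```python
import re
from typing import Tuple

# Non-greedy match so that several <think> blocks are captured separately.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(response: str) -> Tuple[str, str]:
    """Return (reasoning, answer) with the <think> blocks removed from the answer."""
    reasoning = "\n".join(block.strip() for block in _THINK_BLOCK.findall(response))
    answer = _THINK_BLOCK.sub("", response).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>The user wants a short definition.</think>Quicksort picks a pivot."
    reasoning, answer = split_reasoning(raw)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

When the response contains no <think> block, the reasoning string is empty, which matches the behavior of the panel not appearing.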
Tips for better conversations
- Start specific: Clear, specific prompts get better responses
  - Good: “Explain quicksort in Python with examples”
  - Poor: “Tell me about sorting”
- Watch context usage: Keep an eye on the context donut
  - Start a new chat when approaching 80-90%
  - Long documents consume significant context
- Use the right model: Smaller models are faster but less capable
  - llama3.1:8b - Good all-rounder
  - mistral:7b - Faster for simple tasks
  - Larger models for complex reasoning
- Optimize voice settings:
  - Use base Whisper for speed, medium or large-v3 for accuracy
  - Adjust voice speed if TTS is too fast/slow
  - Test different voices to find your preference
- Leverage markdown: The AI can format responses with:
  - Code blocks with syntax highlighting
  - Tables for structured data
  - Headers, lists, and emphasis
Next steps
Configuration
Dive deeper into settings and customization options
Architecture
Learn about ChatbotAI-Free’s internal architecture