Prerequisites

Before installing ChatbotAI-Free, ensure you have the following:
  • Python 3.10 or 3.11 (Python 3.12+ not yet tested)
  • Ollama installed and running — ollama.ai
  • Git for cloning the repository
  • (Optional) NVIDIA GPU with CUDA for faster inference
ChatbotAI-Free runs on CPU, but GPU acceleration significantly improves response times for both LLM inference and voice synthesis.
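To confirm your interpreter falls inside the tested range before going further, a quick check along these lines can help (the helper function is illustrative, not part of the app; the bounds mirror the list above):

```python
import sys

def python_supported(version=sys.version_info[:2]):
    """Return True when the interpreter is in the tested 3.10-3.11 range."""
    return (3, 10) <= version <= (3, 11)

if __name__ == "__main__":
    status = "OK" if python_supported() else "outside the tested 3.10-3.11 range"
    print("Python", sys.version.split()[0], "is", status)
```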

Installation steps

1. Clone the repository

Clone the ChatbotAI-Free repository from GitHub:
git clone https://github.com/maximofraisinet/ChatbotAI-Free
cd ChatbotAI-Free
2. Create a virtual environment

Create and activate a Python virtual environment:
python3 -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
Using a virtual environment keeps your dependencies isolated and prevents conflicts with other Python projects.
3. Install Python dependencies

Install the required packages from requirements.txt:
pip install -r requirements.txt
The key dependencies include:
  • PyQt6 (6.6.0+) - Modern GUI framework
  • faster-whisper (0.10.0+) - Real-time speech recognition
  • ollama (0.1.0+) - LLM inference client
  • kokoro-onnx (0.4.0+) - Neural TTS engine
  • PyMuPDF (1.23.0+) - PDF text extraction
  • tiktoken (0.5.0+) - Token counting
  • sounddevice & numpy - Audio I/O
The onnxruntime package in requirements.txt is CPU-only. For GPU acceleration, install the GPU version separately (see next step).
4. (Optional) Enable GPU acceleration

If you have an NVIDIA GPU with CUDA, install the GPU-accelerated ONNX runtime:
pip install onnxruntime-gpu
This will significantly speed up:
  • Kokoro TTS synthesis (the ~300 MB model runs much faster)
  • Whisper transcription (especially medium/large models)
The app automatically detects CUDA availability and uses GPU acceleration if available. If CUDA is not found, it falls back to CPU with no code changes needed.
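The detection code lives inside the app, but the general pattern with ONNX Runtime looks roughly like this. The provider names are real onnxruntime identifiers; the helper function itself is just a sketch:

```python
def pick_providers(available):
    """Prefer CUDA when onnxruntime-gpu exposes it; otherwise fall back to CPU."""
    if "CUDAExecutionProvider" in available:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

# With onnxruntime installed, you would pass the real list to a session:
#   import onnxruntime as ort
#   session = ort.InferenceSession(
#       "voices/kokoro-v1.0/kokoro-v1.0.onnx",
#       providers=pick_providers(ort.get_available_providers()))
```

Because CPUExecutionProvider always appears last, the session degrades gracefully when CUDA is missing.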
5. Download Kokoro voice models

Kokoro v1.0 powers all built-in English and Spanish voices (54 voices total). The model files are too large for GitHub, so download them manually:
  1. Go to the kokoro-onnx releases page
  2. Download kokoro-v1.0.onnx (~300 MB)
  3. Download voices-v1.0.bin (~27 MB)
  4. Place both files in voices/kokoro-v1.0/:
voices/
└── kokoro-v1.0/
    ├── kokoro-v1.0.onnx    ← Neural TTS model
    └── voices-v1.0.bin     ← 54 voice embeddings
On first launch, the voice scanner checks the voices/ folder. If the Kokoro files are in place, you’re ready to go immediately.
6. Pull an Ollama model

Download at least one LLM model to power conversations. For example:
ollama pull llama3.1:8b
Recommended models:
  • llama3.1:8b - Good balance of speed and quality (8B parameters)
  • mistral:7b - Fast and efficient (7B parameters)
  • gemma2:9b - Google’s Gemma model (9B parameters)
  • qwen2.5:7b - Excellent multilingual support
ChatbotAI-Free automatically detects all available Ollama models. You can switch between them in the UI dropdown.
The lightest available model is automatically used to generate chat titles, so keeping a small model installed (llama3.1:8b or smaller) ensures titles are generated quickly.
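The selection logic amounts to sorting the installed models by size. In terms of the name/size data that Ollama reports for each installed model, it might look like this (a sketch with made-up approximate sizes, not the app's actual code):

```python
def lightest_model(models):
    """Given (name, size-in-bytes) pairs, return the name of the smallest model."""
    if not models:
        return None
    return min(models, key=lambda m: m[1])[0]

# Example with rough, illustrative sizes for the recommended models above:
installed = [("llama3.1:8b", 4_700_000_000),
             ("mistral:7b", 4_100_000_000),
             ("gemma2:9b", 5_400_000_000)]
```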
7. Verify installation

Test that everything is working:
python main.py
The app should launch with:
  • ✓ Whisper model loaded (downloads on first run)
  • ✓ Ollama configured
  • ✓ TTS engine initialized
  • ✓ Voice scanner detecting Kokoro voices
On first run, Whisper will download the base model (~140 MB). This may take a few minutes depending on your internet connection.

Optional: Add more voices

Want voices in other languages beyond English and Spanish? You can add any Piper-compatible Sherpa-ONNX voice pack.
1. Install Sherpa-ONNX

pip install sherpa-onnx
2. Download a voice pack

Browse available voices at huggingface.co/csukuangfj. Download these files from the repo:
  • The .onnx model file
  • tokens.txt
  • The espeak-ng-data/ directory
3. Add to voices folder

Place the downloaded folder directly inside voices/:
voices/
├── kokoro-v1.0/
│   ├── kokoro-v1.0.onnx
│   └── voices-v1.0.bin
└── vits-piper-es_AR-daniela-high/  ← New Sherpa voice
    ├── es_AR-daniela-high.onnx
    ├── tokens.txt
    └── espeak-ng-data/
4. Restart and classify

On the next launch, the voice scanner detects the new folder and shows a dialog asking which language to assign. After confirmation, the voice appears in the voice selector dropdown.
The app identifies Sherpa packs by the presence of a .onnx file and an espeak-ng-data/ subdirectory.

Troubleshooting

Whisper model fails to load

If you see errors about missing CUDA or compute type:
  • Ensure you have PyTorch installed: pip install torch
  • The app automatically falls back to CPU if CUDA is unavailable

Ollama connection errors

If the app can’t connect to Ollama:
  • Verify Ollama is running: ollama list
  • Check that the Ollama service is active: ollama serve

No audio output

If TTS generates silence:
  • Verify Kokoro model files are in the correct location
  • Check audio device settings in the app’s Settings panel
  • On Linux, ensure PipeWire or PulseAudio is running

Next steps

Quick start

Learn how to start your first conversation
