Installation
- macOS
- Linux
- Windows
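The install commands that belong under these tabs were not preserved; the standard Ollama distribution channels are roughly:

```shell
# macOS: install via Homebrew, or download the app from ollama.com
brew install ollama

# Linux: official one-line install script
curl -fsSL https://ollama.com/install.sh | sh

# Windows: download and run the installer from ollama.com/download
```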
Starting Ollama
After installation, start the Ollama service:
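From a terminal, the service is typically started with:

```shell
ollama serve
```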
Ollama listens on http://localhost:11434 (the default port). The Ollama service must be running for Quest to work, so keep this terminal window open.
Pulling Required Models
Quest uses two models, depending on the mode:
Pull the general model
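Assuming Ollama is installed and on your PATH, the general model is pulled with:

```shell
ollama pull qwen2.5-coder:1.5b
```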
For general queries, Quest uses qwen2.5-coder:1.5b. This model is configured in rag_engine3.py:23.
Pull the reasoning model
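The reasoning model is pulled the same way; the smaller variant trades some quality for speed:

```shell
ollama pull deepseek-r1:7b

# Or, for faster performance on lower-spec machines:
ollama pull deepseek-r1:1.5b
```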
For reasoning mode, Quest uses deepseek-r1:7b, or the deepseek-r1:1.5b variant for faster performance on lower-spec machines. This model is configured in rag_engine3.py:24.
Configuring Ollama for Quest
Quest connects to Ollama via its REST API. The default configuration in rag_engine3.py is:
rag_engine3.py
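The original snippet was not preserved here; a minimal sketch of the defaults referenced above (the variable names are assumptions, not the actual contents of rag_engine3.py):

```python
# Hypothetical sketch -- names are assumptions, values taken from this guide.
OLLAMA_URL = "http://localhost:11434"   # default Ollama REST endpoint
GENERAL_MODEL = "qwen2.5-coder:1.5b"    # rag_engine3.py:23
REASONING_MODEL = "deepseek-r1:7b"      # rag_engine3.py:24
```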
Customizing Model Parameters
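As one example of what can be tuned, Ollama's /api/generate endpoint accepts an options object with standard keys such as temperature, num_ctx, and num_thread. The values below are illustrative, and how the RAG engine forwards them is an assumption:

```python
import json

# Generation parameters passed through to Ollama's /api/generate endpoint.
payload = {
    "model": "qwen2.5-coder:1.5b",
    "prompt": "Summarize the indexed documents.",
    "stream": False,
    "options": {
        "temperature": 0.2,  # lower = more deterministic output
        "num_ctx": 4096,     # context window, in tokens
        "num_thread": 8,     # CPU threads Ollama may use
    },
}
body = json.dumps(payload)  # ready to POST to http://localhost:11434/api/generate
```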
You can adjust these parameters when initializing the RAG engine:
Verifying Installation
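With the service running and both models pulled, the setup can be sanity-checked from a terminal:

```shell
curl http://localhost:11434   # should respond: Ollama is running
ollama list                   # both models should appear in the output
```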
Troubleshooting
Ollama Service Not Running
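If Quest cannot connect, first confirm the service is actually up (assuming the default host and port):

```shell
curl http://localhost:11434   # "Ollama is running" means the service is up
ollama serve                  # otherwise, start it in a separate terminal
```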
Port Already in Use
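To see what is holding port 11434, or to bind Ollama to a different port via the OLLAMA_HOST environment variable:

```shell
lsof -i :11434                            # identify the conflicting process
OLLAMA_HOST=127.0.0.1:11435 ollama serve  # or run Ollama on another port
```

If you change the port, Quest's configured Ollama URL must be updated to match.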
Model Not Found
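If Ollama reports a missing model, list what is installed and re-pull:

```shell
ollama list
ollama pull qwen2.5-coder:1.5b   # or deepseek-r1:7b, whichever is missing
```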
Out of Memory
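deepseek-r1:7b needs several gigabytes of free RAM; if it fails to load, switching to the smaller variant is the usual workaround:

```shell
ollama pull deepseek-r1:1.5b   # smaller reasoning model, lower memory footprint
```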
Slow Response Times
If responses are slow, try:
- Increasing the thread count
- Using GPU acceleration (if available)
- Reducing the context window
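All three adjustments map to Ollama's standard request options (num_thread, num_gpu, num_ctx); the values below are illustrative:

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:1.5b",
  "prompt": "hello",
  "stream": false,
  "options": {
    "num_thread": 8,
    "num_gpu": 1,
    "num_ctx": 2048
  }
}'
```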
Advanced Configuration
Using a Custom Ollama URL
If Ollama is running on a different host or port, update the connection URL in app.py:
app.py
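One way to make the URL configurable; reading an OLLAMA_URL environment variable is an assumption here, not something app.py is confirmed to do:

```python
import os

# Fall back to the default endpoint when no override is set.
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
```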
Running Ollama as a Service
- Linux (systemd)
- macOS (launchd)
Create a systemd service file, add the service definition, then enable and start it:
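A sketch of those three steps; the binary path /usr/local/bin/ollama and the ollama user are assumptions, and the Linux install script may already create an equivalent unit:

```shell
# 1. Create the unit file with the service definition
sudo tee /etc/systemd/system/ollama.service > /dev/null <<'EOF'
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
Restart=always
User=ollama

[Install]
WantedBy=multi-user.target
EOF

# 2. Reload systemd, then enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable --now ollama
```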
Next Steps
Using the Web Interface
Learn how to interact with Quest through the Flask web interface
Query Optimization
Write effective queries for better results