Quest provides a modern, interactive web interface built with Flask. This guide covers all features and interactions available through the UI.

Starting the Application

1. Ensure Ollama is running

Before starting Quest, make sure Ollama is running:
ollama serve
See the Setting Up Ollama guide for installation instructions.

2. Navigate to the project directory

cd ~/workspace/source

3. Start the Flask application

python app.py
You should see output like:
* Running on http://127.0.0.1:5000
* Restarting with stat
* Debugger is active!

4. Open in browser

Visit http://localhost:5000 in your browser.

Interface Overview

The Quest interface consists of three main areas: the header, the messages area, and the input area.

Header

The header contains control buttons:
  • Mode Toggle - Switch between Normal and Reasoning modes
  • Theme Toggle - Cycle through Light, Dark, and Reasoning themes
  • Clear History - Clear conversation history

Messages Area

Displays the conversation between you and Quest:
  • User messages appear on the right in blue
  • Assistant messages appear on the left with syntax highlighting
  • Code blocks include a copy button for easy copying

Input Area

The input box at the bottom where you type queries:
  • Auto-expands as you type
  • Press Enter to send
  • Press Shift+Enter for new line
  • Send button (arrow icon) to submit
  • Stop button appears during generation

Making Queries

1. Type your query

Enter your question in the input box. For example:
Explain the Two Sum problem
Or be more specific:
Show me a Python solution for the Longest Substring Without Repeating Characters problem

2. Submit the query

Press Enter or click the send button. The query is sent to the /search endpoint:
templates/index.html
const response = await fetch('/search', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json'
    },
    body: JSON.stringify({
        query: query.trim(),
        mode: currentMode
    })
});

3. View the response

Quest will:
  1. Show a loading spinner
  2. Retrieve relevant solutions from the vector database
  3. Generate a response using the selected LLM
  4. Display the formatted response with syntax highlighting

Switching Modes

Quest supports two modes, each using a different LLM:

General Mode (Default)

  • Model: qwen2.5-coder:1.5b
  • Best for: Quick, straightforward answers
  • Response style: Direct and concise
  • Theme: Light or Dark
Click the mode button in the header (shows “Normal”) to toggle.
app.py
@app.route('/set_mode', methods=['POST'])
def set_mode():
    data = request.get_json()
    mode = data.get('mode')
    
    if mode not in ["general", "reasoning"]:
        return jsonify({"error": "Mode must be 'general' or 'reasoning'."}), 400
    
    rag_engine.set_mode(mode)
    return jsonify({"message": f"Mode set to: {mode}"}), 200
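
The handler above assumes the request body is a JSON object. As a minimal sketch, the same check can be written as a plain function with an extra guard for a missing or non-object body. `validate_mode_payload` and `VALID_MODES` are hypothetical names and the first error message is invented for illustration; the mode check, the second error message, and the status codes are taken from the handler.

```python
from typing import Any, Tuple

# Modes accepted by the /set_mode handler shown above.
VALID_MODES = ("general", "reasoning")


def validate_mode_payload(data: Any) -> Tuple[dict, int]:
    """Return (response_body, status_code) for a /set_mode payload.

    Hypothetical helper: mirrors the handler's check, plus a guard for a
    body that is missing or is not a JSON object.
    """
    if not isinstance(data, dict):
        # Assumed wording; the original handler has no such branch.
        return {"error": "Request body must be a JSON object."}, 400

    mode = data.get("mode")
    if mode not in VALID_MODES:
        return {"error": "Mode must be 'general' or 'reasoning'."}, 400

    return {"message": f"Mode set to: {mode}"}, 200
```

A client can rely on this contract: any value other than "general" or "reasoning" yields a 400 response with an "error" field, and a valid value yields a 200 response with a "message" field.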

Viewing Conversation History

Quest maintains conversation history automatically.

How History Works

The system stores the last 3 conversation turns (configurable):
app.py
retriever = LeetCodeRetriever()
rag_engine = RAGEngine(retriever, max_history=3)
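
How RAGEngine stores turns internally is not shown here. One common way to keep only the last N turns is a fixed-length deque, sketched below. `BoundedHistory` is an illustrative class, not Quest's implementation; the `query` and `response` field names follow the entries read by the frontend.

```python
from collections import deque
from typing import Deque, Dict


class BoundedHistory:
    """Keeps only the most recent `max_history` turns (illustrative only)."""

    def __init__(self, max_history: int = 3) -> None:
        # A deque with maxlen drops the oldest item once it is full.
        self.turns: Deque[Dict[str, str]] = deque(maxlen=max_history)

    def add_turn(self, query: str, response: str) -> None:
        self.turns.append({"query": query, "response": response})

    def clear(self) -> None:
        self.turns.clear()


history = BoundedHistory(max_history=3)
for i in range(1, 6):
    history.add_turn(f"question {i}", f"answer {i}")

# After five turns, only questions 3, 4 and 5 remain.
print([turn["query"] for turn in history.turns])
```

With this approach, raising `max_history` gives the model more context per query at the cost of a longer prompt.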

Retrieving History

History is loaded when you open the page:
templates/index.html
async function loadHistory() {
    const history = await getHistory();
    if (history) {
        const messages = document.getElementById('messages');
        messages.innerHTML = '';
        history.forEach(entry => {
            addMessage(entry.query, true);
            addMessage(entry.response, false);
        });
    }
}

Clearing History

Click the trash icon in the header to clear all conversation history:
app.py
@app.route('/clear_history', methods=['POST'])
def clear_history():
    """Clear the conversation history."""
    rag_engine.conversation_history.clear()
    return jsonify({"message": "Conversation history cleared"}), 200
Clearing history removes context from future queries. You may need to re-explain your problem.

Stopping Generation

If a response is taking too long or going in the wrong direction:
1. Click the stop button

The red stop button appears next to the input during generation.

2. Generation halts immediately

The frontend aborts the request:
templates/index.html
async function stopGeneration() {
    if (abortController) {
        // Cancel the in-flight /search request on the client side.
        abortController.abort();

        // Ask the backend to stop streaming from the model.
        await fetch('/stop', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            }
        });
    }
}

3. Backend stops streaming

The RAG engine stops the generation process:
app.py
@app.route('/stop', methods=['POST'])
def stop():
    """Stop the ongoing generation process."""
    rag_engine.stop()
    return jsonify({"message": "Streaming stopped"}), 200
rag_engine3.py
def call_ollama(self, prompt: str) -> str:
    # Inside streaming loop
    for line in response.iter_lines():
        if line:
            if self.stop_generation:
                logger.info("Generation stopped by user.")
                return full_response.strip()
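
The excerpt above checks a stop flag on every streamed line and returns the text gathered so far. A self-contained sketch of that cooperative-stop pattern follows. `StoppableGenerator` and `fake_stream` are hypothetical stand-ins for the engine and the Ollama stream, and using `threading.Event` instead of a plain boolean attribute is an assumption made here for thread safety.

```python
import threading
from typing import Iterable, Iterator


class StoppableGenerator:
    """Illustrative stand-in for the engine's stop logic (not Quest's code)."""

    def __init__(self) -> None:
        self._stop_event = threading.Event()

    def stop(self) -> None:
        # Called from another thread, e.g. the /stop request handler.
        self._stop_event.set()

    def collect(self, chunks: Iterable[str]) -> str:
        """Concatenate chunks until the stream ends or stop() is called."""
        self._stop_event.clear()  # reset a stop left over from a previous run
        full_response = ""
        for chunk in chunks:
            if self._stop_event.is_set():
                break  # keep whatever was generated so far
            full_response += chunk
        return full_response.strip()


engine = StoppableGenerator()


def fake_stream() -> Iterator[str]:
    yield "The Two Sum "
    yield "problem asks "
    engine.stop()  # simulate a stop request arriving mid-stream
    yield "for two indices."


partial = engine.collect(fake_stream())
print(partial)  # prints "The Two Sum problem asks"
```

The stop is cooperative: it takes effect the next time the loop checks the flag, so the partial response is returned rather than discarded.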

Code Block Features

All code blocks in responses include helpful features:

Syntax Highlighting

Code is automatically highlighted using highlight.js:
templates/index.html
messageDiv.querySelectorAll('pre code').forEach(block => {
    hljs.highlightElement(block);
});

Copy to Clipboard

Each code block has a copy button in the top-right corner:
templates/index.html
function copyToClipboard(codeBlock) {
    const text = codeBlock.textContent;
    if (navigator.clipboard) {
        navigator.clipboard.writeText(text)
            .then(() => {
                console.log('Code copied to clipboard!');
            })
            .catch(err => {
                console.error('Failed to copy text: ', err);
            });
    }
}

Theme Customization

Quest supports three themes that persist across sessions:
templates/index.html
:root[data-theme="light"] {
    --bg-primary: #f5f7fb;
    --bg-secondary: #ffffff;
    --text-primary: #1e293b;
    --text-secondary: #64748b;
    --border-color: #e2e8f0;
    --accent-color: #2563eb;
}
Theme and mode preferences are stored in localStorage:
templates/index.html
let currentTheme = localStorage.getItem('theme') || 'light';
let currentMode = localStorage.getItem('mode') || 'general';

Keyboard Shortcuts

| Key Combination | Action |
|---|---|
| Enter | Send query |
| Shift+Enter | New line in input |
| Esc | (Future) Stop generation |

API Endpoints

The web interface uses these Flask endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| / | GET | Render main page |
| /search | POST | Submit query and get response |
| /stop | POST | Stop ongoing generation |
| /clear_history | POST | Clear conversation history |
| /get_history | GET | Retrieve conversation history |
| /set_mode | POST | Switch between general/reasoning |

Example API Call

curl -X POST http://localhost:5000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Explain the Two Sum problem",
    "mode": "general"
  }'
Response:
{
  "response": "Generated Solution:\n\nThe Two Sum problem asks..."
}
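
The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl command above. `build_search_request` and `search` are illustrative names, and the 120-second timeout is an assumption chosen to leave room for slow local generation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default Flask address shown earlier


def build_search_request(query: str, mode: str = "general") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = json.dumps({"query": query.strip(), "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, mode: str = "general", timeout: float = 120.0) -> str:
    """Send the query and return the "response" field of the JSON reply."""
    request = build_search_request(query, mode)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    return body["response"]


# Usage (requires the Flask app and Ollama to be running):
#   print(search("Explain the Two Sum problem"))
```

Separating request construction from sending keeps the payload easy to inspect without a running server.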

Troubleshooting

Page Won’t Load

Issue: Browser shows “Unable to connect”
Solution: Ensure Flask is running:
python app.py

No Response to Queries

Issue: Loading spinner never completes
Solutions:
  1. Check Ollama is running: curl http://localhost:11434/api/version
  2. Check browser console for errors (F12)
  3. Verify the model is downloaded: ollama list
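
Checks 1 and 3 above can be combined into one script. The sketch below queries Ollama's /api/version and /api/tags endpoints on the default port. The helper names (`fetch_json`, `installed_models`, `diagnose`) and the 5-second timeout are illustrative assumptions.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address, as used above


def fetch_json(path: str, timeout: float = 5.0) -> Optional[dict]:
    """GET a JSON document from Ollama; return None if the request fails."""
    try:
        with urllib.request.urlopen(f"{OLLAMA_URL}{path}", timeout=timeout) as reply:
            return json.loads(reply.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


def installed_models(tags_reply: dict) -> List[str]:
    """Extract model names from the /api/tags reply."""
    return [model["name"] for model in tags_reply.get("models", [])]


def diagnose(required_model: str) -> str:
    """Return a one-line diagnosis for the 'spinner never completes' case."""
    if fetch_json("/api/version") is None:
        return "Ollama is not reachable; start it with `ollama serve`."
    tags = fetch_json("/api/tags") or {}
    if required_model not in installed_models(tags):
        return f"Model {required_model} is not installed; run `ollama pull {required_model}`."
    return "Ollama is running and the model is installed."


# Usage: print(diagnose("qwen2.5-coder:1.5b"))
```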

History Not Persisting

History is stored in memory, not a database. Restarting the Flask app clears all history.
To persist history, implement a database backend in memory_buffer.py.
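
One possible shape for such a backend is sketched below with Python's built-in sqlite3 module. `SQLiteHistory`, the `turns` table, and the `history.db` filename are hypothetical and not part of Quest; the contents of memory_buffer.py are not shown in this guide, so the interface here is an assumption.

```python
import sqlite3
from typing import Dict, List


class SQLiteHistory:
    """Hypothetical history store that survives restarts (not part of Quest)."""

    def __init__(self, path: str = "history.db", max_history: int = 3) -> None:
        self.max_history = max_history
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS turns ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "query TEXT NOT NULL, "
            "response TEXT NOT NULL)"
        )
        self.conn.commit()

    def add_turn(self, query: str, response: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO turns (query, response) VALUES (?, ?)",
                (query, response),
            )

    def recent(self) -> List[Dict[str, str]]:
        """Return the last `max_history` turns, oldest first."""
        rows = self.conn.execute(
            "SELECT query, response FROM turns ORDER BY id DESC LIMIT ?",
            (self.max_history,),
        ).fetchall()
        return [{"query": q, "response": r} for q, r in reversed(rows)]

    def clear(self) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM turns")
```

By default a sqlite3 connection may only be used from the thread that created it, so a Flask app serving requests from several threads would need one connection per request, or `check_same_thread=False` together with a lock.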

Next Steps

  • Query Optimization - Learn how to write effective queries
  • Metadata Filtering - Filter solutions by difficulty, topics, and companies
