This guide covers everything you need to install and configure the RAG Support System for development or production use.

System requirements

Before installing, verify your environment meets these requirements:
  • Python 3.12+ recommended (3.11 is the minimum supported, per pyproject.toml)
  • uv package manager (recommended) or pip
  • Git for cloning the repository
  • Docker & Docker Compose (optional, for containerized deployment)

Verify Python version

python --version
# Python 3.12.0 or higher
Python 3.10 and below are not supported. Upgrade to 3.12+ before proceeding.

Install uv package manager

We recommend using uv for dependency management. It’s significantly faster than pip and handles lock files automatically.
curl -LsSf https://astral.sh/uv/install.sh | sh
Verify installation:
uv --version
# uv 0.1.0 or higher

Clone the repository

git clone https://github.com/JoAmps/rgt-assignment.git
cd rgt-assignment

Set up virtual environment

Create and activate a Python virtual environment:
python -m venv .venv
source .venv/bin/activate
Your prompt should now show a (.venv) prefix, indicating the virtual environment is active.

Install dependencies

With your virtual environment activated, install all required packages:
uv sync
This reads pyproject.toml and installs:
  • FastAPI (0.128.0+) — Web framework
  • LangChain (1.2.7+) — LLM orchestration
  • LangChain integrations — OpenAI, Chroma, Unstructured
  • ChromaDB (1.4.1+) — Vector database
  • OpenAI (2.16.0+) — Embeddings and LLM client
  • scikit-learn (1.8.0+) — ML models
  • pandas (3.0.0+) — Data processing
  • pytest (9.0.2+) — Testing framework
  • Development tools — black, flake8, mypy, isort
uv sync creates a lock file for reproducible installs. Use uv sync --locked in CI/CD to ensure exact versions.

Dependencies reference

Here’s the complete dependency list from pyproject.toml:
pyproject.toml
[project]
name = "rgt"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "autoflake>=2.3.1",
    "autopep8>=2.3.2",
    "black>=26.1.0",
    "chromadb>=1.4.1",
    "fastapi>=0.128.0",
    "flake8>=7.3.0",
    "isort>=7.0.0",
    "joblib>=1.5.3",
    "langchain>=1.2.7",
    "langchain-chroma>=1.1.0",
    "langchain-openai>=1.1.7",
    "langchain-text-splitters>=1.1.0",
    "langchain-unstructured>=1.0.1",
    "mypy>=1.19.1",
    "openai>=2.16.0",
    "pandas>=3.0.0",
    "pytest>=9.0.2",
    "python-dotenv>=1.2.1",
    "scikit-learn>=1.8.0",
    "seaborn>=0.13.2",
    "unstructured>=0.18.31",
]

Configure environment variables

The RAG Support System requires API keys for OpenAI and Unstructured services.

Create .env file

Create a .env file in the project root:
touch .env
Add your API keys:
.env
OPENAI_API_KEY=sk-proj-...
UNSTRUCTURED_API_KEY=your_unstructured_api_key
Security best practices:
  • Never commit .env to version control
  • Add .env to .gitignore
  • Use different keys for development and production
  • Rotate keys regularly
  • Use secret management services (AWS Secrets Manager, Vault) in production

Get API keys

1. OpenAI API key

  1. Sign up at platform.openai.com
  2. Navigate to API keys section
  3. Create a new secret key
  4. Copy and paste into .env as OPENAI_API_KEY
The system uses:
  • text-embedding-3-small for embeddings (~$0.02 per 1M tokens)
  • gpt-4.1 for generation (configured in src/rag/retriever.py:35)
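At that price, embedding cost is easy to ballpark. A quick sketch (assuming the common ~4 characters per token heuristic; actual token counts vary by content):

```python
# Rough embedding-cost estimate for text-embedding-3-small.
# Assumes ~4 characters per token (a heuristic, not exact).
PRICE_PER_MILLION_TOKENS = 0.02  # USD, per the pricing above

def estimate_embedding_cost(total_chars: int) -> float:
    """Return the approximate USD cost of embedding total_chars of text."""
    tokens = total_chars / 4
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# Example: a 2 MB knowledge base (~2 million characters)
cost = estimate_embedding_cost(2_000_000)
print(f"~${cost:.4f}")  # ~500k tokens, about $0.01
```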
2. Unstructured API key

  1. Sign up at unstructured.io
  2. Get your API key from the dashboard
  3. Add to .env as UNSTRUCTURED_API_KEY
Unstructured parses markdown files into structured elements during document ingestion. Used in src.rag.ingest module.
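Conceptually, partitioning turns a markdown file into a list of typed elements. The toy stand-in below (standard library only; `partition_markdown` is illustrative, not the real Unstructured API, which returns far richer elements and metadata) shows the idea:

```python
# Toy stand-in for markdown partitioning: split a document into
# (element_type, text) pairs, roughly like Unstructured's Title,
# NarrativeText, and ListItem element types.
def partition_markdown(text: str) -> list[tuple[str, str]]:
    elements = []
    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            elements.append(("Title", block.lstrip("# ")))
        elif block.startswith(("- ", "* ")):
            elements.append(("ListItem", block))
        else:
            elements.append(("NarrativeText", block))
    return elements

doc = "# Refund policy\n\nRefunds are issued within 14 days.\n\n- Contact support first"
for kind, text in partition_markdown(doc):
    print(kind, "->", text)
```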

Environment variable loading

The system loads environment variables using python-dotenv. Here’s how it works (from src/config.py):
config.py
from dotenv import load_dotenv
import os

load_dotenv()  # Reads .env from project root

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
UNSTRUCTURED_API_KEY = os.getenv("UNSTRUCTURED_API_KEY")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY not found in environment")

Verify installation

Run the test suite to verify everything is installed correctly:
pytest -q
Expected output:
........................                                                  [100%]
24 passed in 2.34s
Tests mock external services (OpenAI, Chroma, Unstructured) so they run offline without API keys.
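The pattern looks roughly like this; `answer_question` and the fake client are illustrative names, not the project's actual API:

```python
# Sketch of how a test can stub the LLM call so the suite runs offline.
from unittest.mock import MagicMock

def answer_question(llm, question: str) -> str:
    """Toy RAG-style function: delegates generation to an injected client."""
    return llm.generate(question)

def test_answer_question_offline():
    fake_llm = MagicMock()
    fake_llm.generate.return_value = "Reset your password via Settings."
    result = answer_question(fake_llm, "How do I reset my password?")
    assert result == "Reset your password via Settings."
    fake_llm.generate.assert_called_once_with("How do I reset my password?")

test_answer_question_offline()
print("offline test passed")
```

Because the client is injected rather than constructed inside the function, no network call or API key is needed.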

Run specific tests

# Verbose output with print statements
pytest -s -v tests/

# Single test file
pytest tests/test_retriever.py -q

# Single test function
pytest tests/test_retriever.py::test_search_and_answer -v

Optional: Train ML models

The system includes triage ML models for category and priority prediction. Train them before making RAG queries:
uv run -m src.ml.train
This:
  • Reads training data from tickets_train.csv
  • Trains TF-IDF + Logistic Regression classifiers
  • Saves models to artifacts/category_model.pkl and artifacts/priority_model.pkl
  • Generates classification reports in reports/
Models are loaded automatically by the API at startup (see src/api/routes/triage_route.py:12-20).
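The training recipe can be sketched in a few lines of scikit-learn; the ticket texts and labels below are made up (the real script reads tickets_train.csv and saves the fitted models with joblib):

```python
# Minimal sketch of the triage recipe: TF-IDF features feeding a
# Logistic Regression classifier, wrapped in one Pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

tickets = [
    "I was charged twice for my subscription",
    "Refund has not arrived after two weeks",
    "Invoice shows the wrong amount",
    "The app crashes when I open settings",
    "Login page returns a 500 error",
    "Search results never load on mobile",
]
categories = ["billing", "billing", "billing", "technical", "technical", "technical"]

model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(tickets, categories)

print(model.predict(["I need a refund for a duplicate charge"])[0])
```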

Optional: Docker setup

For containerized deployment, use Docker Compose:
# Build and start containers
docker-compose up --build

# Run in detached mode
docker-compose up -d

# Stop containers
docker-compose down
The Docker setup includes:
  • FastAPI application container
  • Chroma vector database (if configured in docker-compose.yml)
  • Volume mounts for persistent data
Docker configuration is in Dockerfile and docker-compose.yml at project root.

Directory structure

After installation, your project should look like this:
rgt-assignment/
├── .venv/                    # Virtual environment (gitignored)
├── .env                      # API keys (gitignored)
├── src/
│   ├── api/                  # FastAPI routes, models, services
│   │   ├── routes/           # Endpoint definitions
│   │   ├── services/         # Business logic
│   │   └── models.py         # Pydantic request/response models
│   ├── rag/                  # RAG agent, ingestion, evaluation
│   │   ├── retriever.py      # Core RAG agent
│   │   ├── ingest.py         # Document ingestion
│   │   ├── prompts.py        # Prompt templates
│   │   └── evals.py          # Evaluation framework
│   ├── ml/                   # Triage ML models
│   │   ├── train.py          # Training script
│   │   └── predict.py        # Inference helper
│   └── config.py             # Environment configuration
├── tests/                    # Unit tests
├── artifacts/                # Trained models (created after training)
├── reports/                  # Evaluation outputs (created after evals)
├── chroma_db/                # Vector store data (created after ingestion)
├── kb_docs/                  # Knowledge base markdown files
├── main.py                   # FastAPI application entrypoint
├── pyproject.toml            # Dependencies and project metadata
├── README.md                 # Project overview
└── ARCHITECTURE.md           # System design documentation

Next steps

  • Follow the quickstart: ingest documents and make your first RAG query
  • Understand the architecture: learn how system components work together
  • Explore API endpoints: read endpoint specifications and examples
  • Run evaluations: test answer quality with offline metrics via python -m src.rag.evals

Troubleshooting

If you encounter build errors, upgrade pip and setuptools:
pip install --upgrade pip setuptools wheel
Then retry uv sync.
Check your Python version matches requirements:
python --version
If using pyenv or conda, ensure you’ve selected Python 3.12+:
pyenv local 3.12.0
# or
conda activate py312
Ensure your virtual environment is activated:
source .venv/bin/activate  # macOS/Linux
.venv\Scripts\activate     # Windows
Verify packages are installed:
pip list | grep langchain
If you hit rate limits during testing:
  • Upgrade to a higher OpenAI usage tier (higher tiers allow higher rate limits)
  • Implement exponential backoff (already handled by LangChain)
  • Add time.sleep() between batch operations
  • Consider caching embeddings
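A hand-rolled exponential backoff looks like this (the retry count, base delay, and `flaky_call` simulation below are arbitrary illustrations):

```python
# Exponential backoff with jitter for rate-limited calls.
# flaky_call simulates an API that fails twice before succeeding.
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn, sleeping base_delay * 2**attempt (+ jitter) between tries."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

calls = {"n": 0}
def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

print(with_backoff(flaky_call, base_delay=0.01))  # prints "ok" on the third try
```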
If Chroma throws “database is locked” errors:
# Stop all Python processes
pkill -9 python

# Remove lock files
rm chroma_db/*.lock

# Restart API server
uv run main.py
