
Overview

SKU Semantic Search uses PostgreSQL with the pgvector extension to store and search product embeddings. This setup enables efficient vector similarity search using SQL queries.

Why PostgreSQL + pgvector?

Unified storage

Store both relational data (products, users) and vectors in one database. No separate vector database needed.

ACID compliance

Full transactional support ensures data consistency across product updates and searches.

SQL interface

Use familiar SQL syntax with SQLAlchemy ORM. No need to learn new query languages.

Cost-effective

No additional infrastructure or licensing costs. Works with standard PostgreSQL.

The easiest way to get PostgreSQL with pgvector is to use the official Docker image.

Docker Compose configuration

The project includes a complete Docker Compose setup (docker-compose.yml:18):
services:
  db:
    image: pgvector/pgvector:pg16
    container_name: retail_rag_db
    ports:
      - "5435:5432"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=db_pass
      - POSTGRES_DB=retail_rag
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
Key configuration:
  • Image: pgvector/pgvector:pg16 - PostgreSQL 16 with pgvector pre-installed
  • Port mapping: 5435:5432 - External port 5435 avoids conflicts with local PostgreSQL
  • Persistent storage: postgres_data volume preserves data across container restarts
  • Health check: Ensures database is ready before starting dependent services
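
With this Compose file, the application's connection string points at the mapped port. A sketch of the corresponding `.env` entry (the variable name DATABASE_URL is referenced later in this page; the exact driver prefix depends on your installation):

```
# .env - when the app runs on the host machine
DATABASE_URL=postgresql://postgres:db_pass@localhost:5435/retail_rag

# When the app runs inside the same Compose network, use the service
# name and the internal port instead:
# DATABASE_URL=postgresql://postgres:db_pass@db:5432/retail_rag
```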

Start the database

1. Start PostgreSQL

docker-compose up -d db
This starts only the database service in detached mode.
2. Verify the container is running

docker ps
Expected output:
CONTAINER ID   IMAGE                      STATUS                    PORTS
abc123         pgvector/pgvector:pg16     Up 30s (healthy)          0.0.0.0:5435->5432/tcp
Wait for the (healthy) status before proceeding.
3. Test the connection

docker exec -it retail_rag_db psql -U postgres -d retail_rag
You should see the PostgreSQL prompt:
psql (16.0)
Type "help" for help.

retail_rag=#
4. Verify the pgvector extension

Inside the psql prompt:
SELECT * FROM pg_extension WHERE extname = 'vector';
If the extension isn’t installed yet, it will be created automatically when the application starts.

Manual PostgreSQL setup

If you prefer to use an existing PostgreSQL installation:
1. Install PostgreSQL

sudo apt update
sudo apt install postgresql postgresql-contrib
2. Install build dependencies

sudo apt install build-essential git postgresql-server-dev-16
3. Clone and build pgvector

cd /tmp
git clone --branch v0.7.0 https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
4. Create the database

sudo -u postgres psql
CREATE DATABASE retail_rag;
CREATE USER sku_user WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE retail_rag TO sku_user;
\q
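
The application enables the extension at startup, but only if the connecting role has sufficient privileges. On a manual install you can enable it up front as a superuser; a sketch, assuming the retail_rag database and sku_user role created above:

```sql
-- Run as a superuser, connected to the new database
\c retail_rag
CREATE EXTENSION IF NOT EXISTS vector;

-- On PostgreSQL 15+, the public schema is no longer writable by default,
-- so also allow sku_user to create tables in it
GRANT ALL ON SCHEMA public TO sku_user;
```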

Database initialization

The application automatically creates tables and enables the pgvector extension on startup (app/main.py:8):
def create_tables():
    with engine.connect() as conn:
        conn.execute(text("CREATE EXTENSION IF NOT EXISTS vector"))
        conn.commit()
    Base.metadata.create_all(bind=engine)
What happens:
  1. Connects to PostgreSQL using DATABASE_URL from .env
  2. Creates the vector extension if not already present
  3. Creates all tables defined by SQLAlchemy models
The IF NOT EXISTS clause makes this operation idempotent - safe to run multiple times.

Product model and schema

Products are stored with their embeddings in the products table (app/models/product.py:5):
from sqlalchemy import Column, Integer, String, Text
from pgvector.sqlalchemy import Vector
from app.db.session import Base

class Product(Base):
    __tablename__ = "products"

    id = Column(Integer, primary_key=True, index=True)
    name = Column(String, index=True, nullable=False)
    description = Column(Text, nullable=True)
    category = Column(String, index=True)
    
    # Vector of 3072 dimensions for Gemini embeddings
    embedding = Column(Vector(3072))
Column details:
  • id: Auto-incrementing primary key
  • name: Product name, indexed for fast lookups
  • description: Full product description
  • category: Product category, indexed for filtering
  • embedding: 3072-dimensional vector from Gemini’s embedding model

Generated SQL

SQLAlchemy creates this SQL:
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR NOT NULL,
    description TEXT,
    category VARCHAR,
    embedding vector(3072)
);

CREATE INDEX ix_products_name ON products(name);
CREATE INDEX ix_products_category ON products(category);

Vector operations

Creating embeddings

When a product is added, its embedding is generated and stored (app/services/product_service.py:8):
@staticmethod
def create_product(db: Session, product_in: ProductCreate):
    # 1. Generate embedding from description
    text_to_embed = f"{product_in.name}: {product_in.description}"
    embedding = LLMService.get_embedding(text_to_embed)

    # 2. Save to database
    db_product = Product(
        name=product_in.name,
        description=product_in.description,
        category=product_in.category,
        embedding=embedding
    )
    db.add(db_product)
    db.commit()
    db.refresh(db_product)
    return db_product
Searching products

Search uses cosine distance to find similar products (app/services/product_service.py:26):
@staticmethod
def search_products(db: Session, query: str, limit: int = 5):
    # 1. Get query embedding
    query_embedding = LLMService.get_embedding(query)
    
    # 2. Search using pgvector
    products = db.query(Product).order_by(
        Product.embedding.cosine_distance(query_embedding)
    ).limit(limit).all()
    
    return products
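Under the hood, the cosine_distance call compiles to pgvector's <=> operator. The raw-SQL equivalent looks roughly like this (vector literal abbreviated for readability):

```sql
SELECT id, name, category
FROM products
ORDER BY embedding <=> '[0.12, -0.03, ...]'
LIMIT 5;
```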
Distance functions:
  • cosine_distance: Measures angular similarity (recommended for text)
  • l2_distance: Euclidean distance
  • inner_product: Dot product (for normalized vectors)
Cosine distance is scale-invariant, making it ideal for embeddings where magnitude varies but direction matters.
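To see why scale-invariance matters, here is a pure-Python sketch of the two distance functions (mirroring what pgvector computes, not using pgvector itself). Doubling a vector's magnitude changes its L2 distance but leaves its cosine distance unchanged:

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; 0 means identical direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def l2_distance(a, b):
    # Euclidean distance between the two points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction, double the magnitude

print(cosine_distance(a, b))  # ~0.0 (same direction)
print(l2_distance(a, b))      # ~3.742 (magnitude difference shows up)
```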

Performance optimization

For large product catalogs, add vector indexes:

IVFFlat index

Fast approximate search (recommended for 10,000+ products). Note that pgvector's ivfflat and hnsw indexes support at most 2,000 dimensions for the vector type, so the 3072-dimensional embeddings here must be indexed through a halfvec expression (supported since pgvector 0.7.0):
-- Create index with 100 lists (halfvec cast required above 2,000 dimensions)
CREATE INDEX ON products USING ivfflat ((embedding::halfvec(3072)) halfvec_cosine_ops)
WITH (lists = 100);

-- Analyze table for better query planning
ANALYZE products;
Queries only use an expression index when they order by the same expression, e.g. embedding::halfvec(3072) <=> query_vector.
Configuration:
  • lists: Number of clusters (rule of thumb: rows / 1000)
  • Trade accuracy for speed
  • Requires ANALYZE after creation
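
At query time, IVFFlat accuracy is controlled by the ivfflat.probes runtime parameter (default 1):

```sql
-- Search more clusters per query: better recall, slower queries
SET ivfflat.probes = 10;
```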

HNSW index

Higher recall, more storage (recommended for < 1M products). The same 2,000-dimension limit applies, so the halfvec cast is needed here too:
CREATE INDEX ON products USING hnsw ((embedding::halfvec(3072)) halfvec_cosine_ops)
WITH (m = 16, ef_construction = 64);
Configuration:
  • m: Max connections per layer (higher = better recall, more memory)
  • ef_construction: Build-time accuracy (higher = better index, slower builds)
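
HNSW has an analogous query-time knob, hnsw.ef_search (default 40):

```sql
-- Larger candidate list at query time: better recall, slower queries
SET hnsw.ef_search = 100;
```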
Choosing between the two:

IVFFlat:
  • Faster builds (good for frequently changing catalogs)
  • Lower memory usage
  • Slightly lower recall
  • Best for 10K - 10M vectors

HNSW:
  • Higher recall (more accurate results)
  • Faster queries
  • More memory intensive
  • Best for < 1M vectors with stable catalogs

When to rebuild:

IVFFlat - rebuild when:
  • Product count changes by >20%
  • Search quality degrades
  • Run: REINDEX INDEX index_name;

HNSW - more resilient to updates; rebuild only when:
  • Adding products in bulk (>50% growth)
  • Making major catalog changes

Database maintenance

Backup and restore

# Backup entire database
docker exec retail_rag_db pg_dump -U postgres retail_rag > backup.sql

# Backup with compression
docker exec retail_rag_db pg_dump -U postgres retail_rag | gzip > backup.sql.gz

# Restore from a plain-SQL backup
docker exec -i retail_rag_db psql -U postgres retail_rag < backup.sql

Monitoring

-- Check table size
SELECT 
    pg_size_pretty(pg_total_relation_size('products')) AS total_size,
    pg_size_pretty(pg_relation_size('products')) AS table_size,
    pg_size_pretty(pg_indexes_size('products')) AS indexes_size;

-- Count products by category
SELECT category, COUNT(*) FROM products GROUP BY category;

-- Check for missing embeddings
SELECT COUNT(*) FROM products WHERE embedding IS NULL;
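
If searches stay slow even with an index in place, it is worth confirming the planner actually uses it; an index scan should appear in the plan when the ORDER BY expression matches the index definition (vector literal abbreviated):

```sql
EXPLAIN ANALYZE
SELECT id FROM products
ORDER BY embedding <=> '[0.12, -0.03, ...]'
LIMIT 5;
```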

Vacuum and analyze

-- Reclaim space and update statistics
VACUUM ANALYZE products;

-- Full vacuum (locks table)
VACUUM FULL products;
VACUUM FULL locks the table and can take significant time on large datasets. Run during maintenance windows.

Connection pooling

SQLAlchemy handles connection pooling automatically (app/db/session.py:6):
engine = create_engine(settings.DATABASE_URL)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()
For production, configure pool settings:
engine = create_engine(
    settings.DATABASE_URL,
    pool_size=10,           # Number of persistent connections
    max_overflow=20,        # Additional connections when pool is full
    pool_timeout=30,        # Seconds to wait for connection
    pool_recycle=3600,      # Recycle connections after 1 hour
)

Troubleshooting

Error: extension "vector" does not exist

Solution:
  1. Verify you’re using the pgvector Docker image: pgvector/pgvector:pg16
  2. For manual installations, check pgvector is installed: SELECT * FROM pg_available_extensions WHERE name = 'vector';
  3. Ensure database user has CREATE privilege: GRANT CREATE ON DATABASE retail_rag TO postgres;
Error: psycopg2.OperationalError: could not connect to server

Solution:
  1. Check PostgreSQL is running: docker ps
  2. Verify port in DATABASE_URL matches Docker config (5435 for external)
  3. When the API runs inside Docker, use the service name db as the host, not localhost
  4. Check firewall rules allow connections to port 5435
Symptoms: Queries take >1 second with 10,000+ products

Solution:
  1. Create a vector index (IVFFlat or HNSW)
  2. Run ANALYZE products; to update statistics
  3. Increase work_mem for better sort performance:
    ALTER DATABASE retail_rag SET work_mem = '256MB';
    
  4. Consider sharding if you have millions of products
Error: dimension of vector must match

Cause: Embedding dimension doesn’t match the model definition

Solution:
  1. Ensure all embeddings use the same model (Gemini: 3072 dimensions)
  2. If changing models, recreate embeddings:
    UPDATE products SET embedding = NULL;
    
  3. Re-run seed script to regenerate embeddings

Next steps

Environment setup

Configure DATABASE_URL and other environment variables

Docker deployment

Deploy the complete stack with Docker Compose

Architecture overview

Understand how the database fits into the system

RAG pattern

Learn how vector search powers RAG
