SKU Semantic Search is a proof-of-concept FastAPI service that demonstrates how to build an intelligent product search system using Retrieval-Augmented Generation (RAG), vector embeddings, and multi-LLM failover. Unlike traditional keyword-based search, which requires exact matches, this system matches on the meaning of the query. For example, searching for “something to clean the floor” returns relevant products like mops and cleaning solutions, even if those exact words aren’t in the product descriptions.

Key features

Semantic search

Vector similarity search using Google Gemini embeddings and PostgreSQL pgvector extension

RAG pattern

Retrieval-Augmented Generation ensures AI recommendations are grounded in real product data

Multi-LLM failover

Automatic failover from Google Gemini to Anthropic Claude if the primary provider is unavailable

Docker deployment

Production-ready Docker Compose configuration with PostgreSQL and pgvector

How it works

The system converts text into 3072-dimensional vectors using Google Gemini’s embedding model. Product vectors are stored in PostgreSQL and searched with the pgvector extension, which ranks them by cosine distance to the query vector to find the most semantically similar products. Once relevant products are retrieved, the RAG pattern feeds them as context to an LLM (Gemini or Claude), which is instructed to generate a natural language recommendation based only on the retrieved products. This grounding reduces the risk of hallucinated recommendations.
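To make the retrieval step concrete, here is a minimal sketch of ranking by cosine distance in plain Python. It is an illustration under stated assumptions, not the project's code: the 3-dimensional vectors are toy stand-ins for real 3072-dimensional embeddings, and the names `cosine_distance`, `top_k`, and `PRODUCTS` are hypothetical. In pgvector the same ordering is expressed with the `<=>` cosine distance operator in an `ORDER BY` clause.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec, products, k=2):
    """Return the names of the k products closest to the query vector."""
    ranked = sorted(
        products,
        key=lambda p: cosine_distance(query_vec, p["embedding"]),
    )
    return [p["name"] for p in ranked[:k]]


# Toy vectors (assumption): real embeddings would have 3072 dimensions.
PRODUCTS = [
    {"name": "Microfiber mop", "embedding": [0.9, 0.1, 0.0]},
    {"name": "Floor cleaning solution", "embedding": [0.8, 0.3, 0.1]},
    {"name": "LED desk lamp", "embedding": [0.0, 0.2, 0.9]},
]

# Pretend this is the embedding of "something to clean the floor".
query = [1.0, 0.2, 0.0]
print(top_k(query, PRODUCTS))
```

Cosine distance compares direction rather than magnitude, so two texts with similar meaning score as close even when their vectors differ in length.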
This project was built as a learning exercise to explore AI integrations, vector databases, and API development with Python.
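The grounding and failover steps described above can be sketched as follows. This is a hedged illustration, not the service's actual implementation: `ProviderError`, `build_grounded_prompt`, `generate_with_failover`, and the two fake provider functions are hypothetical names, and the fakes stand in for real Gemini and Claude client calls.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error raised when an LLM provider is unavailable."""


def build_grounded_prompt(query: str, products: Sequence[str]) -> str:
    """Build a prompt that restricts the model to the retrieved products."""
    context = "\n".join(f"- {p}" for p in products)
    return (
        "Recommend products using ONLY the catalog entries below.\n"
        f"Catalog:\n{context}\n"
        f"Customer request: {query}"
    )


def generate_with_failover(
    prompt: str,
    providers: List[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each provider in order; return (provider name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def fake_gemini(prompt: str) -> str:
    # Simulates the primary provider being down (assumption for the demo).
    raise ProviderError("503 service unavailable")


def fake_claude(prompt: str) -> str:
    # Simulates a successful response from the fallback provider.
    return "Try the microfiber mop with the floor cleaning solution."


prompt = build_grounded_prompt(
    "something to clean the floor",
    ["Microfiber mop", "Floor cleaning solution"],
)
used, answer = generate_with_failover(
    prompt, [("gemini", fake_gemini), ("claude", fake_claude)]
)
print(used, "->", answer)
```

Keeping the provider list ordered makes the primary and fallback explicit, and collecting the per-provider errors gives a useful message when every provider fails.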

Technology stack

  • Backend: FastAPI 0.134+, Python 3.13, SQLAlchemy 2.0
  • Database: PostgreSQL with pgvector extension
  • AI providers: Google Generative AI (Gemini), Anthropic (Claude)
  • Authentication: JWT-based with python-jose and passlib
  • Deployment: Docker and Docker Compose

Get started

Quickstart

Get a working example running in under 5 minutes

Installation

Detailed setup instructions for local development

Architecture overview

Understand how the system components work together

Environment setup

Configure environment variables and API keys
