Overview
The RAG (Retrieval-Augmented Generation) Chat API enables intelligent question-answering over uploaded documents. Documents are processed, chunked, embedded, and indexed for semantic search, then combined with Google Gemini AI for accurate responses.
Base Path: /rag-chat/
Key Features
Document Upload
Upload PDF documents for indexing
Vector Search
Semantic similarity search using embeddings
AI Responses
Google Gemini-powered contextual answers
Query History
Track and retrieve past queries
Endpoints
Upload Document
POST /rag-chat/api/upload/
Upload and process a PDF document for RAG indexing
Endpoint: /rag-chat/api/upload/
Method: POST
Content-Type: multipart/form-data
Auth: Optional (user tracked if authenticated)
Implemented in rag_chat/views.py:105 as UploadDocumentView.
Parameters:
- PDF file to upload (must have .pdf extension)
- Display name for the document (defaults to filename)
Processing Pipeline
- Validation - Checks file type (PDF only)
- Database Record - Creates DocumentCollection with status processing
- Text Extraction - Extracts text from PDF pages
- Chunking - Splits text into 500-char chunks with 50-char overlap
- Embeddings - Generates vector embeddings for each chunk
- Indexing - Stores chunks and embeddings in database
- Status Update - Sets status to indexed or error
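The chunking step above can be sketched as follows. This is a minimal illustration of fixed-size chunking with overlap using the documented parameters (500-char chunks, 50-char overlap), not the project's actual DocumentLoader code:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks whose boundaries overlap.

    The overlap means the last `overlap` characters of one chunk are
    repeated at the start of the next, so sentences cut at a boundary
    still appear whole in at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Each chunk would then be embedded and stored as one DocumentChunk row.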
Example Request
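A request body for this endpoint could be assembled as below. This is a sketch using only the standard library; the form field names "file" and "title" are assumptions based on the parameter descriptions above, not confirmed names:

```python
import io
import uuid


def build_upload_body(filename: str, pdf_bytes: bytes, title: str):
    """Build a multipart/form-data body for POST /rag-chat/api/upload/."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()

    def add_part(headers: str, payload: bytes) -> None:
        buf.write(f"--{boundary}\r\n{headers}\r\n\r\n".encode())
        buf.write(payload)
        buf.write(b"\r\n")

    # Assumed form field names: "file" for the PDF, "title" for the display name.
    add_part(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf",
        pdf_bytes,
    )
    add_part('Content-Disposition: form-data; name="title"', title.encode())
    buf.write(f"--{boundary}--\r\n".encode())
    return f"multipart/form-data; boundary={boundary}", buf.getvalue()
```

Pass the returned content type as the Content-Type header and the bytes as the request body with any HTTP client.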
Success Response (201)
- Unique identifier for the uploaded document collection
- Document title
- Processing status: indexed, processing, pending, or error
- Number of text chunks created from the document
- Total pages in the PDF document
Error Response (400)
Error Response (500)
Query RAG
POST /rag-chat/api/query/
Ask a question and get an AI-powered answer based on uploaded documents
Endpoint: /rag-chat/api/query/
Method: POST
Content-Type: application/json
Auth: Optional (query saved if authenticated)
Implemented in rag_chat/views.py:190 as QueryRAGView.
Parameters:
- ID of the document collection to query
- User's question (cannot be empty)
- Number of relevant chunks to retrieve (1-10 recommended)
How It Works
- Query Embedding - Converts user question to vector
- Similarity Search - Finds top-k most relevant document chunks
- Context Building - Constructs prompt with relevant passages
- AI Generation - Calls Google Gemini API with context
- Response - Returns answer with source citations
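The context-building step (3) can be sketched as follows; the exact prompt wording and chunk dictionary keys ("content", "page") are illustrative assumptions:

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble a grounded prompt from retrieved chunks (step 3 above)."""
    passages = "\n\n".join(
        f"[Source {i + 1}, page {c.get('page', '?')}]\n{c['content']}"
        for i, c in enumerate(chunks)
    )
    return (
        "Answer the question using only the passages below. "
        "Cite sources by number.\n\n"
        f"{passages}\n\nQuestion: {question}\nAnswer:"
    )
```

The resulting string is what gets sent to Gemini in step 4, which is why answers can carry source citations.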
Example Request
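A query request could be constructed like this. The JSON field names (collection_id, query, top_k) and the localhost development host are assumptions; the sketch builds the request without sending it:

```python
import json
import urllib.request

# Assumed field names and host; adjust to your deployment.
payload = {"collection_id": 1, "query": "What does the report conclude?", "top_k": 5}
req = urllib.request.Request(
    "http://localhost:8000/rag-chat/api/query/",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# To actually send it: response = urllib.request.urlopen(req)
```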
Success Response (200)
- AI-generated answer based on document context
- Array of source chunks used to generate the answer, each with:
  - Excerpt from the relevant document chunk (truncated to 200 chars)
  - Page number(s) where this content appears
  - Similarity score (0-1) indicating relevance
- Title of the document collection queried
Error Responses
400 - Bad Request
List Documents
GET /rag-chat/api/documents/
Retrieve all uploaded document collections
Endpoint: /rag-chat/api/documents/
Method: GET
Auth: Optional (returns all documents)
Implemented in rag_chat/views.py:272 as ListDocumentsView.
Example Request
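A listing request and a typical use of the response could look like this; the response key "documents" and the per-object field names are assumptions for illustration:

```python
import urllib.request

req = urllib.request.Request(
    "http://localhost:8000/rag-chat/api/documents/", method="GET"
)
# resp = json.load(urllib.request.urlopen(req))

# Sample response shape (field names assumed), e.g. to find queryable docs:
sample = {
    "documents": [
        {"id": 1, "title": "Report", "status": "indexed", "chunk_count": 42},
        {"id": 2, "title": "Draft", "status": "processing", "chunk_count": 0},
    ]
}
ready = [d for d in sample["documents"] if d["status"] == "indexed"]
```

Only collections with status indexed can be queried, so filtering on status before querying avoids the "collection not ready" error described below.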
Success Response (200)
- Array of document collection objects, each containing:
  - Document collection ID
  - Document title
  - Processing status: pending, processing, indexed, or error
  - Number of pages in the document
  - Number of indexed chunks
  - ISO 8601 timestamp of upload
  - Error message if status is error

Delete Document
DELETE /rag-chat/api/document/<int:collection_id>/
Remove a document collection and all associated chunks
Endpoint: /rag-chat/api/document/<collection_id>/
Method: DELETE
Auth: Optional
Implemented in rag_chat/views.py:300 as DeleteDocumentView.
Parameters:
- collection_id - ID of the document collection to delete
Deleting a collection cascades to delete all associated DocumentChunk records and removes the physical file from storage.
Example Request
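A delete request could be built as follows (localhost host assumed; the request is constructed but not sent):

```python
import urllib.request

collection_id = 1  # ID of the collection to remove
req = urllib.request.Request(
    f"http://localhost:8000/rag-chat/api/document/{collection_id}/",
    method="DELETE",
)
# To actually send it: urllib.request.urlopen(req)
```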
Success Response (200)
Confirmation message
Error Responses
404 - Not Found
Query History
GET /rag-chat/api/history/
Retrieve past RAG queries (for all users or current user)
Endpoint: /rag-chat/api/history/
Method: GET
Auth: Optional (returns all history)
Implemented in rag_chat/views.py:333 as QueryHistoryView.
Parameters:
- Maximum number of queries to return
Example Request
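A history request with a result cap could be built like this; "limit" as the query-parameter name is an assumption based on the parameter description above:

```python
import urllib.parse
import urllib.request

# "limit" as the parameter name is assumed, not confirmed by the API.
params = urllib.parse.urlencode({"limit": 10})
req = urllib.request.Request(
    f"http://localhost:8000/rag-chat/api/history/?{params}",
    method="GET",
)
```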
Success Response (200)
- Array of past query objects, each containing:
  - Query record ID
  - User's original question
  - AI-generated response (truncated to 200 chars)
  - Title of the document collection queried
  - ISO 8601 timestamp of the query
Data Models
DocumentCollection
Defined in rag_chat/models.py:6
| Field | Type | Description |
|---|---|---|
| id | Integer | Primary key |
| user | ForeignKey | Uploader (nullable) |
| title | String | Document name |
| file | FileField | PDF file path |
| file_type | String | File format (default: pdf) |
| page_count | Integer | Total pages |
| chunk_count | Integer | Total chunks indexed |
| status | String | pending, processing, indexed, error |
| error_message | Text | Error details |
| created_at | DateTime | Upload timestamp |
| updated_at | DateTime | Last modification |
DocumentChunk
Defined in rag_chat/models.py:37
| Field | Type | Description |
|---|---|---|
| id | Integer | Primary key |
| collection | ForeignKey | Parent document |
| chunk_index | Integer | Position in document |
| content | Text | Chunk text |
| embedding | JSONField | Vector embedding (list of floats) |
| metadata | JSONField | Page numbers, sections, etc. |
| created_at | DateTime | Creation timestamp |
RAGQuery
Defined in rag_chat/models.py:74
| Field | Type | Description |
|---|---|---|
| id | Integer | Primary key |
| user | ForeignKey | Querying user |
| collection | ForeignKey | Document queried |
| query | Text | User question |
| response | Text | AI-generated answer |
| chunks_used | JSONField | Array of chunk IDs |
| created_at | DateTime | Query timestamp |
Technical Details
Embeddings Generation
The system uses sentence transformers to generate embeddings. See rag_chat/embeddings.py for the implementation.
Vector Search
Database-based cosine similarity search via DatabaseVectorStore in rag_chat/vector_store.py.
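The core of such a database-backed vector search can be sketched as below: load each chunk's stored embedding, score it against the query embedding with cosine similarity, and keep the top k. This is an illustration of the technique, not the actual DatabaseVectorStore code:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunks: list[dict], k: int = 5):
    """Rank stored chunks by similarity to the query embedding, best first."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Because embeddings live in a JSONField, scoring happens in Python rather than in the database engine, which is simple but scales linearly with the number of chunks.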
Google Gemini Integration
Implemented in rag_chat/views.py:27 as _call_google_api_with_context().
Model: gemini-2.0-flash
Endpoint:
https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent
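A call to this endpoint could be prepared as below, following the public generateContent request shape (a "contents" list of "parts" with "text", and the API key passed as a query parameter). The sketch builds the request without sending it; how the project's _call_google_api_with_context() actually formats its call may differ:

```python
import json
import os
import urllib.request

API_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.0-flash:generateContent"
)


def build_gemini_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for the given prompt."""
    key = os.environ.get("GOOGLE_API_KEY", "")
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{API_URL}?key={key}",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The generated answer is read from the response's candidates list.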
Document Processing
Chunk Size: 500 characters
Overlap: 50 characters
Supported Formats: PDF only
See rag_chat/document_loader.py for the DocumentLoader implementation.
Example Workflow
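A full upload-poll-query cycle could be wired together as below. The HTTP transport is injected as a `send` callable so the sketch stays client-agnostic; the response and request field names (id, status, documents, answer, collection_id, query, top_k) are illustrative assumptions:

```python
import time


def rag_workflow(send, pdf_path: str, question: str) -> str:
    """End-to-end flow: upload a PDF, wait until indexed, then query it.

    `send(method, path, **kwargs)` is any HTTP helper returning parsed JSON.
    Field names below are assumptions for illustration.
    """
    # 1. Upload the document; processing starts in the background.
    doc = send("POST", "/rag-chat/api/upload/", file=pdf_path)
    collection_id = doc["id"]

    # 2. Poll the document list until status leaves pending/processing.
    while True:
        docs = send("GET", "/rag-chat/api/documents/")["documents"]
        status = next(d["status"] for d in docs if d["id"] == collection_id)
        if status in ("indexed", "error"):
            break
        time.sleep(1)  # give indexing time before polling again

    # 3. Ask the question against the indexed collection.
    answer = send(
        "POST", "/rag-chat/api/query/",
        json={"collection_id": collection_id, "query": question, "top_k": 5},
    )
    return answer["answer"]
```

Because `send` is a parameter, the flow can be exercised with a stub in tests and with a real HTTP client in production.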
Error Handling
Common Issues
'GOOGLE_API_KEY no configurada' ("GOOGLE_API_KEY not configured")
Cause: Missing environment variable
Solution: Set GOOGLE_API_KEY in .env or the environment

'Solo PDFs soportados' ("Only PDFs supported")
Cause: Uploaded file is not a PDF
Solution: Convert the document to PDF format

'Colección no lista (status: processing)' ("Collection not ready")
Cause: Document still being indexed
Solution: Wait for processing to complete, then retry

'No encontré información relevante' ("No relevant information found")
Cause: No chunks matched the query above the similarity threshold
Solution: Rephrase the query or upload more comprehensive documents
Next Steps
Authentication
Learn about user authentication
API Overview
Explore other API endpoints
