System Requirements
Python
Python 3.8 or higher required
Memory
At least 4 GB RAM recommended
Storage
At least 500MB for dependencies and vector database
API Key
Valid OpenAI API key with access to GPT models
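The requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name `check_requirements` is hypothetical, and `OPENAI_API_KEY` is assumed to be the variable that holds the key (it is the name the OpenAI client reads by default). Available RAM is not checked because the standard library has no portable call for it.

```python
import os
import shutil
import sys

MIN_PYTHON = (3, 8)               # "Python 3.8 or higher"
MIN_FREE_BYTES = 500 * 1024 ** 2  # "At least 500MB"


def check_requirements(path=".", env=None):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    if sys.version_info < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {found} is older than 3.8")

    # Free space on the volume that contains the project directory.
    free = shutil.disk_usage(path).free
    if free < MIN_FREE_BYTES:
        problems.append(f"only {free // 1024 ** 2} MB free, 500 MB needed")

    # Assumption: the key is supplied through OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

This only confirms that a key is present; whether the key is valid and has access to GPT models can only be confirmed by calling the API.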
Installation Steps
Create a virtual environment
Use Python's built-in venv module (recommended). A virtual environment isolates project dependencies and prevents conflicts with other Python projects.
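The usual way to do this is `python -m venv venv` on the command line. The same step can be scripted through the standard-library `venv` module, as in the sketch below; the directory name `venv` and the helper name `create_virtualenv` are assumptions, not names taken from the project.

```python
import venv
from pathlib import Path


def create_virtualenv(target="venv", with_pip=True):
    """Create a virtual environment, roughly like `python -m venv <target>`.

    Returns the path of the pyvenv.cfg file that marks the environment.
    """
    target = Path(target)
    venv.create(target, with_pip=with_pip)
    return target / "pyvenv.cfg"


if __name__ == "__main__":
    print("created", create_virtualenv("venv"))
```

Creating the environment does not activate it. Activation is still a shell step: `source venv/bin/activate` on macOS and Linux, or `venv\Scripts\activate` on Windows.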
Install dependencies
With your virtual environment activated, install all required packages. This installs the following key dependencies:
Core Framework
- streamlit==1.49.1 - Web application framework
- python-dotenv==1.1.1 - Environment variable management
LangChain Stack
- langchain==0.3.27 - RAG orchestration framework
- langchain-core==0.3.75 - Core abstractions
- langchain-openai==0.3.32 - OpenAI integration
- langchain-chroma==0.2.5 - ChromaDB vector store
- langchain-community==0.3.29 - Community integrations
- langchain-text-splitters==0.3.11 - Document chunking
AI/ML Services
- openai==1.106.1 - OpenAI API client
- chromadb==1.0.20 - Vector database
- tiktoken==0.11.0 - Token counting for OpenAI models
Document Processing
- pypdf - PDF text extraction (via langchain-community dependencies)
Installation may take 2-5 minutes depending on your connection speed. The total size is approximately 200-300MB.
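After installation, it can be useful to confirm that the active environment contains the pinned versions listed above. The sketch below compares them against what `importlib.metadata` reports; the helper name `check_pins` is hypothetical, and the pin table is copied from the list in this section.

```python
from importlib import metadata

# Pinned versions as listed in this guide.
PINNED = {
    "streamlit": "1.49.1",
    "python-dotenv": "1.1.1",
    "langchain": "0.3.27",
    "langchain-core": "0.3.75",
    "langchain-openai": "0.3.32",
    "langchain-chroma": "0.2.5",
    "langchain-community": "0.3.29",
    "langchain-text-splitters": "0.3.11",
    "openai": "1.106.1",
    "chromadb": "1.0.20",
    "tiktoken": "0.11.0",
}


def check_pins(pins=PINNED):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins().items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is present at the expected version.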
Configure environment variables
Create your environment configuration, then edit the .env file with your OpenAI API key. The application loads these variables at startup (app.py:10-13).
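As a rough illustration of what loading at startup can look like, the sketch below uses python-dotenv's `load_dotenv()` and then reads the key from the environment. It is not a copy of app.py:10-13. The variable name `OPENAI_API_KEY` is an assumption (it is the name the OpenAI client looks for by default), and the helper names are hypothetical. Under that assumption, the .env file would contain a single line of the form `OPENAI_API_KEY=<your key>`.

```python
import os


def load_env_file():
    """Copy entries from a .env file into os.environ using python-dotenv."""
    # Imported lazily so get_openai_key() can be used without python-dotenv.
    from dotenv import load_dotenv

    return load_dotenv()


def get_openai_key(env=None):
    """Return the API key, or raise a clear error when it is missing."""
    env = os.environ if env is None else env
    # Assumption: the key is stored under OPENAI_API_KEY.
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is missing; add it to your .env file")
    return key


if __name__ == "__main__":
    load_env_file()
    print("key loaded, length:", len(get_openai_key()))
```

Failing early with an explicit message tends to be easier to debug than an authentication error raised later by the first API call.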
Running the Application
Start the Streamlit server. The application is then available at http://localhost:8501.
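On the command line this is normally `streamlit run app.py`. The sketch below wraps the same launch in Python and falls back to the next port when 8501 is taken, which also covers the "port already in use" case from the troubleshooting section. The helper names are hypothetical; `--server.port` is Streamlit's option for choosing the port.

```python
import socket
import subprocess
import sys

DEFAULT_PORT = 8501  # Streamlit's default port


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is bound to host:port (an approximate check)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def streamlit_command(script="app.py", port=DEFAULT_PORT):
    """Build the argument list for `python -m streamlit run <script>`."""
    return [sys.executable, "-m", "streamlit", "run", script,
            "--server.port", str(port)]


def launch(script="app.py", first_port=DEFAULT_PORT, attempts=10):
    """Start Streamlit on the first free port at or above first_port."""
    for port in range(first_port, first_port + attempts):
        if is_port_free(port):
            return subprocess.Popen(streamlit_command(script, port))
    raise RuntimeError(f"no free port in {first_port}..{first_port + attempts - 1}")


if __name__ == "__main__":
    launch().wait()
```

The probe binds only to the loopback address, so it is a quick heuristic and not a guarantee that Streamlit will be able to bind the port.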
Configuration Options
The app's page settings are defined in app.py:88-91. Edit app.py to change the browser tab title and icon.
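In Streamlit, the browser tab title and icon are set with `st.set_page_config`. The sketch below shows the general shape of such a call; the title, icon, and layout values are placeholders chosen for illustration and are not the values used in app.py:88-91.

```python
# Placeholder values for illustration only; the real ones live in app.py.
PAGE_CONFIG = {
    "page_title": "RAG Chat",  # text shown in the browser tab
    "page_icon": "📄",         # emoji or image shown next to the title
    "layout": "centered",      # "centered" or "wide"
}


def apply_page_config(config=PAGE_CONFIG):
    """Apply the page settings; call this near the top of the Streamlit script."""
    import streamlit as st  # imported lazily so the dict can be inspected alone

    st.set_page_config(**config)
```

Keeping the values in one dictionary makes it easy to see and change everything that affects the page chrome in a single place.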
Vector Database Setup
RAG Chat uses ChromaDB for persistent vector storage (app.py:16-23). On first run:
- A db/ directory is automatically created in your project root
- When you upload documents, embeddings are stored in this directory
- The vector store persists across application restarts
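Because the persistent state is an ordinary directory on disk, it can be inspected or cleared with standard-library calls. The sketch below is a pair of hypothetical helpers, not part of app.py; it assumes the store lives in `db/` relative to the current directory and that the application is stopped before the directory is removed.

```python
import shutil
from pathlib import Path


def store_size_bytes(db_dir="db"):
    """Total size of all files under db_dir, or 0 if it does not exist yet."""
    root = Path(db_dir)
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def reset_store(db_dir="db"):
    """Delete the vector store directory. Returns True if something was removed."""
    root = Path(db_dir)
    if root.is_dir():
        shutil.rmtree(root)
        return True
    return False


if __name__ == "__main__":
    print("vector store size:", store_size_bytes(), "bytes")
```

A size of 0 before the first upload and a non-zero size afterwards is a simple way to confirm that embeddings are being persisted.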
If you want to start fresh, simply delete the db/ directory. It will be recreated when you upload new documents.
Document Processing Configuration
The application processes PDFs with these default settings (app.py:33-36):
- chunk_size=1000 - Each document chunk contains up to 1000 characters
- chunk_overlap=400 - Consecutive chunks overlap by 400 characters to preserve context
Adjust these values in app.py based on your document types.
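To make the interaction between the two settings concrete, the sketch below implements a plain fixed-window splitter. This is a deliberate simplification and not the splitter the application uses: the LangChain text splitters also try to break on separators such as paragraphs and sentences. What it does show is the arithmetic, namely that each new chunk starts `chunk_size - chunk_overlap` characters after the previous one.

```python
def split_fixed(text, chunk_size=1000, chunk_overlap=400):
    """Split text into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # 600 with the defaults
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2500))
    print([len(chunk) for chunk in split_fixed(sample)])
```

For a 2,500-character input, the defaults give four chunks starting at offsets 0, 600, 1200, and 1800. A larger overlap preserves more context across chunk boundaries, at the cost of storing and embedding more duplicated text.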
Troubleshooting
Import Errors
If you see ModuleNotFoundError, make sure your virtual environment is activated.
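One way to diagnose this from inside Python is to check whether the interpreter is running in a virtual environment and which modules it cannot find. The sketch below uses hypothetical helper names; the module list reflects the import names of the dependencies in this guide (for example, python-dotenv is imported as `dotenv`).

```python
import importlib.util
import sys

# Top-level import names for the packages this guide installs.
REQUIRED_MODULES = [
    "streamlit", "dotenv", "langchain", "langchain_openai",
    "langchain_chroma", "openai", "chromadb", "tiktoken", "pypdf",
]


def in_virtualenv():
    """True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
    print("missing modules:", missing_modules() or "none")
```

If the environment is not active, or the interpreter path points outside the project, activate the environment and reinstall the dependencies there.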
API Key Issues
If you get authentication errors:
- Verify your API key is correctly set in .env
- Check that .env is in the project root directory
- Restart the application after editing .env
- Verify your OpenAI account has available credits
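The first two checks in the list above can be automated. The sketch below is a hypothetical helper that looks for a .env file in a given project root and reports whether it defines a non-empty key. It assumes the variable is named `OPENAI_API_KEY` and handles only simple `NAME=value` lines, so it is a diagnostic aid and not a full .env parser.

```python
from pathlib import Path


def diagnose_env_file(project_root=".", var="OPENAI_API_KEY"):
    """Return "ok" or a short description of what is wrong with .env."""
    env_path = Path(project_root) / ".env"
    if not env_path.is_file():
        return f"not found: {env_path}"

    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        if line.startswith("export "):
            line = line[len("export "):]
        name, _, value = line.partition("=")
        if name.strip() != var:
            continue
        value = value.strip().strip("'\"")  # drop optional quotes
        return "ok" if value else f"{var} is empty"

    return f"{var} is not defined in {env_path}"


if __name__ == "__main__":
    print(diagnose_env_file())
```

An "ok" result only means the file is in place and the key is non-empty; an invalid key or an account without credits still has to be checked against the OpenAI API.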
ChromaDB Errors
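If you encounter database corruption, stop the application and delete the db/ directory; it is recreated when you upload new documents.
Port Already in Use
If port 8501 is already in use, stop the process that is using it or start Streamlit on a different port.
Upgrading Dependencies
To update to the latest compatible versions, upgrade the installed packages with pip.
Development Setup
For development work, you may want additional tools.
Next Steps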
Quickstart
Start asking questions about your documents
Configuration
Customize chunking, models, and embeddings
RAG Overview
Learn how the RAG pipeline works
API Reference
Detailed function documentation