Prerequisites
Before you begin, ensure you have the following installed:
- Python 3.10 or higher
- Conda (Anaconda or Miniconda)
- Git for cloning the repository
- DeepSeek API Key (or any OpenAI-compatible LLM API key)
DeenPAL uses the DeepSeek model via the OpenRouter API. You can sign up for a free API key at OpenRouter.
Installation
Follow these steps to install and set up DeenPAL:
Set up a virtual environment
Create and activate a new conda environment with Python 3.10:
Using a virtual environment ensures dependency isolation and prevents conflicts with other Python projects.
Configure environment variables
Create a .env file in the root directory and add your OpenRouter API key.
DeenPAL uses OpenRouter to access the DeepSeek model. Get your API key from OpenRouter.
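As a quick sanity check, the sketch below parses the contents of a .env file with the standard library and flags common mistakes, such as spaces around the equals sign or a missing key. The variable name OPENROUTER_API_KEY is an assumption for illustration only; use whatever name the project's code actually reads.

```python
from pathlib import Path

# Assumed variable name, for illustration only; check the project's code for the real one.
ASSUMED_KEY_NAME = "OPENROUTER_API_KEY"


def check_env_text(text: str, key_name: str = ASSUMED_KEY_NAME) -> list[str]:
    """Return a list of human-readable problems found in .env file contents."""
    problems = []
    found = False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=" not in line:
            problems.append(f"line {lineno}: expected NAME=value")
            continue
        name, value = line.split("=", 1)
        if name != name.strip() or value != value.strip():
            problems.append(f"line {lineno}: remove spaces around '='")
        if name.strip() == key_name:
            found = True
            if not value.strip():
                problems.append(f"line {lineno}: {key_name} is empty")
    if not found:
        problems.append(f"{key_name} is not defined")
    return problems


def check_env_file(path: str = ".env") -> list[str]:
    """Check a .env file on disk; a missing file is reported as a problem."""
    env_path = Path(path)
    if not env_path.is_file():
        return [f"{path} not found"]
    return check_env_text(env_path.read_text(encoding="utf-8"))
```

Calling check_env_file() from the project root returns an empty list when the file looks well formed under these assumptions.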
Install dependencies
Install the required Python packages. You can use either the pip or uv package manager.
The uv package manager is significantly faster than pip for dependency installation. If you’re installing frequently, consider using uv.
Prepare your data
Place your Hadith documents in the data/ directory in PDF format.
The original implementation uses Sahih Muslim and Sahih Bukhari books (all volumes) as the data source. You can use any Hadith PDF files that follow a similar structure with chapter and book numbering.
Expected PDF naming format:
- <prefix>_Sahih_Bukhari_Volume_1.pdf
- <prefix>_Sahih_Muslim_Volume_1.pdf
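To confirm that files follow this naming format before the first run, a small check like the one below may help. The regular expression is an assumption inferred from the two example names (any prefix, then Sahih_Bukhari or Sahih_Muslim, then a volume number); it is not taken from DeenPAL's own loader, which may accept other names.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from the example names; not DeenPAL's own loader logic.
NAME_PATTERN = re.compile(
    r"^(?P<prefix>.+)_Sahih_(?P<book>Bukhari|Muslim)_Volume_(?P<volume>\d+)\.pdf$"
)


def parse_pdf_name(filename: str):
    """Return (book, volume) if the name matches the assumed pattern, else None."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("book"), int(match.group("volume"))


def unexpected_pdfs(data_dir: str = "data") -> list[str]:
    """List PDF files in data_dir whose names do not match the assumed pattern."""
    return sorted(
        p.name for p in Path(data_dir).glob("*.pdf") if parse_pdf_name(p.name) is None
    )
```

Running unexpected_pdfs() from the project root lists any PDFs in data/ that deviate from the assumed pattern.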
Running DeenPAL
Once installation is complete, you can start the chatbot:
Start the Streamlit application
Run the following command in your terminal: streamlit run app.py
The first run will take longer as the system loads PDFs, generates embeddings, and initializes the ChromaDB vector store. Subsequent runs will be much faster due to caching.
Access the chatbot interface
Open your web browser and navigate to http://localhost:8501. You should see the Deen Pal Chatbot interface.
Streamlit runs on port 8501 by default. If you need to use a different port, run streamlit run app.py --server.port 8080.
Ask your first question
Try asking a question in the chat input at the bottom of the page. For example:
Example Query
“What does the Hadith say about prayer?”
The chatbot will:
- Retrieve relevant Hadiths from the database
- Display each Hadith with source citations (book number, hadith number, chapter)
- Provide a brief explanation for each Hadith
- Generate a concise answer to your question
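The shape of a displayed result can be pictured with a small sketch. The HadithResult record and its field names (source, book_number, hadith_number, chapter, text) are hypothetical, chosen to mirror the citation fields listed above; DeenPAL's actual data model and output layout may differ.

```python
from dataclasses import dataclass


@dataclass
class HadithResult:
    # Hypothetical record; field names mirror the citation fields in the list above.
    source: str
    book_number: int
    hadith_number: int
    chapter: str
    text: str


def format_citation(h: HadithResult) -> str:
    """Build a one-line source citation for a retrieved Hadith."""
    return (
        f"{h.source}, Book {h.book_number}, Hadith {h.hadith_number}, "
        f"Chapter: {h.chapter}"
    )


def format_result(h: HadithResult) -> str:
    """Combine the citation and the Hadith text into a display block."""
    return f"[{format_citation(h)}]\n{h.text}"
```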
Understanding the First Run
During the first execution, DeenPAL performs several initialization steps: it loads the PDFs, generates embeddings, and initializes the ChromaDB vector store.
The @st.cache_resource decorator ensures this data loading happens only once while the app is running, rather than on every rerun, which significantly improves response times for subsequent queries.
What Happens Behind the Scenes
When you submit a query, here’s what DeenPAL does:
- Semantic Search: Your query is converted to an embedding and compared against the Hadith database
- MMR Retrieval: The system retrieves the top 4 diverse results from 10 candidates using Maximal Marginal Relevance
- Context Building: Retrieved Hadiths are formatted with their metadata
- LLM Generation: The DeepSeek model generates a response based on the retrieved context
- Response Display: The answer is shown with proper Hadith citations and explanations
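The MMR step can be illustrated with a minimal, dependency-free sketch. It assumes cosine similarity over plain Python lists and a trade-off weight lambda_mult of 0.5; both are illustrative choices rather than DeenPAL's confirmed settings. The defaults k=4 and fetch_k=10 mirror the "top 4 from 10 candidates" described above.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, docs, k=4, fetch_k=10, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    query: embedding of the question; docs: list of document embeddings.
    lambda_mult=0.5 is an assumed trade-off between relevance and diversity.
    """
    relevance = [cosine(query, d) for d in docs]
    # Step 1: keep the fetch_k candidates most similar to the query.
    ranked = sorted(range(len(docs)), key=lambda i: relevance[i], reverse=True)
    candidates = ranked[:fetch_k]
    selected = []
    # Step 2: greedily add the candidate with the best relevance/diversity balance.
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lambda_mult=1.0 the function reduces to plain top-k similarity search; lower values penalize candidates that closely resemble Hadiths already selected, which is what makes the results diverse.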
chains.py:15-18:
Troubleshooting
Streamlit doesn't start on port 8080
By default, Streamlit uses port 8501. If you need to use port 8080, run streamlit run app.py --server.port 8080.
API key not found error
Ensure your .env file is in the root directory (same level as app.py) and contains your OpenRouter API key. Make sure there are no extra spaces around the equals sign.
No PDFs found in data/ directory
Verify that:
- The data/ directory exists in the project root
- Your PDF files are placed directly in the data/ directory
- The PDF files are readable and not corrupted
Slow first run
This is expected behavior. The first run involves:
- Loading all PDF documents
- Downloading the HuggingFace embedding model (sentence-transformers/all-MiniLM-L6-v2)
- Generating embeddings for all chunks
- Initializing the ChromaDB vector store
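Why this cost is paid only on the first run can be sketched with a standard-library stand-in. The example uses functools.lru_cache in place of Streamlit's @st.cache_resource so that it runs without Streamlit installed; load_vector_store is a hypothetical function name, and its body is only a placeholder for the steps listed above.

```python
import functools

# Counts how many times the expensive setup actually runs.
load_calls = 0


@functools.lru_cache(maxsize=None)  # stand-in for @st.cache_resource
def load_vector_store(data_dir: str = "data") -> dict:
    """Hypothetical loader: placeholder for PDF loading, embedding, and indexing."""
    global load_calls
    load_calls += 1
    # In a real app this would be the slow part (PDFs, embeddings, vector store).
    return {"data_dir": data_dir, "ready": True}


first = load_vector_store()   # runs the expensive setup
second = load_vector_store()  # returned from the cache; setup is not repeated
```

Both calls return the same object, and the setup body executes once. In Streamlit, the object cached by @st.cache_resource is likewise reused across reruns for as long as the server process stays alive.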
Next Steps
Architecture
Learn about the technical architecture and how RAG works in DeenPAL.
Configuration
Customize the retrieval parameters, models, and prompts.
