What is a knowledge base?
A knowledge base in Iqra AI is a structured collection of documents that enables your agents to access and retrieve relevant information during conversations. Using Retrieval-Augmented Generation (RAG), agents can ground their responses in your organization’s specific knowledge, reducing hallucinations and providing accurate, contextual answers.
Knowledge bases are particularly powerful for customer support, technical documentation, and domain-specific assistance where accurate information retrieval is critical.
How RAG works in Iqra AI
The RAG pipeline in Iqra AI follows a multi-stage process:
Document ingestion
Documents are uploaded and processed through extractors that support multiple formats (PDF, text, and more). The system uses the Unstructured API for complex document parsing.
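The ingestion step can be pictured as a dispatcher that routes each file to a format-specific extractor. The sketch below is illustrative only (the function name and format handling are assumptions, not the actual Iqra AI API); a real deployment would hand complex formats such as PDF to a dedicated parser or the Unstructured API.

```python
def extract_text(filename: str, data: bytes) -> str:
    """Route a document to a format-specific extractor based on its extension."""
    suffix = filename.rsplit(".", 1)[-1].lower()
    if suffix in ("txt", "md"):
        # Plain-text formats can be decoded directly.
        return data.decode("utf-8")
    if suffix == "pdf":
        # A real pipeline would pass the bytes to a PDF parser or the Unstructured API.
        raise NotImplementedError("PDF extraction requires an external parser")
    raise ValueError(f"unsupported format: .{suffix}")
```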
Text chunking
Documents are split into manageable chunks using configurable strategies:
- General chunking: Splits text by delimiter with configurable overlap
- Parent-child chunking: Creates hierarchical chunks for better context preservation
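To make the overlap idea concrete, here is a minimal sketch of general chunking as a sliding window (a simplification of the delimiter-based splitter described above; the function name and defaults are assumptions for illustration):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap` characters
    with the previous chunk so context is not cut mid-thought."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Each chunk's first `overlap` characters repeat the tail of the previous chunk, which keeps sentences that straddle a boundary retrievable from either side.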
Embedding generation
Each chunk is converted into a vector embedding using your configured embedding provider (currently supports Google Gemini). Embeddings are cached to improve performance and reduce API costs.
Vector storage
Embeddings are stored in Milvus, a high-performance vector database designed for similarity search at scale. Collections are dynamically loaded and unloaded based on usage.
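Similarity search over stored embeddings can be illustrated with a toy in-memory store; this is a stand-in for Milvus, not its actual client API, and the class and method names below are assumptions:

```python
import math

class InMemoryVectorStore:
    """Toy vector store: keeps (id, vector) pairs and returns the
    top-k entries by cosine similarity to a query vector."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def insert(self, doc_id: str, vector: list[float]) -> None:
        self.items.append((doc_id, vector))

    @staticmethod
    def _cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        scored = [(doc_id, self._cosine(query, v)) for doc_id, v in self.items]
        scored.sort(key=lambda s: s[1], reverse=True)
        return scored[:top_k]
```

A production system like Milvus replaces the linear scan with approximate nearest-neighbor indexes so the same query stays fast over millions of vectors.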
Retrieval
When an agent receives a query, the system:
- Generates an embedding for the query
- Searches the vector database for similar chunks
- Optionally searches keyword indices for exact matches
- Combines results using hybrid search strategies
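The final combination step can be sketched as a weighted merge of the two result lists. This is one common hybrid strategy (a weighted score sum), not necessarily the exact formula Iqra AI uses; the `alpha` parameter and function name are assumptions:

```python
def hybrid_merge(
    vector_hits: list[tuple[str, float]],
    keyword_hits: list[tuple[str, float]],
    alpha: float = 0.7,
) -> list[tuple[str, float]]:
    """Merge vector and keyword results into one ranking.
    `alpha` weights the semantic side; (1 - alpha) weights keyword matches."""
    scores: dict[str, float] = {}
    for doc_id, score in vector_hits:
        scores[doc_id] = alpha * score
    for doc_id, score in keyword_hits:
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - alpha) * score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Documents found by both searches accumulate score from each side, which is why hybrid search tends to surface results that are both semantically and lexically relevant.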
Post-processing
Retrieved chunks undergo reranking and reordering to optimize relevance. The system supports:
- Rerank models for improved precision
- Lost-in-the-middle reordering to combat position bias
- Score thresholding to filter low-quality results
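Two of these steps — thresholding and lost-in-the-middle reordering — are simple enough to sketch directly (the function name and default threshold are assumptions; the reordering follows the common pattern of placing the strongest chunks at both ends of the context, where models attend best):

```python
def postprocess(
    hits: list[tuple[str, float]],
    score_threshold: float = 0.2,
) -> list[tuple[str, float]]:
    """Drop low-scoring hits, then reorder the rest so the best chunks sit at
    the edges of the context window and the weakest land in the middle.
    `hits` is assumed sorted by score, best first."""
    kept = [h for h in hits if h[1] >= score_threshold]
    front: list[tuple[str, float]] = []
    back: list[tuple[str, float]] = []
    for i, hit in enumerate(kept):
        # Alternate: ranks 1, 3, 5, ... go to the front; 2, 4, 6, ... to the back.
        (front if i % 2 == 0 else back).append(hit)
    return front + back[::-1]
```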
Key features
Multiple retrieval strategies
Iqra AI supports three retrieval approaches:
- Vector search: Semantic similarity using embeddings
- Full-text search: Keyword-based matching for exact terms
- Hybrid search: Combines both approaches for maximum recall
Intelligent chunking
Choose between chunking strategies based on your content:
General chunking
Simple recursive text splitting with configurable chunk size and overlap. Ideal for uniformly structured content.
Parent-child chunking
Hierarchical chunking that retrieves small chunks but provides larger parent context. Better for complex documents.
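The parent-child relationship can be sketched as a two-level index: retrieval matches on the small children, but the larger parent is what gets handed to the model. The function name, sizes, and character-based splitting below are simplifying assumptions for illustration:

```python
def build_parent_child_index(
    document: str,
    parent_size: int = 1000,
    child_size: int = 200,
) -> tuple[list[str], list[tuple[str, int]]]:
    """Split a document into large parent chunks, then split each parent
    into small children. Returns the parents and a list of
    (child_text, parent_index) pairs for lookup after retrieval."""
    parents: list[str] = []
    children: list[tuple[str, int]] = []
    for p_start in range(0, len(document), parent_size):
        parent = document[p_start:p_start + parent_size]
        p_idx = len(parents)
        parents.append(parent)
        for c_start in range(0, len(parent), child_size):
            children.append((parent[c_start:c_start + child_size], p_idx))
    return parents, children
```

When a child chunk matches a query, `parents[parent_index]` supplies the surrounding context, so precise matching and rich context are no longer in tension.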
Embedding cache
The system automatically caches embeddings to:
- Reduce API calls to embedding providers
- Improve query latency
- Lower operational costs
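A minimal sketch of such a cache, assuming a content-addressed key (SHA-256 of the text) and an injected embedding function standing in for the provider call — the class and attribute names are illustrative, not the actual Iqra AI internals:

```python
import hashlib

class CachingEmbedder:
    """Wraps an embedding function with a content-addressed in-memory cache,
    so identical text is only ever embedded once."""

    def __init__(self, embed_fn) -> None:
        self.embed_fn = embed_fn  # e.g. a call to the provider's embeddings API
        self.cache: dict[str, list[float]] = {}

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.cache[key] = self.embed_fn(text)
        return self.cache[key]
```

Repeated queries and re-ingested documents hit the cache instead of the provider, which is where the latency and cost savings come from.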
Dynamic collection management
Milvus collections are automatically loaded into memory when needed and released after a configurable expiry period. This ensures optimal memory usage while maintaining fast query performance.
Benefits for agents
When you link a knowledge base to an agent, it gains several capabilities:
- Accurate responses: Answers are grounded in verified information rather than model training data
- Source attribution: Each response can cite specific documents and chunks
- Domain expertise: Agents can handle specialized topics without fine-tuning
- Up-to-date information: Knowledge bases can be updated without retraining models
- Reduced hallucinations: Retrieval constrains the agent to information found in your documents
Architecture overview
The knowledge base system is built on several key components:
- KnowledgeBaseRetrievalManager: Orchestrates the entire retrieval pipeline
- RAGRetrievalService: Handles vector and keyword search operations
- RAGDataPostProcessor: Applies reranking and filtering to results
- EmbeddingProviderManager: Manages embedding model integrations
- MilvusKnowledgeBaseClient: Interfaces with the Milvus vector database
- RAGKeywordStore: Provides full-text search capabilities
Next steps
Setup guide
Learn how to create and configure your first knowledge base
Embedding providers
Configure embedding model integrations
Retrieval strategies
Optimize retrieval configuration for your use case