What is a knowledge base?
A knowledge base in Iqra AI is a structured collection of documents that enables your agents to access and retrieve relevant information during conversations. Using Retrieval-Augmented Generation (RAG), agents can ground their responses in your organization’s specific knowledge, reducing hallucinations and providing accurate, contextual answers.
Knowledge bases are particularly powerful for customer support, technical documentation, and domain-specific assistance where accurate information retrieval is critical.
How RAG works in Iqra AI
The RAG pipeline in Iqra AI follows a multi-stage process:
Document ingestion
Documents are uploaded and processed through extractors that support multiple formats (PDF, text, and more). The system uses the Unstructured API for complex document parsing.
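The ingestion step can be pictured as a dispatcher that routes each file to a format-specific extractor. The sketch below is illustrative only (the function name and format handling are assumptions, not the actual Iqra AI API); a real deployment would hand complex formats such as PDF to a dedicated parser or the Unstructured API.

```python
def extract_text(filename: str, data: bytes) -> str:
    """Route a document to a format-specific extractor based on its extension."""
    suffix = filename.rsplit(".", 1)[-1].lower()
    if suffix in ("txt", "md"):
        # Plain-text formats can be decoded directly.
        return data.decode("utf-8")
    if suffix == "pdf":
        # A real pipeline would pass the bytes to a PDF parser or the Unstructured API.
        raise NotImplementedError("PDF extraction requires an external parser")
    raise ValueError(f"unsupported format: .{suffix}")
```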
Text chunking
Documents are split into manageable chunks using configurable strategies:
- General chunking: Splits text by delimiter with configurable overlap
- Parent-child chunking: Creates hierarchical chunks for better context preservation
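To make the overlap idea concrete, here is a minimal sketch of general chunking as a sliding window (a simplification of the delimiter-based splitter described above; the function name and defaults are assumptions for illustration):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap` characters
    with the previous chunk so context is not cut mid-thought."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Each chunk's first `overlap` characters repeat the tail of the previous chunk, which keeps sentences that straddle a boundary retrievable from either side.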
Embedding generation
Each chunk is converted into a vector embedding using your configured embedding provider (currently supports Google Gemini). Embeddings are cached to improve performance and reduce API costs.
Vector storage
Embeddings are stored in Milvus, a high-performance vector database designed for similarity search at scale. Collections are dynamically loaded and unloaded based on usage.
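Similarity search over stored embeddings can be illustrated with a toy in-memory store; this is a stand-in for Milvus, not its actual client API, and the class and method names below are assumptions:

```python
import math

class InMemoryVectorStore:
    """Toy vector store: keeps (id, vector) pairs and returns the
    top-k entries by cosine similarity to a query vector."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def insert(self, doc_id: str, vector: list[float]) -> None:
        self.items.append((doc_id, vector))

    @staticmethod
    def _cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        scored = [(doc_id, self._cosine(query, v)) for doc_id, v in self.items]
        scored.sort(key=lambda s: s[1], reverse=True)
        return scored[:top_k]
```

A production system like Milvus replaces the linear scan with approximate nearest-neighbor indexes so the same query stays fast over millions of vectors.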
Retrieval
When an agent receives a query, the system:
- Generates an embedding for the query
- Searches the vector database for similar chunks
- Optionally searches keyword indices for exact matches
- Combines results using hybrid search strategies
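The final combination step can be sketched as a weighted merge of the two result lists. This is one common hybrid strategy (a weighted score sum), not necessarily the exact formula Iqra AI uses; the `alpha` parameter and function name are assumptions:

```python
def hybrid_merge(
    vector_hits: list[tuple[str, float]],
    keyword_hits: list[tuple[str, float]],
    alpha: float = 0.7,
) -> list[tuple[str, float]]:
    """Merge vector and keyword results into one ranking.
    `alpha` weights the semantic side; (1 - alpha) weights keyword matches."""
    scores: dict[str, float] = {}
    for doc_id, score in vector_hits:
        scores[doc_id] = alpha * score
    for doc_id, score in keyword_hits:
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - alpha) * score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Documents found by both searches accumulate score from each side, which is why hybrid search tends to surface results that are both semantically and lexically relevant.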
Post-processing
Retrieved chunks undergo reranking and reordering to optimize relevance. The system supports:
- Rerank models for improved precision
- Lost-in-the-middle reordering to combat position bias
- Score thresholding to filter low-quality results
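Two of these steps — thresholding and lost-in-the-middle reordering — are simple enough to sketch directly (the function name and default threshold are assumptions; the reordering follows the common pattern of placing the strongest chunks at both ends of the context, where models attend best):

```python
def postprocess(
    hits: list[tuple[str, float]],
    score_threshold: float = 0.2,
) -> list[tuple[str, float]]:
    """Drop low-scoring hits, then reorder the rest so the best chunks sit at
    the edges of the context window and the weakest land in the middle.
    `hits` is assumed sorted by score, best first."""
    kept = [h for h in hits if h[1] >= score_threshold]
    front: list[tuple[str, float]] = []
    back: list[tuple[str, float]] = []
    for i, hit in enumerate(kept):
        # Alternate: ranks 1, 3, 5, ... go to the front; 2, 4, 6, ... to the back.
        (front if i % 2 == 0 else back).append(hit)
    return front + back[::-1]
```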
Key features
Multiple retrieval strategies
Iqra AI supports three retrieval approaches:
- Vector search: Semantic similarity using embeddings
- Full-text search: Keyword-based matching for exact terms
- Hybrid search: Combines both approaches for maximum recall
Intelligent chunking
Choose between chunking strategies based on your content:
General chunking
Simple recursive text splitting with configurable chunk size and overlap. Ideal for uniformly structured content.
Parent-child chunking
Hierarchical chunking that retrieves small chunks but provides larger parent context. Better for complex documents.
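The parent-child relationship can be sketched as a two-level index: retrieval matches on the small children, but the larger parent is what gets handed to the model. The function name, sizes, and character-based splitting below are simplifying assumptions for illustration:

```python
def build_parent_child_index(
    document: str,
    parent_size: int = 1000,
    child_size: int = 200,
) -> tuple[list[str], list[tuple[str, int]]]:
    """Split a document into large parent chunks, then split each parent
    into small children. Returns the parents and a list of
    (child_text, parent_index) pairs for lookup after retrieval."""
    parents: list[str] = []
    children: list[tuple[str, int]] = []
    for p_start in range(0, len(document), parent_size):
        parent = document[p_start:p_start + parent_size]
        p_idx = len(parents)
        parents.append(parent)
        for c_start in range(0, len(parent), child_size):
            children.append((parent[c_start:c_start + child_size], p_idx))
    return parents, children
```

When a child chunk matches a query, `parents[parent_index]` supplies the surrounding context, so precise matching and rich context are no longer in tension.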
Embedding cache
The system automatically caches embeddings to:
- Reduce API calls to embedding providers
- Improve query latency
- Lower operational costs
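A minimal sketch of such a cache, assuming a content-addressed key (SHA-256 of the text) and an injected embedding function standing in for the provider call — the class and attribute names are illustrative, not the actual Iqra AI internals:

```python
import hashlib

class CachingEmbedder:
    """Wraps an embedding function with a content-addressed in-memory cache,
    so identical text is only ever embedded once."""

    def __init__(self, embed_fn) -> None:
        self.embed_fn = embed_fn  # e.g. a call to the provider's embeddings API
        self.cache: dict[str, list[float]] = {}

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.cache[key] = self.embed_fn(text)
        return self.cache[key]
```

Repeated queries and re-ingested documents hit the cache instead of the provider, which is where the latency and cost savings come from.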
Dynamic collection management
Milvus collections are automatically loaded into memory when needed and released after a configurable expiry period. This ensures optimal memory usage while maintaining fast query performance.
Benefits for agents
When you link a knowledge base to an agent, it gains several capabilities:
- Accurate responses: Answers are grounded in verified information rather than model training data
- Source attribution: Each response can cite specific documents and chunks
- Domain expertise: Agents can handle specialized topics without fine-tuning
- Up-to-date information: Knowledge bases can be updated without retraining models
- Reduced hallucinations: Retrieval constrains the agent to information found in your documents
Architecture overview
The knowledge base system is built on several key components:
- KnowledgeBaseRetrievalManager: Orchestrates the entire retrieval pipeline
- RAGRetrievalService: Handles vector and keyword search operations
- RAGDataPostProcessor: Applies reranking and filtering to results
- EmbeddingProviderManager: Manages embedding model integrations
- MilvusKnowledgeBaseClient: Interfaces with the Milvus vector database
- RAGKeywordStore: Provides full-text search capabilities
Next steps
Setup guide
Learn how to create and configure your first knowledge base
Embedding providers
Configure embedding model integrations
Retrieval strategies
Optimize retrieval configuration for your use case