Prerequisites
Before creating a knowledge base, ensure you have:
- A configured embedding provider integration (see Embedding providers)
- Access to the Iqra AI platform or self-hosted instance
- Documents to upload (PDF, TXT, or other supported formats)
Self-hosted deployments require Milvus, MongoDB, and Redis to be properly configured. See the deployment guide for infrastructure setup.
Creating a knowledge base
Navigate to knowledge bases
From your business dashboard, access the knowledge base management section.
Create new knowledge base
Click Create Knowledge Base and provide:
- Name: A descriptive name for the knowledge base
- Description: Optional description of the content and purpose
Configure chunking strategy
Choose how documents will be split into chunks:
General chunking
Best for uniformly structured content like articles or documentation.
Parent-child chunking
Better for complex documents where context is crucial.
Parent-child chunking retrieves child chunks but provides parent context to the agent, improving answer quality while maintaining retrieval precision.
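The parent-child idea can be sketched in a few lines. This is only an illustration under assumed names (`ChildChunk`, `build_parent_child`, `context_for`) and a naive fixed-size splitter; it is not the platform's actual chunker.

```python
from dataclasses import dataclass


@dataclass
class ChildChunk:
    text: str
    parent_index: int  # index of the parent chunk this child was cut from


def split_fixed(text: str, size: int) -> list[str]:
    """Cut text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_parent_child(text: str, parent_size: int, child_size: int):
    """Return (parents, children); every child remembers its parent."""
    parents = split_fixed(text, parent_size)
    children = [
        ChildChunk(piece, idx)
        for idx, parent in enumerate(parents)
        for piece in split_fixed(parent, child_size)
    ]
    return parents, children


def context_for(hit: ChildChunk, parents: list[str]) -> str:
    """Context handed to the agent: the whole parent of the matched child."""
    return parents[hit.parent_index]
```

Small child chunks are what get matched against the query, while the larger parent supplies the surrounding context. That split is what keeps retrieval precise without starving the agent of context.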
Configure embedding
Select your embedding integration and model:
- Integration: Choose from configured embedding providers
- Model: Select the embedding model (e.g., text-embedding-004)
- Vector dimension: Set based on model specifications
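A mismatch between the configured vector dimension and what the model actually returns is easy to catch before indexing. The check below is a hypothetical helper, not part of the platform, and the value 768 in the usage line is only an example; confirm the real dimension in your provider's model documentation.

```python
def check_dimension(vector: list[float], expected_dim: int) -> None:
    """Raise ValueError if an embedding does not match the configured dimension."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but the knowledge base is configured for {expected_dim}"
        )


# Example: a test embedding checked against an assumed dimension of 768.
check_dimension([0.0] * 768, expected_dim=768)
```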
Uploading documents
Once your knowledge base is created:
Add documents
Click Upload Documents and select files from your computer. Supported formats include:
- PDF (via PDF extractor or Unstructured API)
- Plain text (.txt)
- Other formats supported by Unstructured API
Processing
Documents are processed asynchronously:
- Text extraction
- Cleaning and preprocessing
- Chunking based on your strategy
- Embedding generation
- Storage in Milvus and MongoDB
- Keyword index creation
The system generates embeddings in batches to optimize API usage. Embedding costs are determined by your provider’s pricing model.
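Batching can be pictured as follows. This is a minimal sketch: `embed_batch` stands in for whatever call your provider exposes, and the default batch size of 32 is an assumption, not a documented platform value.

```python
from typing import Callable, Iterator, Sequence

Vector = list[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    chunks: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list[Vector]],  # hypothetical provider call
    batch_size: int = 32,
) -> list[Vector]:
    """Embed all chunks using one provider request per batch."""
    vectors: list[Vector] = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

With 70 chunks and a batch size of 32, this makes three provider requests instead of 70.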
Linking to agents
To enable an agent to use the knowledge base:
Add knowledge base link
In the Knowledge Base section:
- Click Link Knowledge Base
- Select the knowledge base from the dropdown
- Configure search strategy (if different from defaults)
Configure search strategy
Optionally override retrieval settings for this agent:
- TopK: Number of chunks to retrieve per query
- Search refinement: Additional filtering or boosting
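One way to think about a per-agent override is as a layer on top of the knowledge base defaults. The sketch below uses hypothetical names (`SearchSettings`, `resolve_settings`) and covers only TopK; how the platform stores these settings is not described here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SearchSettings:
    top_k: int  # number of chunks to retrieve per query


def resolve_settings(
    defaults: SearchSettings, agent_top_k: Optional[int] = None
) -> SearchSettings:
    """Use the agent's TopK when one is set, otherwise keep the defaults."""
    if agent_top_k is None:
        return defaults
    if agent_top_k < 1:
        raise ValueError("top_k must be at least 1")
    return replace(defaults, top_k=agent_top_k)
```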
Managing documents
Updating documents
When updating an existing document:
- Upload the new version with the same filename
- The system will:
  - Delete old chunks from Milvus and keyword store
  - Reprocess the new content
  - Generate fresh embeddings
  - Update vector and keyword indices
Deleting documents
To remove a document:
- Select the document in the knowledge base
- Click Delete
- Confirm deletion
- The system will:
  - Remove all chunks from Milvus
  - Delete keyword index entries
  - Clean up metadata in MongoDB
Disabling documents
Temporarily disable documents without deleting:
- Toggle the document’s Enabled status
- Chunks from disabled documents are excluded from retrieval
- Re-enable anytime to restore access
Performance optimization
Chunk size tuning
Optimal chunk size depends on your use case:
- Small chunks (256-512 chars): Better precision, may lack context
- Medium chunks (512-1024 chars): Balanced approach, recommended default
- Large chunks (1024-2048 chars): More context, may dilute relevance
Overlap configuration
Chunk overlap prevents information loss at boundaries:
- No overlap (0): Faster processing, risk of split concepts
- Low overlap (20-50 chars): Minimal redundancy, good for most content
- High overlap (100-200 chars): Maximum continuity, increased storage
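To see how chunk size and overlap interact, here is a minimal character-based splitter. It is a sketch for experimentation under the assumption of plain fixed-size windows; the platform's chunker may split on sentence or paragraph boundaries instead.

```python
def chunk_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunk_size-character windows that share `overlap` characters."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(chunk_with_overlap("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last `overlap` characters of the previous one, so a concept that straddles a boundary still appears whole in at least one chunk. The cost is more chunks to embed and store.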
Collection management
Milvus collections are automatically managed:
- Loaded into memory when agents query them
- Released after configurable idle period (default: 1 hour)
- Reloaded on-demand with minimal latency
Frequently accessed knowledge bases remain in memory, while rarely used ones are unloaded to conserve resources.
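The load-and-release behavior resembles an idle-timeout cache. The class below is a simplified model with hypothetical names (`CollectionTracker`, `touch`, `release_idle`); it only tracks timestamps and does not talk to Milvus.

```python
import time
from typing import Callable


class CollectionTracker:
    """Tracks which collections count as loaded and when each was last queried."""

    def __init__(self, idle_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.idle_seconds = idle_seconds  # 1 hour, matching the documented default
        self._clock = clock
        self._last_used: dict[str, float] = {}

    def touch(self, name: str) -> bool:
        """Record a query. Returns True if the collection had to be loaded."""
        needs_load = name not in self._last_used
        self._last_used[name] = self._clock()
        return needs_load

    def release_idle(self) -> list[str]:
        """Drop collections idle for at least idle_seconds; return their names."""
        now = self._clock()
        idle = [name for name, used in self._last_used.items()
                if now - used >= self.idle_seconds]
        for name in idle:
            del self._last_used[name]
        return idle
```

Every query refreshes the timestamp, which is why frequently used knowledge bases never reach the idle threshold.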
Monitoring and maintenance
Health checks
Regularly verify:
- All documents show Indexed status
- No failed embedding generation jobs
- Milvus collections are accessible
- Embedding cache hit rate is reasonable
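Two of these checks are simple to script if you can export document statuses and cache counters. The helpers below are hypothetical and assume you already have that data in hand; what counts as a "reasonable" hit rate depends on your workload and is not defined here.

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of embedding lookups served from cache (0.0 when there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


def not_indexed(statuses: dict[str, str]) -> list[str]:
    """Names of documents whose status is anything other than Indexed."""
    return sorted(name for name, status in statuses.items() if status != "Indexed")
```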
Updating embeddings
If you change embedding providers:
- Update the knowledge base configuration
- Trigger full re-indexing of all documents
- Monitor progress in the processing queue
- Verify retrieval quality with test queries
Next steps
Embedding providers
Configure and optimize embedding integrations
Retrieval strategies
Learn about retrieval configuration and optimization