all-MiniLM-L6-v2 model and stored in Qdrant.
Search functionality is implemented in the copilot service layer, while the knowledge base router focuses on data management. At query time, incidents are retrieved from Qdrant via the LangChain integration using vector similarity search.
## How Search Works

### Embedding Model
Incidents are embedded using HuggingFace’s all-MiniLM-L6-v2 model:
- Embedding size: 384 dimensions
- Distance metric: Cosine similarity
- Normalization: Embeddings are normalized for consistent similarity scores
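The effect of normalization can be illustrated with a short NumPy sketch. The random vectors below are stand-ins; real embeddings would come from all-MiniLM-L6-v2 via `sentence-transformers`:

```python
import numpy as np

# Stand-in vectors; in practice these would be the 384-dim embeddings
# produced by all-MiniLM-L6-v2.
rng = np.random.default_rng(42)
a = rng.normal(size=384)
b = rng.normal(size=384)

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

a_n, b_n = normalize(a), normalize(b)

# For unit-length vectors, cosine similarity is just the dot product,
# which keeps similarity scores on a consistent [-1, 1] scale.
cosine = float(a_n @ b_n)
assert abs(float(np.linalg.norm(a_n)) - 1.0) < 1e-9
assert -1.0 <= cosine <= 1.0
```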
### Document Chunking

Long incident descriptions are split into chunks to improve retrieval accuracy:

- Chunk size: 1000 characters
- Chunk overlap: 200 characters
- Metadata preservation: Each chunk retains full incident metadata
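A minimal sketch of this chunking policy (a hand-rolled splitter standing in for whatever text splitter the service actually uses; the function and payload key names are illustrative):

```python
def chunk_incident(text: str, metadata: dict,
                   size: int = 1000, overlap: int = 200) -> list[dict]:
    """Split text into overlapping chunks, copying incident metadata onto each."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append({
            "page_content": text[start:start + size],
            "metadata": dict(metadata),  # each chunk keeps the full metadata
        })
        if start + size >= len(text):
            break
        start += size - overlap  # step forward, overlapping by `overlap` chars
    return chunks

chunks = chunk_incident("x" * 1800, {"incident_id": "INC-1"})
assert len(chunks) == 2
# Adjacent chunks share 200 characters of overlap.
assert chunks[0]["page_content"][-200:] == chunks[1]["page_content"][:200]
```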
### Searchable Fields

The following incident metadata fields are indexed and searchable:

- Unique incident identifier
- Incident title or summary
- Affected application or service
- Root cause analysis
- Mitigation and resolution steps
- Team or person responsible
- Source system (e.g., “ServiceNow”)
- Whether this is a repeat incident
- ISO 8601 timestamp when the incident was opened
- ISO 8601 timestamp when the incident was last updated
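Concretely, a chunk’s metadata payload might look like the following. The five snake_case keys (`impacted_application`, `root_cause`, `mitigation`, `accountable_party`, `repeat_incident`) appear elsewhere in this document; the remaining key names and all values are hypothetical:

```python
from datetime import datetime

example_metadata = {
    "incident_id": "INC0012345",           # unique incident identifier (hypothetical key)
    "title": "Checkout latency spike",     # incident title or summary (hypothetical key)
    "impacted_application": "checkout",    # affected application or service
    "root_cause": "Connection pool exhaustion",
    "mitigation": "Increased pool size and restarted the service",
    "accountable_party": "payments-sre",   # team or person responsible
    "source": "ServiceNow",                # source system (hypothetical key)
    "repeat_incident": False,              # whether this is a repeat incident
    "opened_at": "2024-05-01T08:30:00+00:00",   # ISO 8601, when opened (hypothetical key)
    "updated_at": "2024-05-01T11:45:00+00:00",  # ISO 8601, last updated (hypothetical key)
}

# ISO 8601 timestamps parse cleanly with the standard library.
opened = datetime.fromisoformat(example_metadata["opened_at"])
assert opened.year == 2024
```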
## Integration with Qdrant

The knowledge base uses Qdrant as the vector database.

### Collection Configuration

- Collection name: `past_issues_v2` (configurable per version)
- Vector size: 384 (matches embedding model)
- Distance: Cosine
- Payload structure: LangChain Document format
### LangChain Document Format
Each vector point in Qdrant stores the chunk text under a `page_content` key and the incident fields under a `metadata` key, following the LangChain Document format.

### Example: Using Qdrant Client for Search
While search is typically handled by the copilot service, you can query the vector database directly.

### Example: Filtering by Metadata
## Retrieval Performance

### Batch Ingestion

Incidents are ingested in batches for optimal performance:

- Default batch size: 5 incidents per batch
- Concurrency: Batches run sequentially, with progress tracked per batch
- Error handling: Individual batch failures don’t stop the entire ingestion
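The batching behaviour above can be sketched as a small driver loop (function names and the return shape are illustrative, not the service’s actual API):

```python
def ingest_in_batches(incidents, ingest_batch, batch_size=5):
    """Sequentially ingest incidents in batches, tolerating per-batch failures."""
    succeeded, failed_batches = 0, []
    total = (len(incidents) + batch_size - 1) // batch_size
    for i in range(0, len(incidents), batch_size):
        batch = incidents[i:i + batch_size]
        try:
            ingest_batch(batch)
            succeeded += len(batch)
        except Exception:
            # One bad batch is recorded but does not abort the run.
            failed_batches.append(i // batch_size)
        print(f"progress: batch {i // batch_size + 1}/{total}")
    return succeeded, failed_batches

incidents = list(range(12))

def fake_ingest(batch):
    if 6 in batch:  # simulate a failure in the second batch
        raise RuntimeError("ingest failed")

ok, failed = ingest_in_batches(incidents, fake_ingest)
assert ok == 7 and failed == [1]  # batches of 5, 5, 2; middle batch failed
```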
### Search Performance
- Typical latency: Less than 100ms for top-k retrieval (k=5)
- Scaling: Qdrant handles millions of vectors efficiently
- Caching: Embeddings are cached at the model level
### Metadata Parsing

For ServiceNow incidents, the system automatically extracts structured metadata from description fields, renaming the source fields to snake_case metadata keys:

- impactedApplication → impacted_application
- rootCause → root_cause
- mitigation → mitigation
- accountableParty → accountable_party
- repeatIncident → repeat_incident
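A sketch of that renaming step, assuming the extracted fields arrive as a flat dict of camelCase keys (the extraction from the raw description text itself is not shown, and the function name is illustrative):

```python
FIELD_MAP = {
    "impactedApplication": "impacted_application",
    "rootCause": "root_cause",
    "mitigation": "mitigation",  # already snake_case, kept as-is
    "accountableParty": "accountable_party",
    "repeatIncident": "repeat_incident",
}

def normalize_servicenow_fields(raw: dict) -> dict:
    """Rename extracted ServiceNow fields to the metadata keys stored in Qdrant."""
    return {FIELD_MAP.get(key, key): value for key, value in raw.items()}

meta = normalize_servicenow_fields({
    "impactedApplication": "checkout",
    "repeatIncident": True,
})
assert meta == {"impacted_application": "checkout", "repeat_incident": True}
```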