Overview
The Incident Resolution Agent is an AI-powered assistant designed to help teams quickly resolve incidents by searching through a knowledge base of past incidents. Built with LangGraph and powered by large language models, the agent provides intelligent, context-aware responses to incident-related queries.What is the Agent?
The agent is a conversational AI system that:- Searches Past Incidents: Retrieves relevant historical incident data using semantic search
- Provides Contextual Answers: Generates responses based on verified knowledge and tool results
- Learns from Examples: Uses golden examples to improve response quality
- Maintains Conversation State: Remembers context across multiple turns in a conversation
Key Capabilities
Intelligent Query Understanding
The agent analyzes user queries and automatically determines the best approach:- Recognizes specific incident IDs (e.g.,
INC-2025-08-24-001) - Understands problem descriptions and error messages
- Identifies application/system-specific queries
- Interprets time-based requests (e.g., “incidents from last week”)
Multi-Tool Search
The agent has access to four specialized search tools:lookup_incident_by_id
Fast direct lookup for specific incident IDs
search_similar_incidents
Semantic similarity search across all incident reports
get_incidents_by_application
Find incidents affecting specific applications or systems
get_recent_incidents
Retrieve incidents from a specific timeframe
Query Rewriting
Before calling tools, the agent automatically rewrites conversational queries into optimized search terms:Golden Example Enhancement
The agent searches for similar past conversations (golden examples) and uses them to improve response quality. This ensures consistent, high-quality answers based on verified resolutions.Conversation Memory
Using PostgreSQL-backed checkpointing, the agent maintains conversation state:- Remembers previous messages in the conversation
- Can answer follow-up questions without re-searching
- Maintains session continuity across API calls
Use Cases
Incident Research
Problem Resolution
Specific Incident Lookup
Application Monitoring
Architecture Highlights
The agent is built on LangGraph, a framework for building stateful, multi-actor applications with LLMs. Learn more about the architecture in the Architecture section.
Key Components
- State Management: TypedDict-based state schema with message history
- Node Functions: Modular processing units (LLM calls, tool execution, title generation)
- Conditional Routing: Smart decision-making about when to use tools
- Persistent Checkpointing: PostgreSQL-backed conversation memory
Technology Stack
- LangGraph: State graph orchestration
- LangChain: LLM integration and tool binding
- Qdrant: Vector database for semantic search
- PostgreSQL: Conversation state persistence
- Langfuse: Optional observability and tracing
Configuration
The agent supports multiple LLM providers:- Anthropic (Claude models)
- OpenAI (GPT models)
- Google (Gemini models)
- Custom (Any OpenAI-compatible API)
- Ollama (Local models)
src/copilot/config.py:51:
Response Guidelines
The agent follows strict response guidelines:- Cite Sources: Always references incident IDs when using tool data
- No Fabrication: Never makes up information
- Concise Output: Provides clear, factual responses
- Table Limits: Never generates tables with more than 4 columns
Next Steps
Architecture
Explore the LangGraph state machine and node structure
Workflow
Understand how queries are processed end-to-end