Welcome to Sentinel AI
Sentinel AI is an autonomous DevOps agent designed to monitor, diagnose, and repair problems on Linux servers in real time. It uses large language models (LLMs) backed by a Retrieval-Augmented Generation (RAG) system to make informed decisions based on official technical documentation.
Sentinel AI operates with a human-in-the-loop approach for critical actions, ensuring safety while maintaining automation efficiency.
What is Sentinel AI?
Sentinel AI is an intelligent DevOps automation platform that combines:
- Autonomous monitoring of critical services (Nginx, PostgreSQL, Docker, SSH)
- AI-powered diagnosis using RAG to retrieve relevant technical documentation
- Automated remediation with safety controls and approval workflows
- Real-time dashboard for monitoring agent activity and system status
Key Features
Autonomous Agent
LangGraph-powered decision engine that monitors services, diagnoses issues, plans fixes, and executes remediation automatically.
RAG Knowledge Base
Consults real technical manuals (PostgreSQL, Nginx, Docker, Linux) indexed in Pinecone, with Cohere Rerank to improve the relevance of retrieved passages.
Real-time Dashboard
Modern Next.js interface with live terminal output, service status monitoring, and interactive chat.
Safety Controls
Command whitelist, human approval for destructive actions, and SSH key-based authentication.
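To illustrate the retrieve-then-rerank pattern behind the RAG Knowledge Base feature, here is a minimal sketch using only the Python standard library. It is not Sentinel AI's implementation: the in-memory DOCS dictionary stands in for the Pinecone index, the bag-of-words cosine score stands in for embedding search, and the term-coverage rerank function stands in for Cohere Rerank. All names and sample texts are hypothetical.

```python
import math
from collections import Counter
from typing import List

# Hypothetical documentation snippets; a real deployment would query a vector index.
DOCS = {
    "nginx-bind": "nginx bind failed address already in use port 80",
    "postgres-conn": "postgresql connection refused check max_connections and listen_addresses",
    "docker-daemon": "docker daemon not running start the docker service",
}


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, top_k: int = 2) -> List[str]:
    """Stage 1: cheap similarity search over every document."""
    q = vectorize(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, vectorize(DOCS[d])), reverse=True)
    return ranked[:top_k]


def rerank(query: str, doc_ids: List[str]) -> List[str]:
    """Stage 2: re-score only the candidates by query-term coverage."""
    q_terms = set(query.lower().split())

    def coverage(doc_id: str) -> float:
        doc_terms = set(DOCS[doc_id].lower().split())
        return len(q_terms & doc_terms) / max(len(q_terms), 1)

    return sorted(doc_ids, key=coverage, reverse=True)


query = "nginx address already in use"
candidates = retrieve(query)
best = rerank(query, candidates)[0]
print(best)
```

The two-stage split is the point of the sketch: the first stage narrows a large corpus cheaply, and the second stage spends a more careful scoring pass on a handful of candidates.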
How It Works
Sentinel AI operates through an intelligent decision graph with six core stages:
Diagnose
When an issue is detected, analyzes logs and errors using RAG to find the root cause from technical documentation.
Plan
Generates a step-by-step remediation plan based on retrieved knowledge and past successful resolutions.
Approve
Requests human approval for critical commands (e.g., rm, sudo) while auto-approving safe operations.
Architecture Overview
Backend (Python)
The backend is built with modern Python technologies:
- FastAPI: REST API and WebSocket endpoints for real-time communication
- LangGraph: Orchestrates the autonomous agent workflow as a state machine
- LlamaIndex: Manages RAG pipeline for document ingestion and retrieval
- Paramiko: Secure SSH client for remote server access
- Pinecone: Vector database storing embedded technical documentation
- Cohere: Re-ranks retrieved documents to ensure relevance
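The idea of running the agent workflow as a state machine can be sketched in plain Python. The example below deliberately does not use LangGraph; it only mirrors the Diagnose, Plan, and Approve stages described under How It Works. The handler bodies, the GRAPH table, and the state keys are hypothetical placeholders, not the project's actual code.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]


def diagnose(state: State) -> State:
    # Placeholder: a real agent would analyze logs and consult documentation here.
    return {**state, "root_cause": f"{state['service']} is not running"}


def plan(state: State) -> State:
    # Placeholder: produce a remediation plan from the diagnosis.
    return {**state, "commands": [f"systemctl restart {state['service']}"]}


def approve(state: State) -> State:
    # Flag the plan for human review if any command starts with a critical program.
    critical = ("rm", "sudo")
    needs_human = any(cmd.split()[0] in critical for cmd in state["commands"])
    return {**state, "needs_human_approval": needs_human}


# Node name -> (handler, next node). A next node of None ends the run.
GRAPH: Dict[str, Tuple[Callable[[State], State], Optional[str]]] = {
    "diagnose": (diagnose, "plan"),
    "plan": (plan, "approve"),
    "approve": (approve, None),
}


def run(state: State, start: str = "diagnose") -> State:
    """Walk the graph from the start node, threading state through each handler."""
    node: Optional[str] = start
    visited: List[str] = []
    while node is not None:
        handler, next_node = GRAPH[node]
        state = handler(state)
        visited.append(node)
        node = next_node
    return {**state, "visited": visited}


result = run({"service": "nginx"})
print(result["visited"], result["needs_human_approval"])
```

Each handler returns a new state dictionary rather than mutating its input, which keeps every transition easy to log and inspect.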
Frontend (TypeScript)
The dashboard provides a modern user experience:
- Next.js 16: React framework with App Router for optimal performance
- Tailwind CSS: Utility-first styling for responsive design
- Shadcn UI: Accessible, customizable component library
- WebSocket: Real-time agent logs and event streaming
- Lucide React: Consistent iconography
Use Cases
Service Recovery
Automatically restart failed services like Nginx or PostgreSQL with proper diagnosis before action.
Log Analysis
Query the agent to analyze error logs and get AI-powered insights from technical manuals.
Configuration Fixes
Detect and repair misconfigurations based on best practices from official documentation.
Incident Response
Reduce MTTR (Mean Time To Recovery) by automating the initial diagnosis and remediation steps.
Safety and Security
Sentinel AI is designed with production safety in mind:
- Command Whitelist: Only allows pre-approved safe commands by default
- Human-in-the-Loop: Critical actions trigger approval workflow via dashboard
- SSH Key Authentication: Secure, password-optional remote access
- Audit Logging: Complete trail of all agent decisions and actions
- Retry Limits: Prevents infinite loops with configurable max retries (default: 5)
- Escalation: Automatically escalates to humans when automation fails
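As a rough sketch of how the whitelist, approval, retry, and escalation controls listed above could fit together, consider the following standard-library example. The ALLOWED_COMMANDS set, the three outcome labels, and the function names are assumptions made for illustration; only the rm and sudo examples and the default of 5 retries come from this page.

```python
import shlex
from typing import Callable

# Hypothetical whitelist; the real list of pre-approved commands is not documented here.
ALLOWED_COMMANDS = {"systemctl", "journalctl", "docker", "nginx", "cat", "tail"}
# Examples of critical commands named in this documentation.
CRITICAL_COMMANDS = {"rm", "sudo"}
# Default retry limit stated in the Retry Limits item.
MAX_RETRIES = 5


def classify(command: str) -> str:
    """Return 'needs_approval', 'auto_approve', or 'rejected' for a command line."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar: refuse rather than guess.
        return "rejected"
    if not tokens:
        return "rejected"
    program = tokens[0]
    if program in CRITICAL_COMMANDS:
        return "needs_approval"
    if program in ALLOWED_COMMANDS:
        return "auto_approve"
    return "rejected"


def run_with_retries(attempt: Callable[[], bool], max_retries: int = MAX_RETRIES) -> str:
    """Call attempt() until it succeeds or the retry budget is spent, then escalate."""
    for _ in range(max_retries):
        if attempt():
            return "resolved"
    return "escalated"


print(classify("systemctl restart nginx"))
print(classify("sudo systemctl restart nginx"))
print(run_with_retries(lambda: False))
```

This sketch inspects only the first token of the command, so a production policy would also need to account for shell operators such as pipes and command chaining.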
What’s Next?
Quickstart
Get Sentinel AI running in minutes with our quickstart guide.
Installation
Detailed installation instructions for backend and frontend setup.