Overview
The RAG System (Retrieval-Augmented Generation) provides natural language intelligence over ML Defender’s security events. Using TinyLlama for language understanding and FAISS for vector search, it enables forensic queries like “Show me all ransomware detections from 10.0.0.50 last week” without SQL or log parsing.
Components
- RAG Ingester: Log parsing + vector embeddings
- RAG Server: TinyLlama + FAISS query engine
- 4 FAISS Indices: Temporal, semantic, benign, malicious
- etcd Integration: Service discovery
Capabilities
- Natural Language Queries: Ask questions in English
- Temporal Analysis: “Last week”, “Yesterday morning”
- Pattern Recognition: “Similar to this attack”
- ML Retraining Data: Export feature vectors
Architecture
The RAG System consists of two cooperating services: the RAG Ingester and the RAG Server.
RAG Ingester
Multi-Index Strategy
The Ingester maintains 4 specialized FAISS indices for different query patterns:
- Chronos Index (Temporal)
- SBERT Index (Semantic)
- Entity Benign Index
- Entity Malicious Index
Chronos Index (Temporal)
Dimensions: 128
Purpose: Time-series queries
Optimized For:
- “Show me attacks from last week”
- “What happened on Monday between 2-4 PM?”
- “Hourly attack trends”
Eventual Consistency
The Ingester uses best-effort commits for high availability.
Design Philosophy: Availability over consistency. It is better to have 3 of 4 indices working than to block and have none.
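The best-effort commit described above can be sketched as follows. This is a minimal illustration, not the Ingester's actual code: the index names, the `best_effort_commit` helper, and the in-memory lists standing in for FAISS indices are all hypothetical, and a real ingester would write a different embedding to each index.

```python
from typing import Callable, Dict, List


def best_effort_commit(
    writers: Dict[str, Callable[[List[float]], None]],
    vector: List[float],
) -> Dict[str, bool]:
    """Attempt a write to every index; one failure never blocks the others."""
    results = {}
    for name, write in writers.items():
        try:
            write(vector)
            results[name] = True
        except Exception:
            # A real service would log the error and schedule a retry;
            # this sketch only records that the write did not land.
            results[name] = False
    return results


# Hypothetical in-memory stand-ins for the four indices.
stores = {
    name: []
    for name in ("chronos", "sbert", "entity_benign", "entity_malicious")
}


def failing_write(vector: List[float]) -> None:
    raise IOError("simulated index failure")


writers = {name: stores[name].append for name in stores}
writers["sbert"] = failing_write  # simulate one unavailable index

status = best_effort_commit(writers, [0.1, 0.2, 0.3])
print(status)
```

In this run three of the four writes succeed and the caller still gets a per-index status, which is what allows the lagging index to catch up later.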
Configuration
Threading Modes
- Single-threaded (Raspberry Pi)
- Multi-threaded (Server)
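The difference between the two threading modes can be illustrated with a small sketch. The configuration keys are not shown here, so the `workers` parameter and the `parse_line` handler below are hypothetical placeholders rather than the Ingester's real settings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def process_events(
    events: Iterable[str],
    handler: Callable[[str], str],
    workers: int = 1,
) -> List[str]:
    """Run handler over events; workers=1 stays on the calling thread."""
    if workers <= 1:
        # Single-threaded mode: no pool overhead, suited to small devices.
        return [handler(e) for e in events]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order regardless of completion order.
        return list(pool.map(handler, events))


def parse_line(line: str) -> str:
    # Hypothetical stand-in for log parsing plus embedding.
    return line.strip().upper()


lines = ["alert a\n", "alert b\n", "alert c\n"]
single = process_events(lines, parse_line, workers=1)
multi = process_events(lines, parse_line, workers=4)
```

Both modes return the same results in the same order; only the scheduling differs.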
RAG Server (TinyLlama)
Natural Language Query Processing
The RAG Server uses TinyLlama (1.1B parameters) for query understanding.
Query Understanding
User Query: “Show me all ransomware detections from 10.0.0.50 last week”
TinyLlama Extracts:
Vector Search
FAISS Queries (parallel):
- Entity Malicious Index: Find all events from 10.0.0.50
- Chronos Index: Filter by time range (last week)
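The two parallel lookups above can be sketched as an intersection of result sets. This is only an illustration under stated assumptions: plain Python filters stand in for the FAISS indices, the event records and helper names are hypothetical, and “last week” is interpreted as the trailing 7 days.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta

# Hypothetical event records; real events would live in the FAISS-backed indices.
NOW = datetime(2025, 1, 15, 12, 0)
EVENTS = [
    {"id": 1, "src_ip": "10.0.0.50", "ts": NOW - timedelta(days=2)},
    {"id": 2, "src_ip": "10.0.0.50", "ts": NOW - timedelta(days=20)},
    {"id": 3, "src_ip": "10.0.0.99", "ts": NOW - timedelta(days=1)},
]


def search_entity(ip):
    """Stand-in for the Entity Malicious Index lookup."""
    return {e["id"] for e in EVENTS if e["src_ip"] == ip}


def search_time(start, end):
    """Stand-in for the Chronos Index time-range filter."""
    return {e["id"] for e in EVENTS if start <= e["ts"] <= end}


def query(ip, start, end):
    # Run both lookups concurrently, then keep events present in both results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        by_entity = pool.submit(search_entity, ip)
        by_time = pool.submit(search_time, start, end)
        return by_entity.result() & by_time.result()


# "Last week" is treated here as the trailing 7 days (an assumption).
matches = query("10.0.0.50", NOW - timedelta(days=7), NOW)
print(sorted(matches))  # [1]
```

Only event 1 matches both the source IP and the time window, so it is the single result.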
Example Queries
ML Retraining Data Export
The RAG System can export feature vectors for ML model retraining:
- Model drift detection: Compare new data distribution vs training data
- Incremental training: Retrain RandomForest on recent attacks
- False positive analysis: Identify mislabeled events
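As an illustration of the drift-detection use case, the sketch below compares exported feature vectors against training data. It is one simple approach (mean shift measured in training standard deviations), not necessarily the method ML Defender uses; the sample data, the `drift_scores` helper, and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev


def drift_scores(train_rows, recent_rows):
    """Per-feature shift of the recent mean, in units of training standard deviation."""
    scores = []
    # zip(*rows) transposes a list of rows into per-feature columns.
    for train_col, recent_col in zip(zip(*train_rows), zip(*recent_rows)):
        sigma = stdev(train_col) or 1.0  # avoid division by zero for constant features
        scores.append(abs(mean(recent_col) - mean(train_col)) / sigma)
    return scores


# Hypothetical exported feature vectors (two features per event).
train = [[1.0, 10.0], [2.0, 10.5], [3.0, 9.5]]
recent = [[2.0, 20.0], [2.1, 21.0], [1.9, 19.0]]

scores = drift_scores(train, recent)
# The threshold is an arbitrary example value, not a recommended setting.
drifted = [i for i, s in enumerate(scores) if s > 3.0]
print(drifted)
```

Here the first feature is stable while the second has shifted far from its training distribution, which would be a signal to consider retraining.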
Deployment
Prerequisites
Build RAG Ingester
Run RAG Ingester
Run RAG Server
Troubleshooting
FAISS Index Not Found
ONNX Model Loading Fails
TinyLlama Out of Memory
Symptom: Out-of-memory (OOM) error during query processing
Solution: Use 4-bit quantization
Memory: 2 GB → 600 MB
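The memory figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch assumes the 2 GB baseline corresponds to 16-bit weights (the precision is not stated here) and counts weight storage only, ignoring activations and the KV cache.

```python
PARAMS = 1.1e9  # TinyLlama parameter count stated earlier in this page


def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


fp16_gb = weight_memory_gb(PARAMS, 16)
int4_gb = weight_memory_gb(PARAMS, 4)
print(f"16-bit: {fp16_gb:.2f} GB, 4-bit: {int4_gb:.2f} GB")
```

The raw 4-bit estimate of about 0.55 GB is slightly below the quoted 600 MB; quantization scales and runtime buffers plausibly account for the difference.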
File Watcher Not Detecting Files
Roadmap
Priority 1.1: Firewall Log Parsing
Goal: Ingest firewall-agent logs for ground truth linking
Priority 1.2: Temporal Queries
Goal: Natural language time expressions
Priority 1.3: Aggregation & Statistics
Goal: Summary queries
Next Steps
- Sniffer: Configure network packet capture
- ML Detector: Set up ML inference pipeline
- Firewall Agent: Deploy autonomous blocking
- Model Training: Retrain models with RAG-exported data