Monitoring Guide

This guide covers the daily monitoring operations for the Enterprise SOC, including dashboard configuration, key metrics, alert management, and event correlation workflows.

Dashboard Setup

Wazuh Dashboards

Central security event visualization and correlation platform for unified threat monitoring

Prometheus Metrics

Real-time metrics and alerting for infrastructure performance and availability

Elasticsearch Analytics

Deep log analysis and search capabilities for forensic investigation

Zabbix Infrastructure

Infrastructure health monitoring and availability tracking

Wazuh Security Dashboard

The Wazuh platform serves as the central hub for security event visualization:

Access the Wazuh Dashboard

Navigate to the Wazuh web interface and authenticate with your SOC credentials

Configure Security Overview

Enable key panels:

Security events summary
Top triggered rules
Alert evolution over time
Agent status overview

Set Up Custom Views

Create role-based dashboards for:

Tier 1 Analysts (high-priority alerts)
Tier 2 Analysts (investigation workflows)
SOC Manager (metrics and KPIs)

Enable Real-Time Monitoring

Configure auto-refresh intervals (recommended: 30-60 seconds for active monitoring)

Infrastructure Monitoring

Combine Zabbix and Prometheus for comprehensive infrastructure visibility. Zabbix excels at availability monitoring while Prometheus provides detailed metrics and alerting.

Key Zabbix Dashboards:

Network device availability
Server health (CPU, memory, disk)
Service status monitoring
Database performance

Key Prometheus Dashboards:

Container metrics (if using containerized deployments)
Application performance metrics
Custom security metrics
Resource utilization trends

Key Metrics to Monitor

Security Metrics

Critical Security Events

Failed Authentication Attempts: Monitor for brute force attacks
Privilege Escalation: Track sudo usage and administrative actions
File Integrity Violations: Critical system file modifications
Malware Detection: EDR alerts from Wazuh agents
Network Intrusions: IDS/IPS alerts from Snort and Suricata

Network Security

IDS/IPS Alert Volume: Track Snort and Suricata detection rates
Blocked Connections: Firewall deny logs
Unusual Traffic Patterns: Port scans, DDoS indicators
External Communications: Unexpected outbound connections
DNS Anomalies: DNS tunneling, DGA detection

Endpoint Security

Agent Health: Wazuh agent connectivity status
EDR Detections: Endpoint threats and suspicious behavior
Vulnerability Status: Unpatched systems count
Configuration Compliance: Policy violations
Process Anomalies: Unusual process execution

Performance Metrics

Log Ingestion Rate: Events per second in Logstash/Fluentd
Elasticsearch Cluster Health: Index status and performance
Query Response Time: Dashboard load times
Storage Utilization: Log retention capacity
Processing Lag: Pipeline delays

Alert Configuration and Tuning

Alert Levels

The SOC uses a tiered alert severity system:

Severity	Level	Response Time	Examples
Critical	12-15	Immediate	Active exploitation, data exfiltration
High	9-11	< 15 minutes	Malware detection, privilege escalation
Medium	6-8	< 1 hour	Policy violations, suspicious activity
Low	3-5	< 4 hours	Information events, minor anomalies
Informational	0-2	Daily review	Audit logs, routine events
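
The severity bands above can be applied mechanically when triaging. The following is a minimal sketch, assuming alerts carry an integer Wazuh rule level from 0 to 15; the names `Severity` and `classify_level` are hypothetical helpers for illustration, not part of Wazuh.

```python
from typing import NamedTuple


class Severity(NamedTuple):
    name: str
    response_time: str


# (lowest level, highest level, severity), copied from the table above.
_SEVERITY_BANDS = [
    (12, 15, Severity("Critical", "Immediate")),
    (9, 11, Severity("High", "< 15 minutes")),
    (6, 8, Severity("Medium", "< 1 hour")),
    (3, 5, Severity("Low", "< 4 hours")),
    (0, 2, Severity("Informational", "Daily review")),
]


def classify_level(level: int) -> Severity:
    """Map a rule level to its severity name and target response time."""
    for low, high, severity in _SEVERITY_BANDS:
        if low <= level <= high:
            return severity
    raise ValueError(f"expected a rule level from 0 to 15, got {level}")


print(classify_level(10))
```

Keeping the bands in one table-like structure means a change to the severity policy is a one-line edit rather than a change to branching logic.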

Wazuh Alert Configuration

Review Default Rules

Examine Wazuh default ruleset and identify relevant rules for your environment

Create Custom Rules

Develop organization-specific rules in /var/ossec/etc/rules/local_rules.xml

Set Severity Thresholds

Configure alert levels based on business impact and threat severity

Configure Alert Destinations

Set up integrations:

TheHive for incident creation
Email notifications for critical alerts
Slack/Teams for team notifications

Enable Alert Grouping

Configure correlation to reduce alert fatigue and group related events

Avoid alert fatigue by tuning out false positives aggressively. A high-noise environment leads to missed critical alerts.

IDS/IPS Alert Tuning

Snort and Suricata Configuration:

Start with conservative rulesets and gradually enable more aggressive detection rules as you tune false positives.

Enable Community Rules: Start with Emerging Threats or Snort Community rules
Suppress False Positives: Create suppression lists for known benign traffic
Custom Signatures: Develop environment-specific detection rules
Threshold Configuration: Set event thresholds to detect scanning and brute force
Regular Updates: Schedule weekly rule updates from threat intelligence feeds
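
To illustrate the idea behind threshold configuration, here is a rough sketch of count-within-window detection per source address. It is not how Snort or Suricata implement thresholds internally; the function name, the event tuple shape, and the example numbers are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def sources_over_threshold(events, count, seconds):
    """Return source IPs that produced at least `count` events
    within any window of `seconds`.

    `events` is an iterable of (timestamp_seconds, source_ip) pairs,
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # source_ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Drop timestamps that have aged out of the window.
        while window and ts - window[0] > seconds:
            window.popleft()
        if len(window) >= count:
            flagged.add(ip)
    return flagged


# Hypothetical example: five events in 40 seconds from one address.
sample = [(t, "10.0.0.5") for t in (0, 10, 20, 30, 40)]
print(sources_over_threshold(sample, count=5, seconds=60))
```

Lower counts or longer windows catch slower scans at the cost of more false positives, which is the trade-off to tune against your baseline.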

Prometheus Alerting Rules

Configure alerting rules in Prometheus for infrastructure issues:

groups:
  - name: soc_infrastructure
    interval: 30s
    rules:
      - alert: HighCPUUsage
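        # Assumes node_exporter, which exposes node_cpu_seconds_total rather than a
        # ready-made usage gauge; busy % is derived from the idle rate.
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80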
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
      
      - alert: LogPipelineDown
        expr: up{job="logstash"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Log pipeline is down"
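
For ad hoc checks outside Alertmanager, the same `up{job="logstash"}` expression can be evaluated through the Prometheus instant-query endpoint (`/api/v1/query`). The sketch below only parses a response of that shape; `down_instances` is a hypothetical helper, and the sample payload and instance names are made up for illustration.

```python
def down_instances(response: dict) -> list:
    """Return the instances whose sample value is 0 in an instant-query
    response for an `up{...}` expression."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return sorted(
        sample["metric"].get("instance", "<unknown>")
        for sample in response["data"]["result"]
        # Each vector sample is [unix_timestamp, "value as string"].
        if float(sample["value"][1]) == 0
    )


# Made-up payload in the instant-query vector format.
example = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"job": "logstash", "instance": "logstash-01:9600"},
             "value": [1700000000.0, "1"]},
            {"metric": {"job": "logstash", "instance": "logstash-02:9600"},
             "value": [1700000000.0, "0"]},
        ],
    },
}
print(down_instances(example))
```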

Event Correlation Workflows

Multi-Source Correlation

Wazuh provides powerful event correlation capabilities to detect complex attack patterns:

Identify Correlation Patterns

Define attack scenarios requiring multiple events:

Reconnaissance → Exploitation → Lateral Movement
Failed Login → Successful Login → Data Access
Port Scan → Vulnerability Exploit → Malware Execution

Configure Correlation Rules

Create correlation rules in Wazuh:

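<group name="local,syslog,sshd,">
  <!-- if_matched_sid needs frequency and timeframe (seconds); these values are examples to tune. -->
  <!-- This rule counts repeated failures only; flagging a later success needs a second rule chained on it. -->
  <rule id="100001" level="12" frequency="5" timeframe="300">
    <if_matched_sid>5710</if_matched_sid>
    <same_source_ip />
    <description>Repeated failed SSH logins from the same source IP</description>
  </rule>
</group>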

Integrate Multiple Sources

Correlate events from:

IDS/IPS alerts (Snort/Suricata)
Firewall logs
Endpoint EDR events
Authentication logs
Network flow data

Enrich with Context

Add threat intelligence and asset context to correlation events

Automate Response

Configure automatic incident creation in TheHive for correlated high-severity events
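
The "Failed Login → Successful Login" pattern listed above can be prototyped offline before committing it to rules. This is a minimal sketch under stated assumptions: events are already normalized to (timestamp, source IP, outcome) tuples sorted by time, and `failed_then_success` with its default thresholds is a hypothetical helper, not a Wazuh feature.

```python
from collections import defaultdict, deque


def failed_then_success(events, min_failures=3, window_seconds=300):
    """Find successful logins preceded by repeated failures from the same IP.

    `events` is an iterable of (timestamp_seconds, source_ip, outcome) tuples
    sorted by timestamp, where outcome is "failure" or "success".
    Returns a list of (source_ip, success_timestamp, failure_count).
    """
    failures = defaultdict(deque)  # source_ip -> recent failure timestamps
    hits = []
    for ts, ip, outcome in events:
        recent = failures[ip]
        # Forget failures older than the correlation window.
        while recent and ts - recent[0] > window_seconds:
            recent.popleft()
        if outcome == "failure":
            recent.append(ts)
        elif outcome == "success":
            if len(recent) >= min_failures:
                hits.append((ip, ts, len(recent)))
            recent.clear()  # a success resets the streak for this IP
    return hits
```

Replaying a day of authentication logs through a prototype like this gives a feel for how many correlated alerts a given threshold would produce before the rule goes live.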

Elasticsearch Query Correlation

Use Elasticsearch for advanced correlation queries:

Elasticsearch Query DSL enables complex temporal and cross-index correlations that complement Wazuh rule-based detection.

Example: Detect Lateral Movement

{
  "query": {
    "bool": {
      "must": [
        {"match": {"event.category": "authentication"}},
        {"match": {"event.outcome": "success"}},
        {"range": {"@timestamp": {"gte": "now-1h"}}}
      ],
      "filter": {
        "script": {
          "script": "doc['source.ip'].value != doc['destination.ip'].value"
        }
      }
    }
  },
  "aggs": {
    "by_user": {
      "terms": {"field": "user.name"},
      "aggs": {
        "unique_hosts": {"cardinality": {"field": "destination.ip"}}
      }
    }
  }
}
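
The aggregation above returns one bucket per user with an approximate count of distinct destination hosts. A short sketch of post-processing that response follows; `users_with_many_hosts` and the `min_hosts` threshold are assumptions for illustration, and the response is taken to be the already-decoded JSON body of the search.

```python
def users_with_many_hosts(response: dict, min_hosts: int = 5) -> list:
    """Return (user, unique_host_count) pairs at or above `min_hosts`,
    highest count first, from the `by_user` terms aggregation."""
    buckets = response["aggregations"]["by_user"]["buckets"]
    flagged = [
        (bucket["key"], bucket["unique_hosts"]["value"])
        for bucket in buckets
        if bucket["unique_hosts"]["value"] >= min_hosts
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```

Two caveats when reading the output: a `terms` aggregation returns only the top 10 buckets unless `size` is set, and `cardinality` values are approximate, so thresholds near the boundary deserve a follow-up query.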

Daily Monitoring Checklist

Never end a shift with unacknowledged critical alerts. Always ensure proper handover or escalation.

Best Practices

Monitoring Hygiene

Maintain a clean monitoring environment to ensure analysts can quickly identify genuine threats.

Tune Aggressively: Dedicate time weekly to reduce false positives
Document Everything: Maintain runbooks for common alert types
Baseline Normal: Understand normal behavior to identify anomalies
Regular Reviews: Weekly review of alert effectiveness and coverage
Continuous Learning: Stay updated on new attack techniques and adjust monitoring

Alert Response Priorities

Active Exploitation - Drop everything and respond
Data Exfiltration - Immediate containment required
Malware Execution - Isolate and investigate
Privilege Escalation - Verify legitimacy immediately
Failed Authentication Patterns - Monitor for escalation

Communication Protocols

Clear communication during security events is critical for effective response.

Critical Alerts: Immediately notify SOC lead and affected asset owners
Incidents: Create TheHive case and notify stakeholders
Ongoing Investigations: Regular updates every 2-4 hours
False Positives: Document in knowledge base to prevent future confusion
Shift Handover: Detailed written summary plus verbal briefing

Performance Optimization

Dashboard Performance

Limit time ranges for heavy queries (default: last 24 hours)
Use Elasticsearch aggregations instead of raw queries
Schedule resource-intensive reports during off-peak hours
Archive old indices to separate clusters if necessary

Query Optimization

Slow queries impact monitoring effectiveness. Optimize queries to return results in under 3 seconds.

Use index patterns efficiently
Filter at query time rather than post-processing
Leverage Elasticsearch caching (shard request cache and node query cache)
Use time-based indices for log data
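
With daily time-based indices, a query can name only the indices that cover the requested range instead of searching everything. A minimal sketch, assuming a `<prefix>YYYY.MM.DD` naming scheme; the prefix `soc-logs-` and the helper `daily_indices` are hypothetical.

```python
from datetime import date, timedelta


def daily_indices(prefix: str, start: date, end: date) -> list:
    """List the daily index names covering start..end inclusive."""
    if end < start:
        raise ValueError("end date is before start date")
    days = (end - start).days
    return [f"{prefix}{start + timedelta(days=i):%Y.%m.%d}" for i in range(days + 1)]


# Joined with commas, the names can be used as the index part of a search request.
print(",".join(daily_indices("soc-logs-", date(2024, 2, 28), date(2024, 3, 1))))
```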

Troubleshooting Common Issues

High Alert Volume

Symptoms: Overwhelming number of alerts, analyst burnout

Solutions:

Identify top noise generators using alert frequency analysis
Implement alert grouping and deduplication
Adjust severity levels for low-impact events
Create suppression rules for known false positives
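
Alert frequency analysis can start as a simple count per rule. The sketch below assumes alerts have been exported as dictionaries with a `rule_id` key; that key name and the helper `top_noise_generators` are assumptions for illustration.

```python
from collections import Counter


def top_noise_generators(alerts, n=5):
    """Return the `n` most frequent rules as (rule_id, count, share_of_total)."""
    counts = Counter(alert["rule_id"] for alert in alerts)
    total = sum(counts.values())
    return [
        (rule_id, count, count / total)
        for rule_id, count in counts.most_common(n)
    ]


# Hypothetical export: ten alerts across three rules.
alerts = [{"rule_id": "5710"}] * 6 + [{"rule_id": "31101"}] * 3 + [{"rule_id": "550"}]
for rule_id, count, share in top_noise_generators(alerts, n=2):
    print(f"rule {rule_id}: {count} alerts ({share:.0%})")
```

A rule that accounts for a large share of volume but rarely leads to an incident is usually the first candidate for suppression or a lower severity level.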

Missing Events

Symptoms: Expected events not appearing in dashboards

Solutions:

Check agent connectivity in Wazuh
Verify Logstash/Fluentd pipeline processing
Review Elasticsearch index health
Check log source configuration
Verify firewall rules allow log transmission

Dashboard Slowness

Symptoms: Queries taking > 10 seconds, timeouts

Solutions:

Reduce query time range
Check Elasticsearch cluster health
Review index optimization status
Increase cluster resources if needed
Implement query result caching
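
For scripted reports that repeat the same expensive query, result caching can be as small as a time-to-live wrapper. This is a sketch of the general idea on the client side, not a description of how the dashboards or Elasticsearch cache internally; `TTLCache` is a hypothetical helper.

```python
import time


class TTLCache:
    """Reuse a computed result for `ttl_seconds`, then recompute it."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, skip the expensive call
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value
```

A TTL close to the dashboard refresh interval keeps results reasonably current while avoiding a repeat of the same query on every refresh.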

Related Resources

Incident Handling - Procedures for responding to security incidents
Threat Hunting - Proactive threat detection techniques
Maintenance - System maintenance and tuning procedures
