Aurora receives alerts from your observability platforms via webhooks. This guide covers setting up Datadog, Grafana, PagerDuty, Netdata, Dynatrace, and Splunk.

How Alert Ingestion Works

When an alert fires:
  1. Observability platform sends a webhook to Aurora
  2. Aurora creates an incident in the incidents table
  3. A Celery background task starts the RCA investigation
  4. Alert details are stored in source-specific tables (datadog_events, grafana_alerts, etc.)
  5. The LangGraph agent analyzes the alert and executes diagnostic tools
# server/routes/incidents_routes.py:89-236
def _format_incident_response(row, include_metadata=False):
    # Incident structure includes:
    # - sourceType: datadog | grafana | pagerduty | netdata | splunk | dynatrace
    # - sourceAlertId: Reference to source table
    # - alert: { title, service, source, sourceUrl }
    # - auroraStatus: idle | running | complete | error
    ...
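The five steps above can be sketched as a minimal handler: webhook payload in, incident record out, RCA task enqueued. Every name here (`Incident`, `handle_webhook`, `enqueue_rca`) is illustrative, not Aurora's actual API.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Incident:
    source_type: str                 # datadog | grafana | pagerduty | ...
    source_alert_id: str             # reference into the source-specific table
    title: str
    aurora_status: str = "idle"      # idle | running | complete | error
    payload: dict = field(default_factory=dict)

def handle_webhook(source_type: str, payload: dict, enqueue_rca) -> Incident:
    """Create an incident from a webhook payload and kick off the RCA task."""
    incident = Incident(
        source_type=source_type,
        source_alert_id=str(payload.get("alert_id", "")),
        title=payload.get("alert_title", "Untitled alert"),
        payload=payload,
    )
    enqueue_rca(incident)            # in Aurora this is a Celery background task
    incident.aurora_status = "running"
    return incident
```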

Datadog

1. Create a webhook integration

In Datadog:
  1. Go to Integrations → Webhooks
  2. Add a new webhook:
    • Name: Aurora RCA
    • URL: https://your-aurora-url/api/datadog/webhook
    • Payload: Default (Aurora parses standard Datadog format)
2. Add to monitor notifications

Edit your monitors to include:
@webhook-Aurora-RCA
Or set up a global notification rule for all critical alerts.
3. Configure Aurora

Aurora automatically detects Datadog alerts. Optionally configure:
# .env
NEXT_PUBLIC_ENABLE_DATADOG=true
4. Test the integration

Trigger a test alert from Datadog, or post a sample payload manually:
curl -X POST https://your-aurora-url/api/datadog/webhook \
  -H "Content-Type: application/json" \
  -d '{
    "alert_id": "12345",
    "alert_title": "High CPU on web-server",
    "alert_transition": "Triggered",
    "alert_metric": "system.cpu.user",
    "alert_status": "Alert"
  }'

Datadog Metadata

Aurora extracts:
  • Alert ID: alert_id from payload
  • Severity: Mapped from alert_status (Alert → high, Warn → medium)
  • Service: Parsed from tags or host
  • Source URL: Built using Datadog subdomain from OAuth settings
# server/routes/incidents_routes.py:72-74
elif source_type == "datadog":
    return f"https://app.{client_id}" if client_id else "https://app.datadoghq.com"
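The severity mapping described above (Alert → high, Warn → medium) can be expressed as a small lookup; the function name and the fallback value for other statuses are assumptions.

```python
def map_datadog_severity(alert_status: str) -> str:
    """Map Datadog's alert_status field to an Aurora severity level."""
    return {
        "Alert": "high",
        "Warn": "medium",
    }.get(alert_status, "low")   # assumed default for OK / No Data / unknown
```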

Grafana

1. Add a webhook contact point

In Grafana:
  1. Go to Alerting → Contact points
  2. Create a new contact point:
    • Type: Webhook
    • URL: https://your-aurora-url/api/grafana/webhook
    • HTTP Method: POST
2. Create a notification policy

Route alerts to the Aurora contact point:
  • Match all alerts, or filter by label (e.g., severity="critical")
3. Configure Aurora

Enable Grafana integration:
# .env
NEXT_PUBLIC_ENABLE_GRAFANA=true
4. Test the integration

Use Grafana’s “Test” button in the contact point settings, or trigger a real alert.

Grafana Alert Format

Grafana sends alerts in the Alertmanager format:
{
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "HighErrorRate",
        "severity": "critical",
        "service": "api"
      },
      "annotations": {
        "summary": "Error rate above threshold"
      },
      "fingerprint": "abc123"
    }
  ]
}
Aurora uses fingerprint as the source alert ID.
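Parsing the Alertmanager-format payload above is straightforward; this sketch pulls out the fields Aurora cares about. The field names come from the example, but the function itself is illustrative.

```python
def parse_grafana_alerts(payload: dict) -> list:
    """Extract one record per alert from an Alertmanager-style webhook body."""
    parsed = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        parsed.append({
            "source_alert_id": alert.get("fingerprint"),   # Aurora's source alert ID
            "title": labels.get("alertname", "unknown"),
            "severity": labels.get("severity", "unknown"),
            "service": labels.get("service"),
            "summary": alert.get("annotations", {}).get("summary", ""),
            "firing": alert.get("status") == "firing",
        })
    return parsed
```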

PagerDuty

Aurora supports PagerDuty via OAuth for deeper integration.
1. Enable PagerDuty OAuth

Set in .env:
NEXT_PUBLIC_ENABLE_PAGERDUTY_OAUTH=true
PAGERDUTY_CLIENT_ID=your-pd-client-id
PAGERDUTY_CLIENT_SECRET=your-pd-client-secret
2. Connect your PagerDuty account

  1. Go to Settings → Integrations → PagerDuty
  2. Click “Connect”
  3. Authorize Aurora to access incidents and escalation policies
3. Configure a webhook extension

In PagerDuty:
  1. Go to Services → Select a service → Integrations
  2. Add a new “Generic Webhook” extension
  3. Webhook URL: https://your-aurora-url/api/pagerduty/webhook
  4. Enable for incident triggers and updates
4. Test with a PagerDuty incident

Create a test incident. Aurora will:
  • Receive the webhook
  • Fetch additional incident details via PagerDuty API
  • Start an investigation

PagerDuty Features

Aurora consolidates all events for an incident:
# server/routes/incidents_routes.py:416-428
from routes.pagerduty.runbook_utils import fetch_and_consolidate_pagerduty_events

consolidated = fetch_and_consolidate_pagerduty_events(
    user_id, pagerduty_incident_id, cursor
)
# Combines: trigger, acknowledge, resolve, notes, timeline

Netdata

1. Configure Netdata webhook

Edit your Netdata config (/etc/netdata/health_alarm_notify.conf):
SEND_CUSTOM="YES"
DEFAULT_RECIPIENT_CUSTOM="aurora"
CUSTOM_WEBHOOK_URL="https://your-aurora-url/api/netdata/webhook"
2. Restart Netdata

sudo systemctl restart netdata
3. Test an alert

Trigger a test alarm:
sudo -u netdata /usr/libexec/netdata/plugins.d/alarm-notify.sh test

Netdata Alert Structure

Netdata provides rich context:
  • Chart: Which metric triggered the alert
  • Host: Affected system
  • Value: Current metric value vs. threshold
Aurora uses a composite key: {alert_name}:{host}:{chart}
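The composite key format above can be sketched as a helper. The payload field names (`name`, `host`, `chart`) are assumptions about the Netdata webhook body, not confirmed by the source.

```python
def netdata_source_alert_id(payload: dict) -> str:
    """Build the {alert_name}:{host}:{chart} composite key for a Netdata alert."""
    return "{}:{}:{}".format(
        payload.get("name", ""),    # alert name
        payload.get("host", ""),    # affected system
        payload.get("chart", ""),   # chart whose metric triggered the alert
    )
```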

Dynatrace

1. Create a webhook notification

In Dynatrace:
  1. Go to Settings → Integration → Problem notifications
  2. Add a custom integration:
    • Webhook URL: https://your-aurora-url/api/dynatrace/webhook
    • Call webhook when: Problem opened
2. Enable in Aurora

# .env
NEXT_PUBLIC_ENABLE_DYNATRACE=true
3. Configure Dynatrace API access (optional)

For enriched data, provide:
DYNATRACE_ENVIRONMENT_URL=https://{your-env}.live.dynatrace.com
DYNATRACE_API_TOKEN=your-api-token

Splunk

1. Set up a webhook alert action

In Splunk:
  1. Create or edit a saved search
  2. Add a “Trigger Actions” → Webhook
  3. URL: https://your-aurora-url/api/splunk/webhook
  4. Include alert metadata in the payload
2. Test the webhook

Run the search manually and verify Aurora receives the alert.

Custom Observability Tools

For tools not listed above, send alerts to Aurora’s generic webhook endpoint:
POST /api/incidents/create
Content-Type: application/json

{
  "alert_title": "High latency on API endpoint",
  "severity": "high",
  "service": "api-gateway",
  "environment": "production",
  "source_type": "custom",
  "metadata": {
    "endpoint": "/v1/users",
    "latency_ms": 5000,
    "threshold_ms": 1000
  }
}
Aurora will create an incident and start investigating.
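The same custom alert can be sent from Python using only the standard library; a sketch, with the base URL as a placeholder. The request is built but not sent here.

```python
import json
import urllib.request

def build_alert_request(base_url: str, alert: dict) -> urllib.request.Request:
    """Build a POST to Aurora's generic incident endpoint."""
    return urllib.request.Request(
        f"{base_url}/api/incidents/create",
        data=json.dumps(alert).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

alert = {
    "alert_title": "High latency on API endpoint",
    "severity": "high",
    "service": "api-gateway",
    "environment": "production",
    "source_type": "custom",
    "metadata": {"endpoint": "/v1/users", "latency_ms": 5000, "threshold_ms": 1000},
}
req = build_alert_request("https://your-aurora-url", alert)
# urllib.request.urlopen(req) would actually send it; omitted in this sketch.
```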

Alert Correlation

Aurora automatically correlates related alerts:
# server/routes/incidents_routes.py:519-544
cursor.execute(
    """SELECT id, source_type, alert_title, alert_service, alert_severity,
              correlation_strategy, correlation_score, correlation_details
       FROM incident_alerts
       WHERE incident_id = %s""",
    (incident_id,),
)
Correlation strategies:
  • Service match: Same service name
  • Time window: Within 5-15 minutes
  • Semantic similarity: Embedded alert titles using Weaviate
Correlated alerts are stored in incident_alerts table, not as separate incidents.
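Two of the strategies above, service match and time window, can be sketched as a simple check. The scores and the 15-minute default are illustrative, not Aurora's actual weights.

```python
from datetime import timedelta
from typing import Optional, Tuple

def correlate(existing: dict, incoming: dict,
              window: timedelta = timedelta(minutes=15)) -> Optional[Tuple[str, float]]:
    """Return (strategy, score) when two alerts look related, else None."""
    if existing.get("service") and existing.get("service") == incoming.get("service"):
        return ("service_match", 1.0)   # same service name
    if abs(existing["fired_at"] - incoming["fired_at"]) <= window:
        return ("time_window", 0.5)     # fired within the correlation window
    return None
```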

Alert Source URLs

Aurora generates deep links back to your observability platform:
# server/routes/incidents_routes.py:57-86
def _build_source_url(source_type: str, user_id: str) -> str:
    # Returns platform-specific URLs:
    # - Datadog: https://app.datadoghq.com/monitors/...
    # - Grafana: https://your-grafana.com/alerting/...
    # - PagerDuty: https://your-org.pagerduty.com/incidents/...
    ...
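One way the stub above could be fleshed out. In Aurora the base URLs would come from each user's stored integration settings; the settings keys used here are placeholders.

```python
def build_source_url(source_type: str, settings: dict) -> str:
    """Sketch: pick a platform base URL from per-user integration settings."""
    if source_type == "datadog":
        return settings.get("datadog_url", "https://app.datadoghq.com")
    if source_type == "grafana":
        return settings.get("grafana_url", "")
    if source_type == "pagerduty":
        org = settings.get("pagerduty_org", "")
        return f"https://{org}.pagerduty.com" if org else ""
    return ""
```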

Webhook Security

Always use HTTPS for webhook URLs. Aurora validates webhook signatures where supported.
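Signature validation typically looks like a constant-time HMAC comparison of the raw request body against a header value. This is a generic sketch; the header name and secret handling vary by platform and are not Aurora's exact implementation.

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a hex SHA-256 HMAC of the raw webhook body in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```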
Enable rate limiting:
# .env
RATE_LIMITING_ENABLED=true
RATE_LIMIT_BYPASS_TOKEN=your-secure-token
Restrict webhook access using a reverse proxy (Nginx, Cloudflare, etc.):
location ~ ^/api/[^/]+/webhook$ {
    # Allow only from observability tool IPs
    allow 54.164.185.0/24;  # Example: Datadog IP range
    deny all;
}

Webhook Payloads

Aurora stores the full webhook payload for forensics:
# Stored in source-specific tables:
# - datadog_events.payload (JSONB)
# - grafana_alerts.payload (JSONB)
# - pagerduty_events.payload (JSONB)
# - netdata_alerts.payload (JSONB)
# - splunk_alerts.payload (JSONB)
# - dynatrace_problems.payload (JSONB)
View raw payload in incident detail:
// GET /api/incidents/{id} response
{
  "alert": {
    "rawPayload": "{ ... full webhook JSON ... }"
  }
}
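Note that rawPayload arrives as a JSON string inside the response, so inspecting it takes a second decode; a small sketch with a made-up response body.

```python
import json

# Hypothetical incident detail response (rawPayload is a JSON *string*).
response = {"alert": {"rawPayload": "{\"alert_id\": \"12345\", \"alert_status\": \"Alert\"}"}}
payload = json.loads(response["alert"]["rawPayload"])
```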

Troubleshooting

Check:
  1. Aurora server is accessible from the internet (or observability tool network)
  2. Webhook URL is correct (include /api/{source}/webhook)
  3. Firewall allows inbound HTTPS on port 443
Test manually:
curl -X POST https://your-aurora-url/api/datadog/webhook \
  -H "Content-Type: application/json" \
  -d '{ "alert_id": "test" }'
View Aurora server logs:
docker logs aurora-server | grep "webhook"
Common issues:
  • Invalid JSON payload
  • Missing required fields (alert_title, severity, etc.)
  • Database connection error
Check Celery worker:
docker logs aurora-celery_worker-1 | grep "RCA"
Verify:
  • Redis is running (Celery broker)
  • LLM API keys are configured
  • Cloud provider credentials are valid
Aurora infers severity and service from the alert payload. Adjust the mappings in server/routes/{source}_routes.py, or include explicit fields in the webhook payload.

Next Steps

  • First Investigation: run your first incident investigation
  • Custom Connectors: build integrations for proprietary tools
