Set up threshold-based alerts to monitor VM health and performance across Azure and GCP. Get notified when metrics exceed defined thresholds.

GCP Alert Policies

These endpoints list, create, and delete Google Cloud Monitoring alert policies for VM instances.

List VM Alerts

Retrieve all alert policies configured for a specific VM.

Endpoint

GET /api/gcp/vms/<project_id>/<instance_id>/alerts

Response

{
  "value": [
    {
      "name": "1234567890123456789",
      "displayName": "High CPU Usage Alert",
      "enabled": true,
      "description": "Alert when CPU utilization exceeds 80%"
    },
    {
      "name": "9876543210987654321",
      "displayName": "Memory Threshold Alert",
      "enabled": true,
      "description": "No description."
    }
  ]
}

Response Fields

  • name (string): Alert policy identifier (last segment of the full policy name)
  • displayName (string): Human-readable name for the alert policy
  • enabled (boolean): Whether the alert policy is currently active
  • description (string): Alert policy documentation/description
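
A minimal sketch of how a client might parse this response into typed records, assuming the body has exactly the shape shown above. The names `AlertSummary` and `parse_alert_list` are hypothetical and not part of the API.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AlertSummary:
    """One entry of the "value" array (hypothetical client-side type)."""
    name: str
    display_name: str
    enabled: bool
    description: str


def parse_alert_list(body: str) -> List[AlertSummary]:
    """Parse the JSON body of the list endpoint into AlertSummary records."""
    payload = json.loads(body)
    return [
        AlertSummary(
            name=item["name"],
            display_name=item["displayName"],
            enabled=item["enabled"],
            # Assumption: treat a missing description as an empty string.
            description=item.get("description", ""),
        )
        for item in payload.get("value", [])
    ]
```

Keeping `name` as a string avoids precision loss, since the identifiers are 19-digit numbers.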

Filtering Logic

The API filters alert policies by checking if conditions contain:
f'resource.labels.instance_id = "{instance_id}"'
Supported condition types:
  • condition_threshold - Metric exceeds threshold
  • condition_absent - Metric data is missing
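
The filtering step above can be sketched as follows. This is an illustration under assumptions, not the backend's actual code: policies are modeled as plain dictionaries keyed by the two condition types, and `policy_targets_instance` is a hypothetical helper name.

```python
def instance_filter_clause(instance_id: str) -> str:
    """Build the clause that the API looks for inside condition filters."""
    return f'resource.labels.instance_id = "{instance_id}"'


def policy_targets_instance(policy: dict, instance_id: str) -> bool:
    """Return True if any supported condition filters on the given instance.

    Assumption: each condition is a dict that carries its filter under
    "condition_threshold" or "condition_absent".
    """
    clause = instance_filter_clause(instance_id)
    for condition in policy.get("conditions", []):
        for kind in ("condition_threshold", "condition_absent"):
            body = condition.get(kind)
            if body and clause in body.get("filter", ""):
                return True
    return False
```

Because this is a substring match, a filter written with different spacing or quoting around the `=` would not be detected.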

Create Alert Policy

Configure a new threshold-based alert for a VM.

Endpoint

POST /api/gcp/vms/<project_id>/<instance_id>/alerts/create

Request Parameters

  • alertName (string, required): Display name for the alert policy
  • metricType (string, required): Metric to monitor (e.g., compute.googleapis.com/instance/cpu/utilization)
  • threshold (number, required): Threshold value that triggers the alert

Request Example

{
  "alertName": "High CPU Usage Alert",
  "metricType": "compute.googleapis.com/instance/cpu/utilization",
  "threshold": 0.8
}

Response

{
  "message": "Created alert 'High CPU Usage Alert'. (Note: no notification channels configured).",
  "name": "1234567890123456789",
  "displayName": "High CPU Usage Alert"
}

Alert Configuration

The created alert policy includes:

Metric filter:
filter = (
    f'metric.type = "{metric_type}" AND '
    f'resource.type = "gce_instance" AND '
    f'resource.labels.instance_id = "{instance_id}"'
)
Ensures the alert only monitors the specified VM instance.

Aggregation:
  • Alignment Period: 60 seconds
  • Aligner: ALIGN_MEAN (average value)
  • Per-series aggregation: Mean value over the alignment period

Threshold condition:
  • Comparison: COMPARISON_GT (greater than)
  • Threshold Value: User-specified (e.g., 0.8 for 80%)
  • Duration: 300 seconds (5 minutes)
  • Trigger: 1 violation required
Alert fires when metric exceeds threshold for 5 consecutive minutes.

Condition combiner:
  • Type: AND
  • All conditions must be met to trigger the alert

Notification channels are not automatically configured. Add channels through the GCP Console or use the Notification Channels API.
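
To make the configuration above concrete, here is a sketch of the equivalent policy body in the JSON form used by the Cloud Monitoring REST API. The function name `build_threshold_policy` and the condition's display name are assumptions for illustration. The backend may build the policy differently, for example through the Python client's message types.

```python
def build_threshold_policy(alert_name: str, metric_type: str,
                           threshold: float, instance_id: str) -> dict:
    """Return an AlertPolicy body mirroring the settings listed above."""
    metric_filter = (
        f'metric.type = "{metric_type}" AND '
        f'resource.type = "gce_instance" AND '
        f'resource.labels.instance_id = "{instance_id}"'
    )
    return {
        "displayName": alert_name,
        "combiner": "AND",  # all conditions must be met
        "conditions": [
            {
                # Hypothetical label; the real condition name is not documented here.
                "displayName": f"{alert_name} condition",
                "conditionThreshold": {
                    "filter": metric_filter,
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    "duration": "300s",  # must hold for 5 minutes
                    "aggregations": [
                        {"alignmentPeriod": "60s",
                         "perSeriesAligner": "ALIGN_MEAN"}
                    ],
                    "trigger": {"count": 1},
                },
            }
        ],
    }
```

No `notificationChannels` field is set, which matches the note that channels have to be attached separately.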

Delete Alert Policy

Remove an existing alert policy.

Endpoint

DELETE /api/gcp/vms/<project_id>/<alert_name>/alert

Request

No request body required. The alert name is specified in the URL.

Response

{
  "message": "Alert '1234567890123456789' was successfully deleted."
}

Full Policy Name

The API constructs the full resource name:
f"projects/{project_id}/alertPolicies/{alert_name}"
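
As a small illustration, the two directions of this mapping could look like the following. The helper names are hypothetical. The second function reflects how the list endpoint reports `name` as the last segment of the full policy name.

```python
def full_policy_name(project_id: str, alert_name: str) -> str:
    """Expand a short alert identifier into the full resource name."""
    return f"projects/{project_id}/alertPolicies/{alert_name}"


def short_policy_id(full_name: str) -> str:
    """Return the last path segment of a full policy resource name."""
    return full_name.rsplit("/", 1)[-1]
```

For example, `short_policy_id(full_policy_name("my-project", "1234567890123456789"))` gives back `"1234567890123456789"`, where `my-project` is a placeholder project ID.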

Common Alert Examples

High CPU Utilization

{
  "alertName": "High CPU Alert",
  "metricType": "compute.googleapis.com/instance/cpu/utilization",
  "threshold": 0.8
}
Triggers when CPU usage exceeds 80% for 5 minutes.

Memory Usage

{
  "alertName": "High Memory Alert",
  "metricType": "agent.googleapis.com/memory/percent_used",
  "threshold": 85
}
Triggers when memory usage exceeds 85% for 5 minutes.

Disk Usage

{
  "alertName": "Disk Space Alert",
  "metricType": "agent.googleapis.com/disk/percent_used",
  "threshold": 90
}
Triggers when disk usage exceeds 90% for 5 minutes.

Network Traffic

{
  "alertName": "High Network Input",
  "metricType": "compute.googleapis.com/instance/network/received_bytes_count",
  "threshold": 1000000000
}
Triggers when the aligned received-bytes value stays above 1,000,000,000 (about 1 GB per 60-second alignment period) for 5 minutes.
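
The examples above mix two scales: CPU utilization is a fraction (0.8 means 80%), while the agent `percent_used` metrics are percentages (85 means 85%). The sketch below is a hypothetical client-side check, not part of the API, that flags thresholds that look like they use the wrong scale. It covers only the metrics named in this section.

```python
from typing import Optional

# Metrics from the examples above, grouped by the scale their values use.
FRACTION_METRICS = {"compute.googleapis.com/instance/cpu/utilization"}
PERCENT_METRICS = {
    "agent.googleapis.com/memory/percent_used",
    "agent.googleapis.com/disk/percent_used",
}


def threshold_warning(metric_type: str, threshold: float) -> Optional[str]:
    """Return a warning string if the threshold looks mis-scaled, else None."""
    if metric_type in FRACTION_METRICS and threshold > 1:
        return f"{metric_type} is a fraction; did you mean {threshold / 100:g}?"
    if metric_type in PERCENT_METRICS and 0 < threshold <= 1:
        return f"{metric_type} is a percentage; did you mean {threshold * 100:g}?"
    return None
```

The check only warns and never rejects a value, since unusual thresholds can be intentional.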

Azure Alert Policies

Azure alert policy management is not yet implemented in the current API. Azure Monitor alerts can be configured through:
  • Azure Portal
  • Azure CLI
  • Azure Monitor REST API
  • ARM templates

Future Implementation

Planned endpoints for Azure alert management:
# List alerts
GET /api/azure/vms/<vm_name>/alerts

# Create alert
POST /api/azure/vms/<vm_name>/alerts/create

# Delete alert
DELETE /api/azure/vms/<vm_name>/alerts/<alert_id>

Azure Monitor Alert Rules

When implementing Azure alerts, use these metric types:
Metric                  Namespace                          Aggregation
Percentage CPU          Microsoft.Compute/virtualMachines  Average
Available Memory Bytes  Microsoft.Compute/virtualMachines  Average
Network In Total        Microsoft.Compute/virtualMachines  Total
Network Out Total       Microsoft.Compute/virtualMachines  Total
Disk Read Bytes         Microsoft.Compute/virtualMachines  Total
Disk Write Bytes        Microsoft.Compute/virtualMachines  Total

Error Handling

{
  "error": "No active GCP account found in session"
}
Solution: Reauthenticate through the OAuth flow.
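
On the client side, an error body of this shape could be turned into an exception. The sketch below assumes only that failures carry an `error` key, as in the example above. `AlertApiError` and `unwrap` are hypothetical names, and HTTP status codes are not covered because this page does not document them.

```python
class AlertApiError(Exception):
    """Raised when the API response body contains an "error" field."""


def unwrap(body: dict) -> dict:
    """Return the body unchanged, or raise if it reports an error."""
    if "error" in body:
        raise AlertApiError(body["error"])
    return body
```

A caller can catch `AlertApiError` and restart the OAuth flow when the message indicates a missing account in the session.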

Required IAM Permissions (GCP)

To manage alert policies, the service account needs:
# Read permissions
- monitoring.alertPolicies.get
- monitoring.alertPolicies.list

# Write permissions
- monitoring.alertPolicies.create
- monitoring.alertPolicies.delete
- monitoring.alertPolicies.update

# Predefined role
roles/monitoring.alertPolicyEditor

Notification Channels

Alert policies can send notifications through various channels:

  • Email: Send alerts to email addresses
  • SMS: Text message notifications
  • Slack: Post alerts to Slack channels
  • PagerDuty: Integrate with PagerDuty incidents
  • Webhooks: Custom HTTP endpoints
  • Pub/Sub: GCP Pub/Sub topics

Configure Notification Channels

Notification channels must be created separately:
# GCP CLI example
gcloud alpha monitoring channels create \
  --display-name="Email Notifications" \
  --type=email \
  --channel-labels=email_address=<email_address>
Then link channels to alert policies through the GCP Console or API.
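
When linking through the API, an alert policy refers to channels by their full resource names in its `notificationChannels` field. The following sketch shows one way to prepare an updated policy body. The helper names and the sample IDs are hypothetical, and sending the update to Cloud Monitoring is left out.

```python
from typing import Iterable


def channel_resource_name(project_id: str, channel_id: str) -> str:
    """Build the full resource name of a notification channel."""
    return f"projects/{project_id}/notificationChannels/{channel_id}"


def with_channels(policy: dict, project_id: str,
                  channel_ids: Iterable[str]) -> dict:
    """Return a copy of the policy with the given channels attached.

    Existing channels are kept and duplicates are skipped.
    """
    channels = list(policy.get("notificationChannels", []))
    for channel_id in channel_ids:
        name = channel_resource_name(project_id, channel_id)
        if name not in channels:
            channels.append(name)
    updated = dict(policy)
    updated["notificationChannels"] = channels
    return updated
```

The input policy is not modified, so the original body stays available if the update fails.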

Code Reference

GCP Alert Implementation

  • List Alerts: backend/gcp/vmmonitor.py:379-428
  • Create Alert: backend/gcp/vmmonitor.py:430-502
  • Delete Alert: backend/gcp/vmmonitor.py:504-528

Dependencies

  • google-cloud-monitoring>=2.0.0
  • google-cloud-logging>=3.0.0

Best Practices

Choose thresholds that match the workload:
  • CPU: 70-90% for sustained load alerts
  • Memory: 80-90% to prevent OOM errors
  • Disk: 85-95% to allow cleanup time
  • Network: Based on bandwidth capacity
Use 5-minute durations to avoid false positives from temporary spikes:
  • Short spikes: Normal behavior, don’t alert
  • Sustained issues: Real problems, alert immediately
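
The effect of a duration window can be illustrated with a simplified model: given one aligned value per 60-second period, the alert fires only when enough consecutive values exceed the threshold. This is a sketch for intuition and does not reproduce Cloud Monitoring's exact evaluation. `would_fire` is a hypothetical name.

```python
from typing import Sequence


def would_fire(aligned_values: Sequence[float], threshold: float,
               period_s: int = 60, duration_s: int = 300) -> bool:
    """Return True if values stay above the threshold for the full duration.

    Assumption: one value per alignment period, in time order.
    """
    needed = duration_s // period_s  # consecutive periods required
    run = 0
    for value in aligned_values:
        run = run + 1 if value > threshold else 0
        if run >= needed:
            return True
    return False
```

With a 0.8 threshold, a series such as `[0.95, 0.4, 0.96, 0.3, 0.5]` does not fire, while five consecutive values of 0.9 do.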
Use descriptive names that include:
  • Metric being monitored
  • Threshold value
  • Severity level
Example: CRITICAL: CPU > 90% for 5min
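
A naming convention like this is easier to keep consistent when names are generated. The sketch below builds names in the format of the example above. `build_alert_name` and its parameters are hypothetical.

```python
def build_alert_name(severity: str, metric: str, threshold: float,
                     unit: str, duration_min: int) -> str:
    """Format an alert name as '<SEVERITY>: <metric> > <threshold><unit> for <N>min'."""
    # ":g" drops trailing zeros, so 90.0 renders as "90".
    return f"{severity.upper()}: {metric} > {threshold:g}{unit} for {duration_min}min"
```

The result can be passed as `alertName` when creating a policy.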
Include runbook steps in alert descriptions:
  • Diagnostic commands
  • Common causes
  • Remediation steps

