
Overview

Multi-Cloud Manager collects CPU, memory, network, disk, and request metrics from your Azure and GCP virtual machines and container workloads. Metrics are sampled at fixed intervals (1 or 5 minutes by default) and can be queried for real-time monitoring and historical analysis.

Azure VM Metrics

Available Metrics

Azure Monitor exposes the following metrics for Azure VMs. The CPU metrics need no agent; the memory metrics are listed as requiring the Azure Monitor Agent (AMA):
# From vmmonitor.py:55-60
metric_names = [
    "Percentage CPU",
    "Available Memory Percentage",
    "Available Memory Bytes",
    "CPU Credits Consumed",
]

| Metric Name | Description | Unit | Agent Required |
| --- | --- | --- | --- |
| Percentage CPU | Average CPU utilization | Percent | No |
| Available Memory Percentage | Percentage of available memory | Percent | Yes (AMA) |
| Available Memory Bytes | Available physical memory | Bytes | Yes (AMA) |
| CPU Credits Consumed | CPU credits used (burstable VMs) | Count | No |

Query VM Metrics

Metrics are collected using Azure Monitor’s MetricsQueryClient:
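# Example from vmmonitor.py:61-67
from azure.monitor.query import MetricAggregationType

# client is a MetricsQueryClient; resource_id is the VM's full Azure resource ID
response = client.query_resource(
    resource_uri=resource_id,
    metric_names=metric_names,
    timespan=(start_time, end_time),
    # 5-minute intervals; the SDK documents this keyword as granularity (a timedelta)
    interval="PT5M",
    aggregations=[MetricAggregationType.AVERAGE]
)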
Default Settings:
  • Time range: Last 1 hour
  • Interval: 5 minutes (PT5M)
  • Aggregation: Average
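
As a rough illustration of these defaults, the sketch below builds a one-hour window and counts how many 5-minute points it should contain. The helper names (`default_timespan`, `expected_points`) are hypothetical and are not taken from vmmonitor.py.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_WINDOW = timedelta(hours=1)      # "Last 1 hour"
DEFAULT_INTERVAL = timedelta(minutes=5)  # ISO 8601 duration PT5M


def default_timespan(now=None, window=DEFAULT_WINDOW):
    """Return (start_time, end_time) in UTC for a window ending at `now`."""
    end_time = now or datetime.now(timezone.utc)
    return end_time - window, end_time


def expected_points(start_time, end_time, interval=DEFAULT_INTERVAL):
    """Number of whole intervals that fit inside the window."""
    return int((end_time - start_time) / interval)


# A fixed "now" keeps the example reproducible.
start_time, end_time = default_timespan(
    datetime(2024, 1, 15, 11, 0, tzinfo=timezone.utc)
)
print(start_time.isoformat(), end_time.isoformat())
print(expected_points(start_time, end_time), "points expected")
```

A response with noticeably fewer points than expected usually means the resource was stopped or not reporting for part of the window.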

Example Response

{
  "vm": "webserver-01",
  "subscriptionId": "12345678-1234-1234-1234-123456789abc",
  "resourceGroup": "production-rg",
  "location": "eastus",
  "metrics": [
    {
      "name": "Percentage CPU",
      "unit": "Percent",
      "data": [
        {"timestamp": "2024-01-15T10:00:00Z", "average": 23.45},
        {"timestamp": "2024-01-15T10:05:00Z", "average": 25.12}
      ]
    },
    {
      "name": "Available Memory Bytes",
      "unit": "Bytes",
      "data": [
        {"timestamp": "2024-01-15T10:00:00Z", "average": 2147483648},
        {"timestamp": "2024-01-15T10:05:00Z", "average": 2013265920}
      ]
    }
  ]
}
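
A response of this shape can be reduced to one summary row per metric on the client side. The following is a minimal sketch that uses a trimmed copy of the example above; `summarize` is a hypothetical helper, not part of the API.

```python
import json

response_text = """
{
  "vm": "webserver-01",
  "metrics": [
    {"name": "Percentage CPU", "unit": "Percent",
     "data": [{"timestamp": "2024-01-15T10:00:00Z", "average": 23.45},
              {"timestamp": "2024-01-15T10:05:00Z", "average": 25.12}]},
    {"name": "Available Memory Bytes", "unit": "Bytes",
     "data": [{"timestamp": "2024-01-15T10:00:00Z", "average": 2147483648},
              {"timestamp": "2024-01-15T10:05:00Z", "average": 2013265920}]}
  ]
}
"""


def summarize(response):
    """Map each metric name to the mean and max of its 'average' values."""
    summary = {}
    for metric in response["metrics"]:
        # Skip points with no value (for example, while a VM is stopped).
        values = [p["average"] for p in metric["data"] if p.get("average") is not None]
        if values:
            summary[metric["name"]] = {
                "mean": sum(values) / len(values),
                "max": max(values),
                "unit": metric["unit"],
            }
    return summary


summary = summarize(json.loads(response_text))
mean_memory_gib = summary["Available Memory Bytes"]["mean"] / 2**30
print(summary["Percentage CPU"], round(mean_memory_gib, 4))
```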

Azure Container Metrics

Available Metrics

Azure Container Instances provide the following platform metrics:
# From containermonitor.py:88
metric_names = ["CpuUsage", "MemoryUsage"]

| Metric Name | Description | Unit | Interval |
| --- | --- | --- | --- |
| CpuUsage | CPU usage across all cores, in millicores | Count | 1 minute |
| MemoryUsage | Memory bytes used | Bytes | 1 minute |

Query Configuration:
  • Interval: PT1M (1 minute)
  • Aggregation: Average
  • Time range: Last 1 hour (configurable)

GCP VM Metrics

Agentless Metrics

Metrics available without installing the Ops Agent:
# From vmmonitor.py:78-84
# displayName and unit hold Polish UI strings; an English gloss follows each entry ("bajty" = "bytes").
agentless_metrics = [
    {"type": "compute.googleapis.com/instance/cpu/utilization", "displayName": "Użycie CPU", "unit": "%"},  # CPU usage
    {"type": "compute.googleapis.com/instance/network/received_bytes_count", "displayName": "Sieć (Odebrane)", "unit": "bajty"},  # Network (received)
    {"type": "compute.googleapis.com/instance/network/sent_bytes_count", "displayName": "Sieć (Wysłane)", "unit": "bajty"},  # Network (sent)
    {"type": "compute.googleapis.com/instance/disk/read_bytes_count", "displayName": "Dysk (Odczyt)", "unit": "bajty"},  # Disk (read)
    {"type": "compute.googleapis.com/instance/disk/write_bytes_count", "displayName": "Dysk (Zapis)", "unit": "bajty"}  # Disk (write)
]

Agent-Based Metrics

Additional metrics available once the Ops Agent is installed:
# From vmmonitor.py:86-89
# displayName holds Polish UI strings; an English gloss follows each entry.
agent_metrics = [
    {"type": "agent.googleapis.com/memory/percent_used", "displayName": "Użycie pamięci (Agent)", "unit": "%"},  # Memory usage (agent)
    {"type": "agent.googleapis.com/disk/percent_used", "displayName": "Użycie dysku (Agent)", "unit": "%"}  # Disk usage (agent)
]

| Metric Type | Description | Requires Ops Agent |
| --- | --- | --- |
| compute.googleapis.com/instance/cpu/utilization | CPU utilization (0-1, multiply by 100 for %) | No |
| compute.googleapis.com/instance/network/received_bytes_count | Network bytes received | No |
| compute.googleapis.com/instance/network/sent_bytes_count | Network bytes sent | No |
| compute.googleapis.com/instance/disk/read_bytes_count | Disk bytes read | No |
| compute.googleapis.com/instance/disk/write_bytes_count | Disk bytes written | No |
| agent.googleapis.com/memory/percent_used | Memory usage percentage | Yes |
| agent.googleapis.com/disk/percent_used | Disk usage percentage | Yes |
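
Because CPU utilization arrives as a 0-1 fraction, values need scaling before they are shown next to the agent percentages. A minimal sketch of that step follows; `to_percent` is a hypothetical helper, and it assumes the agent `percent_used` metrics are already on a 0-100 scale.

```python
CPU_UTILIZATION = "compute.googleapis.com/instance/cpu/utilization"


def to_percent(metric_type, value):
    """Convert a raw sample to a 0-100 percentage for display."""
    if metric_type == CPU_UTILIZATION:
        return value * 100.0  # reported as a fraction between 0 and 1
    if metric_type.endswith("/percent_used"):
        return float(value)   # assumed to be 0-100 already
    raise ValueError(f"{metric_type} is not a percentage metric")


print(to_percent(CPU_UTILIZATION, 0.2345))
print(to_percent("agent.googleapis.com/memory/percent_used", 61.2))
```

Byte-count metrics are rejected on purpose, so a caller cannot accidentally format them as percentages.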

Query GCP Metrics

# From vmmonitor.py:127-137
from google.cloud import monitoring_v3

# Each request targets one metric type on one instance.
filter_query = f'metric.type = "{metric_type}" AND resource.labels.instance_id = "{instance_id}"'

req = {
    "name": project_name,  # "projects/{project_id}"
    "filter": filter_query,
    "interval": interval,  # query window with start_time and end_time
    "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    "aggregation": {
        "alignment_period": {"seconds": 60 * 5},  # 5 minutes
        "per_series_aligner": monitoring_v3.Aggregation.Aligner.ALIGN_MEAN
    }
}
Default Settings:
  • Alignment period: 5 minutes (300 seconds)
  • Aligner: ALIGN_MEAN (average)
  • Time range: Last 60 minutes (configurable)
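
The excerpt above does not show how `interval` is built. One way to do it, sketched here with plain dictionaries so it runs without the Cloud Monitoring client, is to derive both ends of the window from the current time. `build_interval` and `build_filter` are hypothetical helpers; the `start_time`/`end_time` keys follow the field names of the Cloud Monitoring `TimeInterval` message, and whether vmmonitor.py builds the window this way is an assumption.

```python
import time


def build_interval(timespan_minutes=60, now=None):
    """Return a TimeInterval-shaped dict covering the last `timespan_minutes`."""
    end_seconds = int(time.time() if now is None else now)
    return {
        "end_time": {"seconds": end_seconds},
        "start_time": {"seconds": end_seconds - timespan_minutes * 60},
    }


def build_filter(metric_type, instance_id):
    """Same filter expression as in the excerpt above."""
    return (
        f'metric.type = "{metric_type}" '
        f'AND resource.labels.instance_id = "{instance_id}"'
    )


interval = build_interval(timespan_minutes=60, now=1_705_316_400)
filter_query = build_filter(
    "compute.googleapis.com/instance/cpu/utilization", "1234567890"  # sample ID
)
print(interval)
print(filter_query)
```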

GCP Cloud Run Metrics

Available Metrics

# From containermonitor.py:79-95
# displayName holds Polish UI strings; an English gloss follows each entry.
available_metrics = [
    {
        "type": "run.googleapis.com/request_count",
        "displayName": "Liczba żądań",  # Request count
        "unit": "count"
    },
    {
        "type": "run.googleapis.com/request_latencies",
        "displayName": "Opóźnienia żądań",  # Request latencies
        "unit": "ms"
    },
    {
        "type": "run.googleapis.com/container/instance_count",
        "displayName": "Liczba instancji",  # Instance count
        "unit": "count"
    }
]
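
| Metric Type | Description | Aggregation |
| --- | --- | --- |
| run.googleapis.com/request_count | Total request count | ALIGN_SUM |
| run.googleapis.com/request_latencies | Request latencies (95th percentile) | ALIGN_PERCENTILE_95 |
| run.googleapis.com/container/instance_count | Active container instances | ALIGN_SUM (the type name contains "count", so the substring rule under Dynamic Aggregation applies) |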

Dynamic Aggregation

# From containermonitor.py:142-146
# Default to the mean, then override based on substrings of the metric type.
aggregation_alignment = monitoring_v3.Aggregation.Aligner.ALIGN_MEAN
if "count" in metric_type.lower():
    # Matches request_count and also container/instance_count.
    aggregation_alignment = monitoring_v3.Aggregation.Aligner.ALIGN_SUM
elif "latencies" in metric_type.lower():
    aggregation_alignment = monitoring_v3.Aggregation.Aligner.ALIGN_PERCENTILE_95
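
The substring rule can be exercised without the Cloud Monitoring client by returning aligner names as strings. The sketch below does that and shows one consequence worth knowing: `instance_count` contains "count", so the rule picks ALIGN_SUM for it. `pick_aligner_strict` is a hypothetical alternative that matches the last path segment exactly; it is not what containermonitor.py does.

```python
def pick_aligner(metric_type):
    """Mirror of the substring rule from the excerpt above."""
    name = metric_type.lower()
    if "count" in name:
        return "ALIGN_SUM"
    if "latencies" in name:
        return "ALIGN_PERCENTILE_95"
    return "ALIGN_MEAN"


def pick_aligner_strict(metric_type):
    """Hypothetical variant: decide from the exact last path segment."""
    last_segment = metric_type.lower().rsplit("/", 1)[-1]
    if last_segment == "request_count":
        return "ALIGN_SUM"
    if last_segment == "request_latencies":
        return "ALIGN_PERCENTILE_95"
    return "ALIGN_MEAN"


for metric_type in (
    "run.googleapis.com/request_count",
    "run.googleapis.com/request_latencies",
    "run.googleapis.com/container/instance_count",
):
    print(metric_type, pick_aligner(metric_type), pick_aligner_strict(metric_type))
```

Summing a gauge such as the instance count within an alignment period inflates the value when several samples fall in the period, which is why an exact match may be preferable.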

Data Collection Configuration

Azure Data Collection Rules

A data collection rule (DCR) defines which performance counters the Azure Monitor Agent collects and how often:
# From log_analytics.py:227-238
counters = [
    "\\Processor(_Total)\\% Processor Time",  # Windows format
    "\\Memory\\Available MBytes",
    "/builtin/memory/availablememorymbytes"   # Linux format
]

data_sources["performance_counters"] = [
    PerfCounterDataSource(
        name=perf_source_name,
        streams=["Microsoft-Perf"],
        sampling_frequency_in_seconds=60,
        counter_specifiers=counters
    )
]
Sampling Frequency:
  • Default: 60 seconds
  • Recommended range: 60-300 seconds
  • Minimum: 10 seconds (increases cost)
Setting sampling frequency below 60 seconds can significantly increase data ingestion costs.
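
To see why shorter sampling periods cost more, it helps to count samples. The sketch below assumes ingestion volume grows roughly in proportion to the number of samples, which is an approximation: actual billing depends on record size and pricing tier. `samples_per_day` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(sampling_frequency_in_seconds, counter_count=1):
    """Samples written per VM per day for a given sampling period."""
    return (SECONDS_PER_DAY // sampling_frequency_in_seconds) * counter_count


# Three counters, as in the rule above.
baseline = samples_per_day(60, counter_count=3)
for seconds in (300, 60, 10):
    count = samples_per_day(seconds, counter_count=3)
    print(f"{seconds:>3}s: {count:>6} samples/day ({count / baseline:.1f}x the 60s default)")
```

Moving from 60 seconds to the 10-second minimum multiplies the sample count by six, while relaxing to 300 seconds cuts it to a fifth.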

Performance Considerations

Azure Monitor

  • Metric resolution: Azure stores metrics at 1-minute granularity
  • Retention: Platform metrics retained for 93 days
  • API limits: 1,500 metric API calls per subscription per region per minute

GCP Cloud Monitoring

  • Metric resolution: 1-minute minimum for custom metrics
  • Retention: 6 weeks for most metrics, up to 24 months for some
  • API quotas: 10 queries per second per project
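
These limits put a floor on how quickly a full sweep of all resources can complete. As a back-of-the-envelope sketch using the figures quoted above (`min_sweep_seconds` is a hypothetical helper, and the resource counts are made up for illustration):

```python
def min_sweep_seconds(resource_count, calls_per_resource, quota_calls, quota_window_seconds):
    """Shortest time in which every resource can be queried once without exceeding the quota."""
    total_calls = resource_count * calls_per_resource
    return total_calls * quota_window_seconds / quota_calls


# Azure: one call returns all requested metrics for a VM; 1,500 calls per minute.
azure_floor = min_sweep_seconds(3000, 1, quota_calls=1500, quota_window_seconds=60)

# GCP: one call per metric type (5 agentless + 2 agent metrics); 10 queries per second.
gcp_floor = min_sweep_seconds(200, 7, quota_calls=10, quota_window_seconds=1)

print(f"Azure: at least {azure_floor:.0f}s per sweep")
print(f"GCP:   at least {gcp_floor:.0f}s per sweep")
```

If the floor exceeds the polling interval you want, reduce the number of metrics per resource or poll less often.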

Example: Query Multi-Cloud Metrics

Azure VM CPU Usage

GET /api/azure/vm/{vm_name}/monitor/metrics
Returns platform metrics including CPU, memory, and credits.

GCP VM with Custom Timespan

POST /api/gcp/vm/{project_id}/{instance_id}/metrics
Content-Type: application/json

{
  "metricType": "compute.googleapis.com/instance/cpu/utilization",
  "timespanMinutes": 120
}

Cloud Run Request Count

POST /api/gcp/container/{project_id}/{region}/{service_name}/metrics
Content-Type: application/json

{
  "metricType": "run.googleapis.com/request_count",
  "timespanMinutes": 60
}
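
A client can assemble these POST requests with the standard library alone. In the sketch below, `BASE_URL`, the project, region, and service names are placeholders, and `build_metrics_request` is a hypothetical helper; only the path pattern and body fields come from the examples above.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: replace with your deployment's address


def build_metrics_request(path, metric_type, timespan_minutes):
    """Build (but do not send) a POST request for a metrics endpoint."""
    body = json.dumps(
        {"metricType": metric_type, "timespanMinutes": timespan_minutes}
    ).encode("utf-8")
    return Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_metrics_request(
    "/api/gcp/container/my-project/europe-west1/my-service/metrics",  # placeholder names
    "run.googleapis.com/request_count",
    60,
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` sends it; authentication headers, if your deployment requires them, would be added to `headers`.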

Best Practices

  • Use 5-minute intervals for most monitoring scenarios
  • Use 1-minute intervals only for critical, real-time monitoring
  • Aggregate to larger intervals for long-term trending
  • Deploy Azure Monitor Agent for advanced metrics
  • Install GCP Ops Agent for memory and disk metrics
  • Keep agents updated for latest features and fixes
  • Always specify time ranges to reduce data scanned
  • Use resource-specific filters (instance_id, resource_id)
  • Request only the metrics you need
  • Check agent.googleapis.com/agent/uptime metric (GCP)
  • Verify DCR associations in Azure
  • Set up alerts for agent failures
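
For the advice on aggregating to larger intervals, the following sketch averages 5-minute points into hourly buckets on the client side. It assumes points shaped like the `data` entries in the Azure example response; `downsample_hourly` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime


def downsample_hourly(points):
    """Average points that fall within the same clock hour."""
    buckets = defaultdict(list)
    for point in points:
        # fromisoformat does not accept a trailing "Z" on older Python versions.
        ts = datetime.fromisoformat(point["timestamp"].replace("Z", "+00:00"))
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(point["average"])
    return [
        {"timestamp": hour.isoformat(), "average": sum(values) / len(values)}
        for hour, values in sorted(buckets.items())
    ]


points = [
    {"timestamp": "2024-01-15T10:00:00Z", "average": 20.0},
    {"timestamp": "2024-01-15T10:05:00Z", "average": 30.0},
    {"timestamp": "2024-01-15T11:00:00Z", "average": 40.0},
]
hourly = downsample_hourly(points)
print(hourly)
```

Averaging suits utilization metrics; counters such as request counts would be summed per bucket instead.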

Next Steps

  • Log Management: Query and analyze logs from VMs and containers
  • Monitoring Agents: Install AMA and Ops Agent for advanced metrics
