
Overview

Azure Log Analytics workspaces provide centralized storage and analysis for logs and metrics from Azure resources. They are the foundation for Azure Monitor and enable powerful KQL queries across your infrastructure.

Log Analytics Workspace

Workspace Properties

A Log Analytics workspace stores:
  • Performance metrics from VMs and containers
  • System logs (Windows Event Logs, Syslog)
  • Application logs and custom data
  • Security and audit logs
  • Diagnostic logs from Azure services

List Workspaces

Retrieve all workspaces in a subscription:
# From log_analytics.py:32-59
GET /api/azure/log-analytics?subscriptionId={subscription_id}
Response:
{
  "value": [
    {
      "name": "production-logs",
      "id": "/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/production-logs",
      "location": "eastus",
      "workspaceGuid": "12345678-1234-1234-1234-123456789abc",
      "sku": "PerGB2018",
      "retentionInDays": 30,
      "resourceGroup": "monitoring-rg"
    }
  ]
}
Implementation:
# From log_analytics.py:40-57
credential = FlaskCredential()
client = LogAnalyticsManagementClient(credential, subscription_id)

workspaces = client.workspaces.list()
result = []

for ws in workspaces:
    result.append({
        "name": ws.name,
        "id": ws.id,
        "location": ws.location,
        "workspaceGuid": ws.customer_id,
        "sku": ws.sku.name if ws.sku else None,
        "retentionInDays": ws.retention_in_days,
        "resourceGroup": ws.id.split('/')[4]
    })
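Indexing `ws.id.split('/')[4]` works for well-formed IDs, but a helper that searches for the `resourceGroups` segment is more robust. A minimal sketch; the helper name is ours, not from log_analytics.py:

```python
def resource_group_from_id(resource_id):
    """Extract the resource group name from an Azure resource ID.

    IDs follow: /subscriptions/{sub}/resourceGroups/{rg}/providers/...
    Returns None if no resourceGroups segment is present.
    """
    parts = resource_id.split("/")
    for i, part in enumerate(parts):
        # Segment casing can vary, so compare case-insensitively
        if part.lower() == "resourcegroups" and i + 1 < len(parts):
            return parts[i + 1]
    return None

rg = resource_group_from_id(
    "/subscriptions/abc/resourceGroups/monitoring-rg/providers/"
    "Microsoft.OperationalInsights/workspaces/production-logs"
)
# rg == "monitoring-rg"
```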

Create Workspace

Create a new Log Analytics workspace:
POST /api/azure/log-analytics/create
Content-Type: application/json

{
  "subscriptionId": "12345678-1234-1234-1234-123456789abc",
  "rgName": "monitoring-rg",
  "workspaceName": "production-logs",
  "location": "eastus",
  "sku": "PerGB2018",
  "retentionInDays": 30
}
Parameters:
Parameter        Required  Default     Description
subscriptionId   Yes       -           Azure subscription ID
rgName           Yes       -           Resource group name
workspaceName    Yes       -           Workspace name (unique within the resource group)
location         No        westeurope  Azure region
sku              No        PerGB2018   Pricing tier
retentionInDays  No        30          Data retention period in days
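The defaults from the table can be applied with a small validation step before calling the SDK. A hedged sketch; the function name and error format are ours, not from log_analytics.py:

```python
def validate_create_payload(payload):
    """Check required fields and fill in the documented defaults."""
    required = ("subscriptionId", "rgName", "workspaceName")
    missing = [k for k in required if not payload.get(k)]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    return {
        **{k: payload[k] for k in required},
        "location": payload.get("location", "westeurope"),
        "sku": payload.get("sku", "PerGB2018"),
        "retentionInDays": payload.get("retentionInDays", 30),
    }

params = validate_create_payload({
    "subscriptionId": "12345678-1234-1234-1234-123456789abc",
    "rgName": "monitoring-rg",
    "workspaceName": "production-logs",
})
# params["sku"] == "PerGB2018", params["location"] == "westeurope"
```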

SKU Options

SKU         Description                        Use Case
Free        500 MB/day limit, 7-day retention  Testing and evaluation
PerGB2018   Pay per GB ingested                Production workloads
PerNode     Per-node pricing                   Large deployments with System Center
Standalone  Legacy SKU                         Not recommended for new workspaces
Use PerGB2018 for most production workloads. It offers flexible pricing and long retention.

Implementation

# From log_analytics.py:67-114
credential = FlaskCredential()
client = LogAnalyticsManagementClient(credential, subscription_id)

# Check if workspace already exists
try:
    existing = client.workspaces.get(rg_name, workspace_name)
    return jsonify({
        "message": "Workspace already exists",
        "workspace": {
            "name": existing.name,
            "id": existing.id,
            "location": existing.location
        }
    }), 200
except ResourceNotFoundError:
    pass

# Create new workspace
ws_params = {
    "location": location,
    "sku": {"name": sku},
    "retention_in_days": retention
}

poller = client.workspaces.begin_create_or_update(rg_name, workspace_name, ws_params)
ws = poller.result()
Response:
{
  "message": "Workspace created",
  "workspace": {
    "name": "production-logs",
    "id": "/subscriptions/{sub-id}/resourceGroups/monitoring-rg/providers/Microsoft.OperationalInsights/workspaces/production-logs",
    "location": "eastus"
  }
}

Data Collection Rules (DCR)

Data Collection Rules define what data to collect from VMs and where to send it.

List DCRs for a VM

GET /api/azure/vm/{vm_name}/dcr?workspaceId={workspace_id}
Optional filtering:
  • Include workspaceId to show only DCRs sending to that workspace
  • Omit workspaceId to show all DCRs associated with the VM
Implementation:
# From log_analytics.py:122-176
monitor_client = MonitorManagementClient(credential, subscription_id)

# Get all associations for this VM
associations = monitor_client.data_collection_rule_associations.list_by_resource(
    resource_uri=vm_resource_id
)

for assoc in associations:
    dcr_id = assoc.data_collection_rule_id

    # Get DCR details to check destination workspace
    if target_workspace_id:
        # Derive the DCR's resource group and name from its resource ID
        dcr_parts = dcr_id.split('/')
        dcr_rg, dcr_name = dcr_parts[4], dcr_parts[-1]
        dcr_details = monitor_client.data_collection_rules.get(dcr_rg, dcr_name)
        dcr_destination_ws_id = dcr_details.destinations.log_analytics[0].workspace_resource_id

        if dcr_destination_ws_id != target_workspace_id:
            continue  # Skip DCRs for other workspaces

Create DCR and Associate VM

Create a Data Collection Rule and associate it with a VM:
POST /api/azure/log-analytics/dcr/create
Content-Type: application/json

{
  "subscriptionId": "12345678-1234-1234-1234-123456789abc",
  "resourceGroup": "monitoring-rg",
  "dcrName": "vm-monitoring-dcr",
  "location": "eastus",
  "workspaceId": "/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/production-logs",
  "vmResourceId": "/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Compute/virtualMachines/vm-01",
  "collectPerformance": true,
  "collectSystemLogs": true
}
Parameters:
Parameter           Required  Type     Description
subscriptionId      Yes       string   Azure subscription ID
resourceGroup       Yes       string   Resource group for the DCR
dcrName             Yes       string   DCR name
location            Yes       string   Azure region (must match the VM location)
workspaceId         Yes       string   Full workspace resource ID
vmResourceId        Yes       string   Full VM resource ID
collectPerformance  No        boolean  Collect performance counters (default: true)
collectSystemLogs   No        boolean  Collect system logs (default: true)

DCR Configuration

Performance Counters

# From log_analytics.py:224-238
if collect_performance:
    counters = [
        # Windows-style specifiers
        "\\Processor(_Total)\\% Processor Time",
        "\\Memory\\Available MBytes",
        # Linux-style specifier
        "/builtin/memory/availablememorymbytes"
    ]
    
    data_sources["performance_counters"] = [
        PerfCounterDataSource(
            name="perfCounters",
            streams=["Microsoft-Perf"],
            sampling_frequency_in_seconds=60,
            counter_specifiers=counters
        )
    ]
Counter formats:
  • Windows: \Category\Counter (written as "\\Category\\Counter" in Python string literals)
  • Linux: /category/counter
Common counters:
  • CPU: \Processor(_Total)\% Processor Time
  • Memory: \Memory\Available MBytes
  • Disk: \PhysicalDisk(_Total)\Disk Reads/sec
  • Network: \Network Interface(*)\Bytes Total/sec

Windows Event Logs

# From log_analytics.py:241-250
if os_type == "Windows" and collect_system_logs:
    data_sources["windows_event_logs"] = [
        WindowsEventLogDataSource(
            name="windowsEventLogs",
            streams=["Microsoft-WindowsEvent"],
            x_path_queries=[
                "System!*[System[(Level=1 or Level=2 or Level=3)]]",
                "Application!*[System[(Level=1 or Level=2 or Level=3)]]"
            ]
        )
    ]
Event levels:
  • Level 1: Critical
  • Level 2: Error
  • Level 3: Warning
  • Level 4: Information
  • Level 5: Verbose
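The XPath queries shown above follow a fixed pattern, so they can be generated from a severity threshold instead of written by hand. A hedged sketch; the helper is ours, not part of the documented API:

```python
def windows_event_xpath(channels, max_level=3):
    """Build DCR XPath queries selecting events with severity
    Level <= max_level (1=Critical ... 5=Verbose)."""
    clause = " or ".join(f"Level={lvl}" for lvl in range(1, max_level + 1))
    return [f"{channel}!*[System[({clause})]]" for channel in channels]

queries = windows_event_xpath(["System", "Application"])
# ["System!*[System[(Level=1 or Level=2 or Level=3)]]",
#  "Application!*[System[(Level=1 or Level=2 or Level=3)]]"]
```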

Linux Syslog

# From log_analytics.py:251-260
if os_type == "Linux" and collect_system_logs:
    data_sources["syslog"] = [
        SyslogDataSource(
            name="linuxSyslog",
            streams=["Microsoft-Syslog"],
            facility_names=["*"],
            log_levels=["*"]
        )
    ]
Syslog facilities:
  • auth: Authentication/authorization
  • cron: Scheduled tasks
  • daemon: System daemons
  • kern: Kernel messages
  • syslog: Syslog internal messages
  • user: User-level messages
Syslog levels:
  • Emergency, Alert, Critical, Error, Warning, Notice, Info, Debug
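Collecting every facility at every level (facility_names=["*"], log_levels=["*"]) is the noisiest option. A hedged sketch of narrowing the configuration to severe events only; the helper and its defaults are ours, not from log_analytics.py:

```python
SYSLOG_LEVELS = ["Emergency", "Alert", "Critical", "Error",
                 "Warning", "Notice", "Info", "Debug"]

def syslog_source_config(facilities, min_level="Warning"):
    """Build a syslog data-source config keeping only levels at or
    above min_level (a lower list index means more severe)."""
    keep = SYSLOG_LEVELS[: SYSLOG_LEVELS.index(min_level) + 1]
    return {
        "name": "linuxSyslog",
        "streams": ["Microsoft-Syslog"],
        "facility_names": list(facilities),
        "log_levels": keep,
    }

cfg = syslog_source_config(["auth", "daemon", "kern"])
# cfg["log_levels"] == ["Emergency", "Alert", "Critical", "Error", "Warning"]
```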

Data Flow Configuration

# From log_analytics.py:262-266
dcr_config = DataCollectionRuleResource(
    location=location,
    data_sources=data_sources,
    destinations={"log_analytics": [LogAnalyticsDestination(name="laDestination", workspace_resource_id=workspace_id)]},
    data_flows=data_flows
)
Data flow example:
data_flows.append(
    DataFlow(
        streams=["Microsoft-Perf"],
        destinations=["laDestination"]
    )
)

VM Association

After creating the DCR, associate it with the VM:
# From log_analytics.py:274-283
association_name = f"{vm_name}-{dcr_name}-assoc"
association_resource_id = f"{vm_resource_id}/providers/Microsoft.Insights/dataCollectionRuleAssociations/{association_name}"

association_payload = {
    "properties": {
        "dataCollectionRuleId": dcr.id
    }
}

poller_assoc = resource_client.resources.begin_create_or_update_by_id(
    resource_id=association_resource_id,
    api_version="2021-09-01-preview",
    parameters=association_payload
)
Response:
{
  "message": "DCR 'vm-monitoring-dcr' created and successfully associated with the VM.",
  "dcrId": "/subscriptions/{sub-id}/resourceGroups/monitoring-rg/providers/Microsoft.Insights/dataCollectionRules/vm-monitoring-dcr",
  "associationId": "/subscriptions/{sub-id}/resourceGroups/vm-rg/providers/Microsoft.Compute/virtualMachines/vm-01/providers/Microsoft.Insights/dataCollectionRuleAssociations/vm-01-vm-monitoring-dcr-assoc"
}

Data Ingestion

How It Works

1. Agent Installation: Azure Monitor Agent (AMA) is installed on the VM as an extension.
2. DCR Association: The VM is associated with one or more Data Collection Rules.
3. Data Collection: AMA collects metrics and logs according to the DCR specifications.
4. Data Transmission: Collected data is sent to the Log Analytics workspace.
5. Data Processing: The workspace processes and indexes the data for querying.

Ingestion Latency

  • Platform metrics: Near real-time (1-3 minutes)
  • Agent metrics: 3-5 minutes
  • Logs: 5-10 minutes average
  • Custom logs: 5-15 minutes
First-time data collection may take 10-15 minutes to appear in the workspace.

Workspace Queries

Query workspace data using the Logs Query API:
# From log_analytics.py:343-347
response = client.query_workspace(
    workspace_id=workspace_guid,  # Customer ID (GUID), not resource ID
    query=query,
    timespan=timedelta(hours=timespan_hours)
)
Important: Use the workspace GUID (customer_id), not the full resource ID.
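Passing the full resource ID where the GUID is expected is a common mistake that surfaces as a confusing authorization or not-found error. A small defensive check before calling query_workspace; hedged sketch, the helper is ours:

```python
import re

GUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)

def looks_like_workspace_guid(value):
    """True for a bare GUID such as the workspace customer_id;
    False for full resource IDs and other strings."""
    return bool(GUID_RE.match(value))

ok = looks_like_workspace_guid("12345678-1234-1234-1234-123456789abc")   # True
bad = looks_like_workspace_guid("/subscriptions/x/resourceGroups/rg/w")  # False
```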

Process Query Results

# From log_analytics.py:349-361
if response.status == LogsQueryStatus.SUCCESS and response.tables:
    table = response.tables[0]
    
    # Get column names
    header = table.columns
    
    # Process rows
    result_list = []
    for row in table.rows:
        row_data = [str(item) if isinstance(item, datetime) else item for item in row]
        result_list.append(dict(zip(header, row_data)))
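The row processing above can be factored into a standalone, JSON-safe helper. A hedged sketch; the function name is ours, and it assumes table.columns is a list of column-name strings, as in recent azure-monitor-query releases:

```python
from datetime import datetime, timezone

def rows_to_dicts(columns, rows):
    """Zip column names with row values, converting datetimes to
    ISO-8601 strings so the result survives json.dumps."""
    return [
        {col: (val.isoformat() if isinstance(val, datetime) else val)
         for col, val in zip(columns, row)}
        for row in rows
    ]

recs = rows_to_dicts(
    ["TimeGenerated", "Computer", "CounterValue"],
    [(datetime(2024, 5, 1, tzinfo=timezone.utc), "vm-01", 42.5)],
)
# recs[0]["Computer"] == "vm-01"
```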

Cost Management

Pricing Components

  1. Data ingestion: Per GB ingested
  2. Data retention: Free for first 31 days, then per GB/month
  3. Data export: Per GB exported (Continuous Export feature)
  4. Queries: Generally free, some premium features charged

Cost Optimization

  • Collect only necessary performance counters
  • Filter event logs by severity (Error/Warning only)
  • Reduce the sampling frequency (for example, increase the interval from 60s to 300s)
  • Use separate workspaces for dev/test with shorter retention
  • Use 30 days for operational data
  • Use 90+ days only for compliance requirements
  • Archive old data to Azure Storage for long-term retention
  • Delete or filter noisy, low-value logs
Use the Usage table to track daily ingestion:
Usage
| where TimeGenerated > ago(7d)
| where IsBillable == true
| summarize TotalGB = sum(Quantity) / 1000 by bin(StartTime, 1d), DataType
| order by StartTime desc
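Once daily ingestion in GB is known from the Usage query, projecting a monthly cost is simple arithmetic. A hedged sketch; the per-GB price below is a placeholder assumption, not an official rate (check current Azure Monitor pricing for your region):

```python
def estimated_monthly_cost(daily_gb, price_per_gb=2.30):
    """Project a 30-day ingestion cost from observed daily volumes.

    price_per_gb is a placeholder, not an official Azure rate.
    """
    if not daily_gb:
        return 0.0
    avg = sum(daily_gb) / len(daily_gb)
    return round(avg * 30 * price_per_gb, 2)

estimated_monthly_cost([4.2, 3.9, 4.5], price_per_gb=2.30)
# → 289.8
```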

Permissions

Required Roles

Operation              Required Role
Create workspace       Contributor on the resource group
Create DCR             Monitoring Contributor
Associate DCR with VM  VM Contributor + Monitoring Contributor
Query logs             Log Analytics Reader
Export logs            Log Analytics Reader

Service Principal Setup

The application uses service principal authentication:
# From log_analytics.py:28-30
CLIENT_ID = os.getenv("AZURE_CLIENT_ID")
CLIENT_SECRET = os.getenv("AZURE_CLIENT_SECRET")
TENANT_ID = os.getenv("AZURE_TENANT_ID")
Required permissions:
  • Log Analytics Contributor role on the workspace
  • Monitoring Contributor role on VMs and resource groups
  • Microsoft.Insights/dataCollectionRules/* permissions
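A fail-fast check that the three environment variables are actually set avoids confusing downstream authentication errors. A hedged sketch; the helper name is ours, not from log_analytics.py:

```python
import os

REQUIRED_ENV = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")

def missing_azure_env(env=None):
    """Return the names of required service-principal variables
    that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]

missing = missing_azure_env({"AZURE_CLIENT_ID": "app-id"})
# missing == ["AZURE_CLIENT_SECRET", "AZURE_TENANT_ID"]
```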

Troubleshooting

No Data Appearing in the Workspace

Possible causes:
  • Azure Monitor Agent not installed
  • No DCR associated with VM
  • DCR not configured correctly
  • Workspace in different region than DCR
  • Network connectivity issues
Check:
# Verify AMA installation
GET /api/azure/vm/{vm_name}/agent/status

# List DCR associations
GET /api/azure/vm/{vm_name}/dcr
DCR Creation or Association Fails

Common errors:
  • VM and DCR in different regions (must match)
  • Invalid workspace resource ID
  • Missing permissions
  • VM not running
Solution: Ensure the DCR location matches the VM location and the workspace is accessible.

Queries Return No Results

  • Wait 10-15 minutes after the first DCR association
  • Verify the workspace GUID is correct
  • Check the time range in the query
  • Ensure the VM hostname matches the Computer field
  • Validate KQL syntax

Best Practices

Workspace Design

  • One workspace per environment (prod/dev/test)
  • Separate workspace for security logs
  • Use resource groups to organize workspaces
  • Name workspaces descriptively

DCR Strategy

  • Create DCR templates for VM roles
  • Use one DCR per VM type (web/database/worker)
  • Keep DCRs in the same region as VMs
  • Document counter selections

Data Collection

  • Start with minimal data, expand as needed
  • Collect errors/warnings, avoid verbose logs
  • Use 60-second sampling for performance counters
  • Review and adjust quarterly

Security

  • Use managed identities when possible
  • Rotate service principal secrets regularly
  • Apply least-privilege access
  • Enable workspace diagnostics

Next Steps

  • Monitoring Agents: Install Azure Monitor Agent on VMs
  • Log Management: Query logs with KQL examples
