Monitor your Kubernetes cluster with a complete observability stack featuring Prometheus for metrics, Grafana for visualization, and Loki for log aggregation.

Prometheus and Grafana

Prerequisites

  • Running Kubernetes cluster
  • Helm installed
  • kubectl configured

Installing Lens (Optional)

Lens provides a desktop GUI for Kubernetes cluster management:
wget https://api.k8slens.dev/binaries/Lens-5.3.3-latest.20211223.1.amd64.deb
sudo dpkg -i Lens-5.3.3-latest.20211223.1.amd64.deb
Copy your kubeconfig file from the control-plane node to ~/.kube/config on your workstation, then launch Lens with the lens command.

Install Prometheus Stack with Helm

1. Install kubectl (if needed)

sudo snap install kubectl --classic
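
2. Install Helm

# apt-key is deprecated; store the signing key in a dedicated keyring instead
curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm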

3. Add Prometheus Repository

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

4. Create Namespace

kubectl create ns monitoring

5. Install kube-prometheus-stack

helm install prometheus --namespace monitoring prometheus-community/kube-prometheus-stack

6. Verify Installation

kubectl get pods -n monitoring
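
The pods can take a few minutes to start. As a rough sketch, kubectl wait can block until they are ready instead of polling by hand. The helper name wait_for_pods and the 300-second default timeout are assumptions, not part of the chart.

```shell
# Hypothetical helper: block until every pod in a namespace reports Ready.
# Usage: wait_for_pods <namespace> [timeout]
wait_for_pods() {
  ns="$1"
  timeout="${2:-300s}"   # assumed default; raise it for slow image pulls
  kubectl wait --for=condition=Ready pods --all -n "$ns" --timeout="$timeout"
}
```

Call it as wait_for_pods monitoring. Pods that belong to completed Jobs never become Ready, so the command may time out if any of those remain in the namespace.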

Accessing Grafana

1. Check Grafana Service

kubectl get svc -n monitoring

2. Port Forward to Grafana

kubectl port-forward -n monitoring service/prometheus-grafana 3000:80

3. Access Dashboard

Open your browser and navigate to:
http://localhost:3000
Default credentials:
  • Username: admin
  • Password: prom-operator
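
If the chart was installed with a custom grafana.adminPassword, the default above will not work. A minimal sketch for reading the live credentials from the cluster, assuming the release is named prometheus (so the secret is prometheus-grafana) and that it uses the same admin-user and admin-password keys as the Loki Grafana secret later in this guide. The helper name grafana_secret is hypothetical.

```shell
# Hypothetical helper: print one key of the Grafana admin secret, decoded.
# Assumes the Helm release is named "prometheus" in the "monitoring" namespace.
grafana_secret() {
  key="$1"   # "admin-user" or "admin-password"
  kubectl get secret prometheus-grafana -n monitoring \
    -o jsonpath="{.data.${key}}" | base64 --decode
}
```

For example, grafana_secret admin-password prints the current admin password.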
The kube-prometheus-stack includes pre-configured dashboards for:
  • Cluster overview
  • Node metrics
  • Pod metrics
  • Persistent volumes
  • Kubernetes API server

Loki for Log Management

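Loki is a log aggregation system built to pair with Grafana. It indexes only a small set of labels per log stream rather than the full log text, which keeps storage and operating costs low.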

Installation

1. Add Grafana Repository

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm search repo loki

2. Generate Custom Values

helm show values grafana/loki-stack > loki-values.yaml

3. Configure Loki Values

Edit loki-values.yaml with the following key configurations:
loki-values.yaml
loki:
  enabled: true
  isDefault: true
  url: http://{{(include "loki.serviceName" .)}}:{{ .Values.loki.service.port }}
  readinessProbe:
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 45
  livenessProbe:
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 45

promtail:
  enabled: true
  config:
    logLevel: info
    serverPort: 3101
    clients:
      - url: http://{{ .Release.Name }}:3100/loki/api/v1/push

grafana:
  enabled: true
  sidecar:
    datasources:
      enabled: true
      maxLines: 1000
  image:
    tag: 10.3.3
  service:
    type: NodePort

prometheus:
  enabled: false

fluent-bit:
  enabled: false

filebeat:
  enabled: false

logstash:
  enabled: false

4. Deploy Loki Stack

helm upgrade --install --values loki-values.yaml loki grafana/loki-stack -n grafana-loki --create-namespace

5. Verify Installation

kubectl get pods -n grafana-loki

Deploy Log Generator

Create a test application to generate logs:
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-generator
  namespace: default
  labels:
    app: log-generator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: log-generator
  template:
    metadata:
      labels:
        app: log-generator
    spec:
      containers:
        - name: log-generator
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["/bin/sh", "-c"]
          args:
            - >
              while true; do
                ts=$(date -u +"%Y-%m-%dT%H:%M:%SZ");
                echo "{\"timestamp\":\"${ts}\",\"level\":\"info\",\"message\":\"Hello from log-generator! Testing Loki JSON logs.\"}";
                sleep 5;
              done
          resources:
            limits:
              cpu: "100m"
              memory: "64Mi"
            requests:
              cpu: "50m"
              memory: "32Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: log-generator
  namespace: default
  labels:
    app: log-generator
spec:
  selector:
    app: log-generator
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
EOF
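
Before looking for the logs in Loki, it is worth confirming that the generator is actually writing them. A small sketch; the helper name generator_logs and the default of five lines are assumptions.

```shell
# Hypothetical helper: print the most recent lines from the log-generator pod.
# Usage: generator_logs [line-count]
generator_logs() {
  kubectl logs deployment/log-generator -n default --tail="${1:-5}"
}
```

Each line should be a JSON object with timestamp, level, and message fields, emitted roughly every five seconds.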

Accessing Loki Grafana

1. Get NodePort

kubectl get svc loki-grafana -n grafana-loki -o jsonpath="{.spec.ports[0].nodePort}"

2. Get Credentials

# Username
kubectl get secret loki-grafana -n grafana-loki -o jsonpath="{.data.admin-user}" | base64 --decode

# Password
kubectl get secret loki-grafana -n grafana-loki -o jsonpath="{.data.admin-password}" | base64 --decode

3. Access Dashboard

http://<NODE-IP>:<NODE-PORT>

LogQL Query Examples

Loki uses LogQL for querying logs:
{namespace="default"}
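
The selector above returns every log line in the namespace. The sketch below lists a few narrower queries and a way to run them outside Grafana through Loki's HTTP API. It assumes Loki has been port-forwarded to localhost:3100 and that Promtail attaches an app label taken from the pod's labels, which is its usual default but worth checking in Grafana's label browser. The helper name loki_query is hypothetical.

```shell
# Assumed port-forward, run in another terminal:
#   kubectl port-forward -n grafana-loki service/loki 3100:3100

# Hypothetical helper: run a LogQL query against Loki's query_range endpoint.
# Usage: loki_query '<logql>' [limit]
loki_query() {
  curl -sG "http://localhost:3100/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-20}"
}

# Example queries (each needs the port-forward above):
#   loki_query '{namespace="default", app="log-generator"}'          # one app only
#   loki_query '{namespace="default"} |= "Hello"'                    # lines containing "Hello"
#   loki_query '{app="log-generator"} | json | level="info"'         # parse JSON, filter on a field
#   loki_query 'sum(count_over_time({namespace="default"}[5m]))'     # log volume over 5 minutes
```

The same query strings can be pasted into Grafana's Explore view with the Loki data source selected.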

Grafana Dashboards

Pre-installed Dashboards

The kube-prometheus-stack includes:
  • Kubernetes / Compute Resources / Cluster: Overall cluster metrics
  • Kubernetes / Compute Resources / Namespace (Pods): Per-namespace pod metrics
  • Kubernetes / Compute Resources / Node (Pods): Per-node metrics
  • Node Exporter / Nodes: Detailed node statistics

Creating Custom Dashboards

1. Navigate to Dashboards

In Grafana, click Dashboards > New Dashboard

2. Add Panel

Click Add new panel

3. Write PromQL Query

Example queries:
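# CPU usage by pod (container!="" skips the pod-level series so usage is not double-counted)
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)

# Memory usage by namespace
sum(container_memory_usage_bytes{container!=""}) by (namespace)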

# Pod restart count
kube_pod_container_status_restarts_total

4. Configure Visualization

Select a visualization type (Time series, Gauge, Table, etc.) and customize its options.
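
Before building a panel, it can help to check what a query returns. A minimal sketch using the Prometheus HTTP API, assuming Prometheus has been port-forwarded to localhost:9090. The service name in the comment is what the chart typically generates for a release named prometheus; confirm it with kubectl get svc -n monitoring. The helper name prom_query is hypothetical.

```shell
# Assumed port-forward, run in another terminal:
#   kubectl port-forward -n monitoring service/prometheus-kube-prometheus-prometheus 9090:9090

# Hypothetical helper: evaluate an instant PromQL query and print the JSON response.
# Usage: prom_query '<promql>'
prom_query() {
  curl -sG "http://localhost:9090/api/v1/query" --data-urlencode "query=$1"
}

# Example (needs the port-forward above):
#   prom_query 'kube_pod_container_status_restarts_total'
```

An empty result list usually means the metric name or a label matcher is wrong, which is easier to spot here than in an empty panel.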

Alerting with Prometheus

Create PrometheusRule

high-cpu-alert.yaml
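apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: high-cpu-usage
  namespace: monitoring
  labels:
    # With the chart's default rule selector, only rules labeled with the Helm release name are loaded
    release: prometheus
spec:
  groups:
  - name: cpu_alerts
    interval: 30s
    rules:
    - alert: HighCPUUsage
      # The threshold is in CPU cores (0.8 of one core), not a percentage of the pod's limit
      expr: sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod) > 0.8
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage detected"
        description: "Pod {{ $labels.pod }} has used more than 0.8 CPU cores for 5 minutes"

kubectl apply -f high-cpu-alert.yaml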
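
With the chart's default values, Prometheus only loads PrometheusRule objects whose release label matches the Helm release name (prometheus in this guide). A rule without that label is accepted by the API server but never evaluated. The sketch below checks the labels and adds the label if it is missing. Both helper names are hypothetical, and the label value assumes the release name used earlier.

```shell
# Hypothetical helper: show a PrometheusRule together with its labels.
check_rule() {
  kubectl get prometheusrule "$1" -n monitoring --show-labels
}

# Hypothetical helper: add the label the default rule selector expects.
# Assumes the Helm release is named "prometheus".
label_rule() {
  kubectl label prometheusrule "$1" -n monitoring release=prometheus --overwrite
}
```

For example, run check_rule high-cpu-usage, and if release=prometheus is absent, run label_rule high-cpu-usage. The alert should then appear under Alerts in the Prometheus UI after the next rule reload.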

When to Use Loki

Choose Loki when:
  • You want a simple, scalable, and cost-effective logging solution
  • You’re operating in a cloud-native environment
  • You’re already using Grafana and Prometheus
  • Your log analysis needs are straightforward
  • You need to correlate logs with metrics in a single interface

Best Practices

  • Set appropriate retention policies for metrics and logs
  • Use persistent volumes for Prometheus and Loki data
  • Configure resource limits for monitoring components
  • Create alerts for critical metrics
  • Regularly review and clean up unused dashboards
  • Use label selectors efficiently to optimize queries
  • Enable authentication and authorization for Grafana
  • Export important dashboards as code for version control
Monitoring systems can consume significant resources. Always set resource limits and monitor the monitoring stack itself.
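
Several of these practices map onto chart values. The sketch below writes a values override for kube-prometheus-stack that sets retention, a persistent volume claim, and resource requests and limits for Prometheus, plus persistence for Grafana. The file name monitoring-values.yaml and every figure in it (15d, 20Gi, CPU and memory sizes) are placeholders to adapt, not recommendations.

```shell
# Sketch: write a values override for kube-prometheus-stack.
# All sizes and durations below are placeholder assumptions.
cat > monitoring-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    retention: 15d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        memory: 2Gi
grafana:
  persistence:
    enabled: true
    size: 5Gi
EOF

# Apply it to the existing release (needs a live cluster):
#   helm upgrade prometheus prometheus-community/kube-prometheus-stack -n monitoring -f monitoring-values.yaml
```

Keeping this file in version control alongside exported dashboards makes the monitoring setup reproducible.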
