Monitoring & Observability - SGIVU Config Repository

Overview

SGIVU microservices use Spring Boot Actuator for health checks and metrics, combined with Zipkin for distributed tracing. This guide shows how to monitor the Config Server and microservices.

Spring Boot Actuator Configuration

All microservices expose actuator endpoints for health monitoring and operational insights.

Production Configuration

In production profiles, services expose minimal endpoints for security:

management:
  endpoints:
    web:
      exposure:
        include: health, info
  endpoint:
    health:
      show-details: never

The show-details: never setting prevents sensitive information from being exposed in health endpoint responses.

Development Configuration

Development environments expose all actuator endpoints for debugging:

management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    health:
      show-details: always

Never use include: "*" in production. It exposes sensitive endpoints like /actuator/env, /actuator/beans, and /actuator/mappings.
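One way to confirm that a running service really follows the production profile is to probe the sensitive endpoints and check that none of them answers with HTTP 200. The following is a minimal sketch rather than part of the SGIVU repository; the helper names `http_status` and `exposed_endpoints` are hypothetical.

```python
import urllib.error
import urllib.request

# Endpoints the warning above names as sensitive.
SENSITIVE_ENDPOINTS = ("env", "beans", "mappings")


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def exposed_endpoints(base_url, get_status=http_status):
    """List the sensitive actuator endpoints that answer with HTTP 200."""
    exposed = []
    for name in SENSITIVE_ENDPOINTS:
        if get_status(f"{base_url}/actuator/{name}") == 200:
            exposed.append(name)
    return exposed
```

Calling `exposed_endpoints("http://localhost:8080")` against a production-profile service should return an empty list. Only a 200 counts as exposed here, since an unexposed endpoint may answer 404 or, behind a security filter, 401.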

Health Check Endpoints

Accessing Health Status

Each microservice exposes a health check endpoint:

Config Server

curl http://localhost:8888/actuator/health

Discovery Server

curl http://localhost:8761/actuator/health

API Gateway

curl http://localhost:8080/actuator/health

Auth Service

curl http://localhost:9000/actuator/health
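The four curl commands above can be folded into one small poller. This is a sketch under assumptions: the dictionary keys are labels chosen for illustration, and `UNREACHABLE` is a script-local marker, not an Actuator status.

```python
import json
import urllib.error
import urllib.request

# Base URLs taken from the curl commands above; keys are illustrative labels.
SERVICES = {
    "config-server": "http://localhost:8888",
    "discovery-server": "http://localhost:8761",
    "api-gateway": "http://localhost:8080",
    "auth-service": "http://localhost:9000",
}


def fetch_health(base_url, timeout=5.0):
    """Return the parsed /actuator/health body, or a synthetic status on failure."""
    try:
        with urllib.request.urlopen(f"{base_url}/actuator/health", timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as error:
        # A DOWN service answers with an error status but usually still sends JSON.
        try:
            return json.load(error)
        except ValueError:
            return {"status": "UNKNOWN"}
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not JSON.
        return {"status": "UNREACHABLE"}


def collect_statuses(services=SERVICES, fetch=fetch_health):
    """Map each service label to its top-level health status."""
    return {name: fetch(url).get("status", "UNKNOWN") for name, url in services.items()}
```

`collect_statuses()` returns a dictionary such as `{"config-server": "UP", ...}`. The `fetch` parameter exists so the function can be exercised without a running stack.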

Health Response Format

Production (show-details: never):

{
  "status": "UP"
}

Development (show-details: always):

{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "PostgreSQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 250685575168,
        "free": 150685575168,
        "threshold": 10485760
      }
    },
    "ping": {
      "status": "UP"
    },
    "redis": {
      "status": "UP",
      "details": {
        "version": "7.0.5"
      }
    }
  }
}

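Health status can be UP, DOWN, OUT_OF_SERVICE, or UNKNOWN. Alert on any status other than UP. By default, DOWN and OUT_OF_SERVICE also return HTTP 503, so HTTP-level checks such as curl -f detect them without parsing the body.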
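When details are shown, the failing dependency can be picked out of the response programmatically. The sketch below walks the `components` object from the development response format; the function name and the sample payload are illustrative assumptions.

```python
def failing_components(health, prefix=""):
    """Return (path, status) pairs for every component whose status is not UP."""
    failures = []
    for name, component in health.get("components", {}).items():
        path = f"{prefix}{name}"
        if component.get("status") != "UP":
            failures.append((path, component.get("status", "UNKNOWN")))
        # Composite indicators nest further components under the same key.
        failures.extend(failing_components(component, prefix=f"{path}."))
    return failures


# Hypothetical response in which only Redis is failing.
sample = {
    "status": "DOWN",
    "components": {
        "db": {"status": "UP"},
        "ping": {"status": "UP"},
        "redis": {"status": "DOWN"},
    },
}
print(failing_components(sample))  # [('redis', 'DOWN')]
```

With `show-details: never` the response has no `components` key, so the function returns an empty list and only the top-level `status` is available.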

Info Endpoint

The /actuator/info endpoint provides application metadata:

curl http://localhost:8888/actuator/info

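You can customize this endpoint by adding info.* properties. Since Spring Boot 2.6 the env info contributor is disabled by default, so enable it explicitly. Quote the @project.version@ placeholder, which Maven resource filtering replaces at build time, because an unquoted @ cannot start a YAML value:

management:
  info:
    env:
      enabled: true
info:
  app:
    name: ${spring.application.name}
    version: "@project.version@"
    environment: ${spring.profiles.active}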

Distributed Tracing with Zipkin

Zipkin Configuration

All microservices are configured to send trace data to Zipkin:

management:
  tracing:
    sampling:
      probability: 0.1
  zipkin:
    tracing:
      endpoint: http://sgivu-zipkin:9411/api/v2/spans

Understanding Sampling Probability

The probability setting controls what percentage of requests are traced:

0.1 (10%)

Production default - traces roughly 1 in 10 requests to reduce overhead

0.5 (50%)

Higher sampling for debugging issues in staging

1.0 (100%)

Development - trace every request (high overhead)
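To get a feel for what a sampling probability means in practice, the sketch below estimates how many requests end up traced. It is a simplified model that makes an independent random decision per request; it is not how the tracing library implements sampling, and both function names are hypothetical.

```python
import random


def expected_traces(requests, probability):
    """Average number of traced requests for a given sampling probability."""
    return requests * probability


def simulate_sampling(requests, probability, seed=0):
    """Count how many requests a simple per-request random sampler would keep."""
    rng = random.Random(seed)
    return sum(1 for _ in range(requests) if rng.random() < probability)


for probability in (0.1, 0.5, 1.0):
    print(probability, expected_traces(10_000, probability), simulate_sampling(10_000, probability))
```

The simulated count fluctuates around the expected value, which is why a rare failing request can easily be absent from Zipkin at a probability of 0.1.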

Accessing Zipkin Dashboard

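Once Zipkin is running, open the dashboard at http://localhost:9411 in a browser. On macOS you can do this from a terminal (on Linux, xdg-open is the usual equivalent):

open http://localhost:9411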

Trace Search and Analysis

Find Traces

Search by service name, time range, or trace ID in the Zipkin UI.

View Trace Timeline

Click a trace to see the request flow across microservices with timing information.

Identify Bottlenecks

Look for spans with long durations to identify performance issues.

Analyze Dependencies

Use the Dependencies tab to visualize service communication patterns.
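The same analysis can be scripted against Zipkin's HTTP API, which returns each trace as a list of spans with a `duration` in microseconds. The sketch below assumes the v2 `/api/v2/traces` endpoint and uses made-up span data; the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def fetch_traces(zipkin_url, service_name, limit=10, timeout=5.0):
    """Query Zipkin for recent traces that involve the given service."""
    query = urllib.parse.urlencode({"serviceName": service_name, "limit": limit})
    with urllib.request.urlopen(f"{zipkin_url}/api/v2/traces?{query}", timeout=timeout) as resp:
        return json.load(resp)


def slowest_spans(trace, top=3):
    """Return (service, span name, duration in ms) for the longest spans of one trace."""
    ranked = sorted(trace, key=lambda span: span.get("duration", 0), reverse=True)
    return [
        (
            span.get("localEndpoint", {}).get("serviceName", "unknown"),
            span.get("name", ""),
            span.get("duration", 0) / 1000,  # microseconds to milliseconds
        )
        for span in ranked[:top]
    ]


# Hypothetical trace with two spans.
sample_trace = [
    {"name": "select", "duration": 30000, "localEndpoint": {"serviceName": "sgivu-user"}},
    {"name": "get /api/users", "duration": 200000, "localEndpoint": {"serviceName": "sgivu-gateway"}},
]
print(slowest_spans(sample_trace, top=1))  # [('sgivu-gateway', 'get /api/users', 200.0)]
```

Against a live instance, `fetch_traces("http://localhost:9411", "sgivu-gateway")` returns a list of traces, each of which can be passed to `slowest_spans`.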

Example Trace Flow

A typical request through SGIVU might create a trace like this (durations are illustrative, and each span's time includes its children):

sgivu-gateway        [200ms] - Client request received
└─ sgivu-auth        [150ms] - Token validation
   └─ sgivu-user     [100ms] - Fetch user data
      └─ PostgreSQL  [30ms]  - Database query

Monitoring Config Server

Config Server Health

The Config Server health includes Git repository connectivity:

curl http://localhost:8888/actuator/health

Response includes:

{
  "status": "UP",
  "components": {
    "configServer": {
      "status": "UP",
      "details": {
        "repositories": [
          {
            "name": "sgivu-config-repo",
            "profiles": ["dev", "prod"],
            "label": "main"
          }
        ]
      }
    }
  }
}

If Config Server health is DOWN, microservices cannot fetch configuration and may fail to start.

Monitoring Configuration Retrieval

Test that services can retrieve their configuration:

# Fetch auth service dev config
curl http://localhost:8888/sgivu-auth/dev

# Fetch gateway service prod config
curl http://localhost:8888/sgivu-gateway/prod

# Fetch user service default config
curl http://localhost:8888/sgivu-user/default

Successful responses indicate the Config Server is correctly serving configuration.
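An HTTP 200 alone may not prove that a service-specific file was found, because the Config Server can answer with an empty or shared-only `propertySources` list. The sketch below inspects that list; it assumes the standard environment response shape (`name`, `profiles`, `propertySources`), and the function names are hypothetical.

```python
import json
import urllib.request


def fetch_environment(config_url, application, profile, timeout=5.0):
    """Fetch /{application}/{profile} from the Config Server and parse the JSON body."""
    with urllib.request.urlopen(f"{config_url}/{application}/{profile}", timeout=timeout) as resp:
        return json.load(resp)


def property_source_names(environment):
    """Names of the files or locations that contributed properties."""
    return [source.get("name", "") for source in environment.get("propertySources", [])]


def has_configuration(environment):
    """True when at least one property source was resolved."""
    return len(property_source_names(environment)) > 0
```

For example, `property_source_names(fetch_environment("http://localhost:8888", "sgivu-auth", "dev"))` lists the sources that were merged, which makes it easy to spot a missing profile-specific file.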

Service-Level Monitoring

Key Metrics by Service

sgivu-auth (Port 9000)

Key Metrics:

OAuth2 token generation rate
JWT signing operations
Session creation/validation
Database connection pool usage

Health Dependencies:

PostgreSQL database
JWT keystore availability
Eureka registration

sgivu-gateway (Port 8080)

Key Metrics:

Request throughput and latency
OAuth2 token relay success rate
Redis session operations
Circuit breaker status

Health Dependencies:

Redis connection
Auth service availability
Eureka service discovery

sgivu-user (Port 8081)

Key Metrics:

User lookup latency
Database query performance
Internal API call success rate

Health Dependencies:

PostgreSQL database
Service internal authentication
Eureka registration

sgivu-client (Port 8082)

Key Metrics:

Client data operations
Database transaction rate
API response times

Health Dependencies:

PostgreSQL database
Service internal authentication
Eureka registration

sgivu-vehicle (Port 8083)

Key Metrics:

Vehicle image upload success rate
S3 operation latency
File size and multipart upload metrics

Health Dependencies:

PostgreSQL database
AWS S3 connectivity
Service internal authentication
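Where the `metrics` endpoint is exposed (the production profile shown earlier exposes only `health` and `info`), Actuator returns each metric as a list of measurements. The sketch below derives a mean latency from such a response; `http.server.requests` is a standard metric name, while the sample values and helper names are assumptions.

```python
import json
import urllib.request


def fetch_metric(base_url, metric_name, timeout=5.0):
    """Fetch /actuator/metrics/{name}; requires the metrics endpoint to be exposed."""
    with urllib.request.urlopen(f"{base_url}/actuator/metrics/{metric_name}", timeout=timeout) as resp:
        return json.load(resp)


def measurement(metric, statistic):
    """Return the value of one statistic (for example COUNT), or None if absent."""
    for entry in metric.get("measurements", []):
        if entry.get("statistic") == statistic:
            return entry.get("value")
    return None


def mean_latency_seconds(metric):
    """Average time per request, computed as TOTAL_TIME / COUNT."""
    count = measurement(metric, "COUNT")
    total = measurement(metric, "TOTAL_TIME")
    if not count or total is None:
        return None
    return total / count


# Hypothetical response body for http.server.requests.
sample_metric = {
    "name": "http.server.requests",
    "measurements": [
        {"statistic": "COUNT", "value": 120.0},
        {"statistic": "TOTAL_TIME", "value": 6.0},
        {"statistic": "MAX", "value": 0.35},
    ],
}
print(mean_latency_seconds(sample_metric))
```

The service-specific metrics listed above (token generation rate, S3 latency, and so on) would need their own meter names, which this document does not specify.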

Logging Configuration

Production Logging

Production services use INFO level logging:

logging:
  level:
    root: INFO

Development Debugging

Development profiles enable detailed logging for troubleshooting:

logging:
  level:
    com.sgivu.gateway.security: DEBUG
    com.sgivu.gateway.controller: DEBUG
    org.springframework.security.oauth2.client: DEBUG
    org.springframework.security.web.server.authentication: DEBUG
    org.springframework.session.web.server: DEBUG

Debug logging is useful for tracing OAuth2 flows, session management, and security filters during development.
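In development, where all endpoints are exposed, a log level can also be changed at runtime through Actuator's `loggers` endpoint instead of editing the profile. The sketch below only builds the request so it can be inspected before sending; the helper name is hypothetical.

```python
import json
import urllib.request


def build_logger_request(base_url, logger_name, level):
    """Build a POST to /actuator/loggers/{name} that sets the configured level."""
    body = json.dumps({"configuredLevel": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/actuator/loggers/{logger_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_logger_request("http://localhost:8080", "com.sgivu.gateway.security", "DEBUG")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen(request)` applies the change to the running instance only; it is lost on restart, so the YAML profile remains the source of truth.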

Automated Monitoring Setup

Docker Compose Health Checks

Add health checks to your docker-compose.yml:

services:
  sgivu-config:
    image: sgivu/config-server:latest
    healthcheck:
      # Requires curl to be installed in the image; many slim JRE images do not include it.
      test: ["CMD", "curl", "-f", "http://localhost:8888/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
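Deployment or CI scripts that run outside Docker sometimes need the same wait-and-retry behaviour. The sketch below mirrors the interval, retries, and start_period values from the health check above in a deliberately simplified form (Docker itself also probes during the start period); the function and its parameters are hypothetical.

```python
import time


def wait_until_healthy(probe, retries=3, interval=30.0, start_period=40.0, sleep=time.sleep):
    """Wait start_period seconds, then call probe() up to `retries` times.

    Returns True as soon as probe() succeeds, or False once every attempt has failed.
    """
    sleep(start_period)
    for attempt in range(1, retries + 1):
        if probe():
            return True
        if attempt < retries:
            sleep(interval)
    return False
```

A probe can be any zero-argument callable that returns a boolean, for example a function that requests `/actuator/health` and checks for HTTP 200. The `sleep` parameter is injectable so the loop can be exercised without real delays.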

Kubernetes Probes

For Kubernetes deployments:

apiVersion: v1
kind: Pod
metadata:
  name: sgivu-auth
spec:
  containers:
  - name: auth
    image: sgivu/auth-service:latest
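    # Spring Boot enables the liveness and readiness health groups automatically
    # on Kubernetes; elsewhere set management.endpoint.health.probes.enabled: true.
    # Using the liveness group avoids restarting the container when an external
    # dependency such as the database is down.
    livenessProbe:
      httpGet:
        path: /actuator/health/liveness
        port: 9000
      initialDelaySeconds: 60
      periodSeconds: 10
    # Readiness controls whether the pod receives traffic.
    readinessProbe:
      httpGet:
        path: /actuator/health/readiness
        port: 9000
      initialDelaySeconds: 30
      periodSeconds: 5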

Monitoring Best Practices

Set Up Alerts

Configure alerts for DOWN health status, high error rates, or slow response times.

Monitor Dependencies

Track database connections, Redis availability, and Eureka registration status.

Track Trends

Collect metrics over time to identify gradual degradation or capacity issues.

Correlate Logs

Use trace IDs to correlate logs across services when debugging issues.
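As an illustration of log correlation, the sketch below gathers every line that mentions a given trace ID from several services and orders the lines by time. It assumes each line starts with a sortable ISO-8601 timestamp and contains the trace ID somewhere in the text; the sample lines are invented, and real formats depend on the logging pattern in use.

```python
def lines_for_trace(logs_by_service, trace_id):
    """Return (service, line) pairs that mention trace_id, ordered by timestamp prefix."""
    matches = []
    for service, lines in logs_by_service.items():
        for line in lines:
            if trace_id in line:
                timestamp = line.split(" ", 1)[0]
                matches.append((timestamp, service, line))
    return [(service, line) for _, service, line in sorted(matches)]


# Hypothetical log excerpts from two services.
logs = {
    "sgivu-auth": [
        "2024-01-01T10:00:00.050Z INFO [4bf92f3577b34da6] Validating token",
    ],
    "sgivu-gateway": [
        "2024-01-01T10:00:00.010Z INFO [4bf92f3577b34da6] Routing request",
        "2024-01-01T10:00:01.000Z INFO [aaaaaaaaaaaaaaaa] Routing request",
    ],
}
for service, line in lines_for_trace(logs, "4bf92f3577b34da6"):
    print(service, line)
```

The trace ID to search for can be copied from the Zipkin UI, which ties the timeline view back to the detailed log output.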

Troubleshooting Common Issues

Service Reports DOWN

Check Actuator Health

curl http://localhost:8080/actuator/health

Look at component status to identify which dependency is failing.

Verify Dependencies

Check that databases, Redis, and other services are running and accessible.

Review Logs

Look for connection errors, timeouts, or authentication failures in application logs.

Test Config Retrieval

Ensure the service can fetch its configuration from Config Server.

High Latency in Traces

Identify the slow span in Zipkin
Check database query performance if the slow span is a repository call
Review network latency between services
Check resource utilization (CPU, memory, disk I/O)

Missing Traces

Verify Zipkin is running and accessible
Check that management.zipkin.tracing.endpoint is correctly configured
Increase management.tracing.sampling.probability temporarily for debugging (for example to 1.0), since at 0.1 most requests produce no trace
Review service logs for Zipkin connection errors
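A quick way to narrow down missing traces is to ask Zipkin which services have reported spans and compare that with the services you expect. The sketch below assumes Zipkin's v2 `/api/v2/services` endpoint, which returns a JSON array of service names; the helper names are hypothetical.

```python
import json
import urllib.request


def fetch_reporting_services(zipkin_url, timeout=5.0):
    """Return the service names Zipkin has received spans from."""
    with urllib.request.urlopen(f"{zipkin_url}/api/v2/services", timeout=timeout) as resp:
        return json.load(resp)


def services_without_traces(expected, reporting):
    """List expected services that are absent from Zipkin's service list."""
    seen = set(reporting)
    return [name for name in expected if name not in seen]


print(services_without_traces(["sgivu-gateway", "sgivu-auth"], ["sgivu-gateway"]))  # ['sgivu-auth']
```

A service that is missing from the list either has not sampled any request yet or cannot reach the configured Zipkin endpoint, which points to the checks in the list above.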

Related Resources

Configuration Refresh

Update monitoring configuration without restarts

Security

Secure actuator endpoints in production

Validation

Validate configuration changes before deployment

Troubleshooting

Debug common configuration issues
