
Overview

SGIVU microservices use Spring Boot Actuator for health checks and metrics, combined with Zipkin for distributed tracing. This guide shows how to monitor the Config Server and microservices.

Spring Boot Actuator Configuration

All microservices expose actuator endpoints for health monitoring and operational insights.

Production Configuration

In production profiles, services expose minimal endpoints for security:
management:
  endpoints:
    web:
      exposure:
        include: health, info
  endpoint:
    health:
      show-details: never
The show-details: never setting prevents sensitive information from being exposed in health endpoint responses.

Development Configuration

Development environments expose all actuator endpoints for debugging:
management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    health:
      show-details: always
Never use include: "*" in production. It exposes sensitive endpoints like /actuator/env, /actuator/beans, and /actuator/mappings.

Health Check Endpoints

Accessing Health Status

Each microservice exposes a health check endpoint:

Config Server

curl http://localhost:8888/actuator/health

Discovery Server

curl http://localhost:8761/actuator/health

API Gateway

curl http://localhost:8080/actuator/health

Auth Service

curl http://localhost:9000/actuator/health
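The four checks above can also be run in one sweep. The following is a minimal sketch using only the JDK's built-in HTTP client; the class name `HealthSweep` and the timeout values are illustrative assumptions, while the ports are the ones from the curl commands above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: probes each health endpoint and prints one line per service. */
final class HealthSweep {

    /** Service name -> health URL, using the ports from the curl commands above. */
    static Map<String, String> endpoints() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("Config Server", "http://localhost:8888/actuator/health");
        m.put("Discovery Server", "http://localhost:8761/actuator/health");
        m.put("API Gateway", "http://localhost:8080/actuator/health");
        m.put("Auth Service", "http://localhost:9000/actuator/health");
        return m;
    }

    /** Returns the HTTP status code, or -1 if the service could not be reached. */
    static int probe(HttpClient client, String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(3)) // assumed request timeout
                .GET()
                .build();
        try {
            return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (IOException e) {
            return -1; // connection refused, timeout, or other I/O failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2)) // assumed connect timeout
                .build();
        endpoints().forEach((name, url) -> {
            int code = probe(client, url);
            System.out.printf("%-17s %s%n", name, code == 200 ? "OK" : "FAILED (" + code + ")");
        });
    }
}
```

A status code of 200 is treated as healthy here; anything else, including an unreachable service, is reported as a failure.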

Health Response Format

Production (show-details: never):
{
  "status": "UP"
}
Development (show-details: always):
{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "PostgreSQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 250685575168,
        "free": 150685575168,
        "threshold": 10485760
      }
    },
    "ping": {
      "status": "UP"
    },
    "redis": {
      "status": "UP",
      "details": {
        "version": "7.0.5"
      }
    }
  }
}
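Health status can be UP, DOWN, OUT_OF_SERVICE, or UNKNOWN. Alert on any status other than UP. By default Actuator returns HTTP 503 for DOWN and OUT_OF_SERVICE and HTTP 200 for UP and UNKNOWN, so a check that looks only at the status code will not catch UNKNOWN.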
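As a rough illustration of checking for non-UP statuses, the sketch below extracts the aggregate status from a health response body. It is a hedged example, not part of SGIVU: the class name `HealthStatus` is hypothetical, and the regular expression assumes the top-level `status` field is serialized first, as in the responses shown above. A real monitor would more likely use a JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads the aggregate status out of a health response body. */
final class HealthStatus {

    // Matches the first "status" field in the body. Assumption: the aggregate
    // status appears before any component status, as in the examples above.
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"([A-Z_]+)\"");

    /** Returns the first status found, or UNKNOWN if the body has none. */
    static String aggregateStatus(String body) {
        Matcher m = STATUS.matcher(body);
        return m.find() ? m.group(1) : "UNKNOWN";
    }

    /** True for DOWN, OUT_OF_SERVICE, UNKNOWN, or an unparseable body. */
    static boolean shouldAlert(String body) {
        return !"UP".equals(aggregateStatus(body));
    }
}
```

Treating an unparseable body as UNKNOWN means a malformed or empty response also triggers an alert, which is usually the safer default.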

Info Endpoint

The /actuator/info endpoint provides application metadata:
curl http://localhost:8888/actuator/info
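You can customize this endpoint by adding info.* properties. Since Spring Boot 2.6 the env info contributor is disabled by default, so enable it as well. The @project.version@ placeholder is expanded by Maven resource filtering at build time and must be quoted, because @ is a reserved character in YAML:
management:
  info:
    env:
      enabled: true
info:
  app:
    name: ${spring.application.name}
    version: "@project.version@"
    environment: ${spring.profiles.active}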

Distributed Tracing with Zipkin

Zipkin Configuration

All microservices are configured to send trace data to Zipkin:
management:
  tracing:
    sampling:
      probability: 0.1
  zipkin:
    tracing:
      endpoint: http://sgivu-zipkin:9411/api/v2/spans

Understanding Sampling Probability

The probability setting controls what percentage of requests are traced:

  • 0.1 (10%): production default. Traces 1 in 10 requests to reduce overhead.
  • 0.5 (50%): higher sampling for debugging issues in staging.
  • 1.0 (100%): development. Traces every request (high overhead).
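To make the effect of the probability values above concrete, here is a small illustrative sampler. It is a sketch of the general idea only and is not how Micrometer Tracing or Brave implement sampling; the class name `ProbabilitySampler` is hypothetical. In a real system the decision is made once for the root span and propagated to downstream services with the trace context.

```java
import java.util.Random;

/** Illustrative only: decides per request whether a trace would be recorded. */
final class ProbabilitySampler {
    private final double probability;
    private final Random random;

    ProbabilitySampler(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
        this.random = random;
    }

    /** nextDouble() is in [0, 1), so 0.0 never samples and 1.0 always samples. */
    boolean shouldSample() {
        return random.nextDouble() < probability;
    }

    /** Counts how many of the given number of simulated requests would be traced. */
    int countSampled(int requests) {
        int sampled = 0;
        for (int i = 0; i < requests; i++) {
            if (shouldSample()) {
                sampled++;
            }
        }
        return sampled;
    }
}
```

With a probability of 0.1, `new ProbabilitySampler(0.1, new Random()).countSampled(100_000)` returns a value close to 10,000, which is why a single failing request often has no trace in production.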

Accessing Zipkin Dashboard

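Once Zipkin is running, open the dashboard at http://localhost:9411 in a browser. On macOS you can do this from a terminal (use xdg-open instead on most Linux desktops):
open http://localhost:9411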

Trace Search and Analysis

  1. Find Traces: search by service name, time range, or trace ID in the Zipkin UI.
  2. View Trace Timeline: click a trace to see the request flow across microservices with timing information.
  3. Identify Bottlenecks: look for spans with long durations to identify performance issues.
  4. Analyze Dependencies: use the Dependencies tab to visualize service communication patterns.
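The search for slow traces can also be scripted against Zipkin's v2 HTTP API, which accepts `serviceName`, `minDuration` (in microseconds), and `limit` query parameters on `/api/v2/traces`. The sketch below only builds the query URL; the class name `ZipkinQuery` and the 200 ms threshold are assumptions chosen for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds a Zipkin v2 query URL for slow traces. */
final class ZipkinQuery {

    /** Builds a /api/v2/traces query for traces lasting at least minDurationMillis. */
    static String slowTracesUrl(String zipkinBase, String serviceName,
                                long minDurationMillis, int limit) {
        long minDurationMicros = minDurationMillis * 1000; // Zipkin expects microseconds
        return zipkinBase + "/api/v2/traces"
                + "?serviceName=" + URLEncoder.encode(serviceName, StandardCharsets.UTF_8)
                + "&minDuration=" + minDurationMicros
                + "&limit=" + limit;
    }

    public static void main(String[] args) {
        // Assumed values: local Zipkin, gateway service, 200 ms threshold, 10 results.
        System.out.println(slowTracesUrl("http://localhost:9411", "sgivu-gateway", 200, 10));
    }
}
```

Running `main` prints `http://localhost:9411/api/v2/traces?serviceName=sgivu-gateway&minDuration=200000&limit=10`, which can be passed to curl or an HTTP client.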

Example Trace Flow

A typical request through SGIVU might create a trace like this (durations are illustrative, not measurements):
sgivu-gateway        [200ms] - Client request received
└─ sgivu-auth        [50ms]  - Token validation
   └─ sgivu-user     [100ms] - Fetch user data
      └─ PostgreSQL  [30ms]  - Database query

Monitoring Config Server

Config Server Health

The Config Server health includes Git repository connectivity:
curl http://localhost:8888/actuator/health
Response includes:
{
  "status": "UP",
  "components": {
    "configServer": {
      "status": "UP",
      "details": {
        "repositories": [
          {
            "name": "sgivu-config-repo",
            "profiles": ["dev", "prod"],
            "label": "main"
          }
        ]
      }
    }
  }
}
If Config Server health is DOWN, microservices cannot fetch configuration and may fail to start.

Monitoring Configuration Retrieval

Test that services can retrieve their configuration:
# Fetch auth service dev config
curl http://localhost:8888/sgivu-auth/dev

# Fetch gateway service prod config
curl http://localhost:8888/sgivu-gateway/prod

# Fetch user service default config
curl http://localhost:8888/sgivu-user/default
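A 200 response with a non-empty propertySources array indicates that the Config Server can read the repository and is serving configuration. An empty propertySources array means the request succeeded but no matching configuration files were found.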
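The retrieval test above can be automated by inspecting the response body. The following is a hedged sketch: the class name `ConfigResponseCheck` is hypothetical, and the regular expression is a shortcut that only checks whether the `propertySources` array returned by Spring Cloud Config starts with an object. A JSON parser would be more robust.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: checks that a Config Server response carries property sources. */
final class ConfigResponseCheck {

    // Matches "propertySources": [ { ... i.e. an array whose first element is an object.
    private static final Pattern NON_EMPTY_SOURCES =
            Pattern.compile("\"propertySources\"\\s*:\\s*\\[\\s*\\{");

    /** True when the body contains a non-empty propertySources array. */
    static boolean hasPropertySources(String body) {
        return NON_EMPTY_SOURCES.matcher(body).find();
    }
}
```

A response for an unknown application typically still returns HTTP 200, so checking the body rather than only the status code catches misnamed applications and profiles.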

Service-Level Monitoring

Key Metrics by Service

Auth Service

Key Metrics:
  • OAuth2 token generation rate
  • JWT signing operations
  • Session creation/validation
  • Database connection pool usage
Health Dependencies:
  • PostgreSQL database
  • JWT keystore availability
  • Eureka registration

API Gateway

Key Metrics:
  • Request throughput and latency
  • OAuth2 token relay success rate
  • Redis session operations
  • Circuit breaker status
Health Dependencies:
  • Redis connection
  • Auth service availability
  • Eureka service discovery

User Service

Key Metrics:
  • User lookup latency
  • Database query performance
  • Internal API call success rate
Health Dependencies:
  • PostgreSQL database
  • Service internal authentication
  • Eureka registration

Client Service

Key Metrics:
  • Client data operations
  • Database transaction rate
  • API response times
Health Dependencies:
  • PostgreSQL database
  • Service internal authentication
  • Eureka registration

Vehicle Service

Key Metrics:
  • Vehicle image upload success rate
  • S3 operation latency
  • File size and multipart upload metrics
Health Dependencies:
  • PostgreSQL database
  • AWS S3 connectivity
  • Service internal authentication

Logging Configuration

Production Logging

Production services use INFO level logging:
logging:
  level:
    root: INFO

Development Debugging

Development profiles enable detailed logging for troubleshooting:
logging:
  level:
    com.sgivu.gateway.security: DEBUG
    com.sgivu.gateway.controller: DEBUG
    org.springframework.security.oauth2.client: DEBUG
    org.springframework.security.web.server.authentication: DEBUG
    org.springframework.session.web.server: DEBUG
Debug logging is useful for tracing OAuth2 flows, session management, and security filters during development.

Automated Monitoring Setup

Docker Compose Health Checks

Add health checks to your docker-compose.yml:
services:
  sgivu-config:
    image: sgivu/config-server:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8888/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

Kubernetes Probes

For Kubernetes deployments:
apiVersion: v1
kind: Pod
metadata:
  name: sgivu-auth
spec:
  containers:
  - name: auth
    image: sgivu/auth-service:latest
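    # Liveness uses the dedicated probe group so that a failing external
    # dependency (database, Redis) does not cause the pod to be restarted.
    # Spring Boot enables these groups automatically when it detects Kubernetes.
    livenessProbe:
      httpGet:
        path: /actuator/health/liveness
        port: 9000
      initialDelaySeconds: 60
      periodSeconds: 10
    # Readiness controls whether the pod receives traffic.
    readinessProbe:
      httpGet:
        path: /actuator/health/readiness
        port: 9000
      initialDelaySeconds: 30
      periodSeconds: 5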

Monitoring Best Practices

Set Up Alerts

Configure alerts for DOWN health status, high error rates, or slow response times.

Monitor Dependencies

Track database connections, Redis availability, and Eureka registration status.

Track Trends

Collect metrics over time to identify gradual degradation or capacity issues.

Correlate Logs

Use trace IDs to correlate logs across services when debugging issues.
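As a simple illustration of correlating logs by trace ID, the sketch below collects every log line that mentions a given trace ID across several services. It makes no assumption about the log pattern beyond the trace ID appearing somewhere in the line; the class name `TraceLogFilter` and the map-of-lines input are hypothetical. In practice a log aggregation tool would do this search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: gathers log lines for one trace across services. */
final class TraceLogFilter {

    /**
     * Returns every line containing traceId, prefixed with the service it came from.
     * Services are visited in the map's iteration order; lines keep their original order.
     */
    static List<String> linesForTrace(Map<String, List<String>> logsByService, String traceId) {
        List<String> matches = new ArrayList<>();
        logsByService.forEach((service, lines) -> {
            for (String line : lines) {
                if (line.contains(traceId)) {
                    matches.add(service + " | " + line);
                }
            }
        });
        return matches;
    }
}
```

Passing a `LinkedHashMap` keyed in call order (gateway first, then downstream services) keeps the output in roughly the order the request travelled.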

Troubleshooting Common Issues

Service Reports DOWN

  1. Check Actuator Health. Look at component status to identify which dependency is failing:
     curl http://localhost:8080/actuator/health
  2. Verify Dependencies. Check that databases, Redis, and other services are running and accessible.
  3. Review Logs. Look for connection errors, timeouts, or authentication failures in application logs.
  4. Test Config Retrieval. Ensure the service can fetch its configuration from Config Server.

High Latency in Traces

  1. Identify the slow span in Zipkin
  2. Check database query performance if the slow span is a repository call
  3. Review network latency between services
  4. Check resource utilization (CPU, memory, disk I/O)

Missing Traces

  • Verify Zipkin is running and accessible
  • Check that management.zipkin.tracing.endpoint is correctly configured
  • Increase management.tracing.sampling.probability temporarily for debugging (at 0.1, about 9 in 10 requests produce no trace)
  • Review service logs for Zipkin connection errors

