
Overview

The Gateway service is a Go-based microservice that demonstrates advanced resilience patterns including circuit breakers and retry budgets. It proxies requests to the Custom Language Service and handles failures gracefully with automatic recovery mechanisms.
Source code: services/internal/gateway/service.go
Entry point: services/cmd/gateway/main.go

Technology Stack

  • Language: Go
  • Framework: Connect-Go (gRPC-compatible)
  • Database: PostgreSQL via pgx/v5
  • Circuit Breaker: gobreaker/v2
  • Protocol: Protocol Buffers
  • Observability: OpenTelemetry
  • HTTP Version: HTTP/2 (h2c)

Configuration

Environment Variables

  • PORT (string, default: "8082"): HTTP server port for the gateway service.
  • CUSTOM_LANG_BASE_URL (string, default: "http://custom-lang-service.microservices:3000"): Base URL for the Custom Language Service. Docker Compose uses http://custom-lang-service:3000.
  • DATABASE_URL (string, required): PostgreSQL connection string. Example: postgresql://devuser:devpass@postgres:5432/gateway_db
  • OTEL_EXPORTER_OTLP_ENDPOINT (string): OpenTelemetry collector endpoint.
  • OTEL_SERVICE_NAME (string, default: "gateway-service"): Service name for distributed tracing.
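As a rough illustration of how these variables and their defaults could be consumed, the sketch below loads them into a struct. The `Config` type, the `loadConfig` function, and the field names are hypothetical and are not taken from the service's source. The sketch treats DATABASE_URL as mandatory because the list above marks it as required; the real service may be more lenient.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the environment variables documented above.
// The struct and its field names are illustrative assumptions.
type Config struct {
	Port              string
	CustomLangBaseURL string
	DatabaseURL       string
	OTLPEndpoint      string
	ServiceName       string
}

// loadConfig reads settings through lookup (os.LookupEnv in production),
// applies the documented defaults, and rejects a missing DATABASE_URL.
func loadConfig(lookup func(string) (string, bool)) (Config, error) {
	get := func(key, fallback string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return fallback
	}
	cfg := Config{
		Port:              get("PORT", "8082"),
		CustomLangBaseURL: get("CUSTOM_LANG_BASE_URL", "http://custom-lang-service.microservices:3000"),
		DatabaseURL:       get("DATABASE_URL", ""),
		OTLPEndpoint:      get("OTEL_EXPORTER_OTLP_ENDPOINT", ""),
		ServiceName:       get("OTEL_SERVICE_NAME", "gateway-service"),
	}
	if cfg.DatabaseURL == "" {
		return Config{}, errors.New("DATABASE_URL is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("listening on :%s, proxying to %s\n", cfg.Port, cfg.CustomLangBaseURL)
}
```

Passing the lookup function as a parameter keeps the loader easy to exercise with a plain map instead of the process environment.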

Docker Compose Configuration

gateway:
  build:
    context: .
    dockerfile: deploy/docker/gateway/Dockerfile
  environment:
    PORT: "8082"
    CUSTOM_LANG_BASE_URL: "http://custom-lang-service:3000"
    DATABASE_URL: "postgresql://devuser:devpass@postgres:5432/gateway_db"
    OTEL_SERVICE_NAME: "gateway-service"
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.gateway.rule=PathPrefix(`/gateway.v1.GatewayService`)"
    - "traefik.http.services.gateway.loadbalancer.server.scheme=h2c"
  depends_on:
    - custom-lang-service
    - postgres

API Reference

Protocol Buffer Definition

The service is defined in proto/gateway/v1/gateway.proto:
syntax = "proto3";

package gateway.v1;

service GatewayService {
  rpc InvokeCustom(InvokeCustomRequest) returns (InvokeCustomResponse) {}
}

message InvokeCustomRequest {
  string name = 1;
}

message InvokeCustomResponse {
  string message = 1;
}

InvokeCustom RPC

Invokes the Custom Language Service with circuit breaker and retry protection.
Request field:
  • name (string, default: "World"): Name to pass to the Custom Language Service. Special values trigger error responses:
      • "unauthorized" → 401
      • "forbidden" → 403
      • "notfound" → 404
      • "conflict" → 409
      • "ratelimit" → 429
      • "unavailable" → 503
Response field:
  • message (string): Response message from the Custom Language Service. Format: "Hello {name} from custom-lang-service!"

Example Request

# Using grpcurl
grpcurl -plaintext -d '{"name": "Alice"}' \
  localhost:8082 \
  gateway.v1.GatewayService/InvokeCustom

Example Response

{
  "message": "Hello Alice from custom-lang-service!"
}
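Connect-Go handlers also accept the Connect protocol, in which a unary call is a plain HTTP POST with a JSON body. The sketch below builds such a request for InvokeCustom using only the standard library. The helper name `newInvokeCustomRequest` is hypothetical, and the sketch assumes the gateway's h2c listener also serves HTTP/1.1, which has not been verified against the service.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Procedure path: /<package>.<Service>/<Method>, taken from the proto definition.
const invokeCustomPath = "/gateway.v1.GatewayService/InvokeCustom"

// newInvokeCustomRequest builds a Connect-protocol unary request with a JSON body.
func newInvokeCustomRequest(baseURL, name string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"name": name})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+invokeCustomPath, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Connect-Protocol-Version", "1")
	return req, nil
}

func main() {
	req, err := newInvokeCustomRequest("http://localhost:8082", "Alice")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req), then decode the
	// JSON object with a "message" field from resp.Body.
}
```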

Resilience Patterns

Circuit Breaker

The Gateway service uses the gobreaker library to prevent cascading failures. From services/internal/gateway/service.go:79-91:
breaker := gobreaker.NewCircuitBreaker[invokeResult](gobreaker.Settings{
    Name:        "custom-lang-service",
    MaxRequests: 3,
    Interval:    10 * time.Second,
    Timeout:     30 * time.Second,
    ReadyToTrip: func(counts gobreaker.Counts) bool {
        return counts.ConsecutiveFailures >= 5
    },
    IsSuccessful: func(err error) bool {
        return err == nil || errors.Is(err, context.Canceled)
    },
})

Circuit Breaker Configuration

  • MaxRequests (int, default: 3): Maximum requests allowed in half-open state. When the circuit transitions from open to half-open, only 3 requests are allowed through to test if the downstream service has recovered.
  • Interval (duration, default: 10s): Time window for counting failures. Failure counts reset after this interval.
  • Timeout (duration, default: 30s): Time to wait before transitioning from open to half-open. After 30 seconds in open state, the circuit allows limited requests through to test recovery.
  • ReadyToTrip (function): Condition to open the circuit. The circuit opens after 5 consecutive failures.

Circuit Breaker States

  • Closed: Normal operation, all requests pass through
  • Open: Circuit is tripped, all requests fail immediately with Code.UNAVAILABLE
  • Half-Open: Testing recovery, allows limited requests (max 3)
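To make the transitions concrete, here is a deliberately simplified, single-goroutine state machine that uses the same thresholds (5 consecutive failures, 30 second open timeout, 3 successful probes). It is not gobreaker and omits locking, the Interval reset, and the cap on concurrent half-open requests. All type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

const (
	tripAfter   = 5                // consecutive failures before opening
	maxRequests = 3                // successful probes needed to close again
	openTimeout = 30 * time.Second // time spent open before probing
)

var (
	errOpen       = errors.New("circuit breaker is open")
	errDownstream = errors.New("downstream returned 503")
)

// miniBreaker illustrates the three states; it is a teaching aid, not gobreaker.
type miniBreaker struct {
	state     state
	failures  int // consecutive failures while closed
	successes int // consecutive successes while half-open
	openedAt  time.Time
	now       func() time.Time
}

func (b *miniBreaker) trip() {
	b.state, b.failures, b.openedAt = open, 0, b.now()
}

func (b *miniBreaker) call(fn func() error) error {
	if b.state == open {
		if b.now().Sub(b.openedAt) < openTimeout {
			return errOpen // fail fast without calling fn
		}
		b.state, b.successes = halfOpen, 0
	}
	err := fn()
	switch {
	case err != nil && b.state == halfOpen:
		b.trip() // any failed probe reopens the circuit
	case err != nil:
		b.failures++
		if b.failures >= tripAfter {
			b.trip()
		}
	case b.state == halfOpen:
		b.successes++
		if b.successes >= maxRequests {
			b.state, b.failures = closed, 0
		}
	default:
		b.failures = 0 // a success while closed resets the streak
	}
	return err
}

// newDemoBreaker returns a breaker driven by a fake clock plus a function
// that advances that clock by whole seconds.
func newDemoBreaker() (*miniBreaker, func(seconds int)) {
	clock := time.Unix(0, 0)
	b := &miniBreaker{now: func() time.Time { return clock }}
	return b, func(seconds int) { clock = clock.Add(time.Duration(seconds) * time.Second) }
}

func failing() error    { return errDownstream }
func succeeding() error { return nil }

func main() {
	b, advance := newDemoBreaker()
	for i := 0; i < tripAfter; i++ {
		b.call(failing)
	}
	fmt.Println("open:", b.state == open, "fast-fail:", b.call(succeeding) == errOpen)
	advance(30)
	for i := 0; i < maxRequests; i++ {
		b.call(succeeding)
	}
	fmt.Println("closed again:", b.state == closed)
}
```

One difference worth keeping in mind: gobreaker rejects requests beyond MaxRequests while half-open with its own ErrTooManyRequests error, which this sketch does not model.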

Retry Budget

The retry budget prevents retry storms by capping how many retries the service may issue over time. From services/internal/gateway/service.go:46-77:
type RetryBudget struct {
    tokens chan struct{}
}

func NewRetryBudget(capacity, refillPerSecond int) *RetryBudget {
    b := &RetryBudget{tokens: make(chan struct{}, capacity)}
    for i := 0; i < capacity; i++ {
        b.tokens <- struct{}{}
    }
    go func() {
        ticker := time.NewTicker(time.Second)
        defer ticker.Stop()
        for range ticker.C {
            for i := 0; i < refillPerSecond; i++ {
                select {
                case b.tokens <- struct{}{}:
                default:
                }
            }
        }
    }()
    return b
}

func (b *RetryBudget) Allow() bool {
    select {
    case <-b.tokens:
        return true
    default:
        return false
    }
}

Retry Budget Configuration

  • Capacity: 20 tokens
  • Refill Rate: 10 tokens per second
  • Algorithm: Token bucket
Behavior: If the token bucket is empty, retries are skipped even if the request is retryable.
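The effect of these numbers can be checked with a deterministic replay of the bucket. The sketch below is a simplified model, not the service's code: it assumes the bucket starts full and that the refill lands at the end of each second. The function name `simulate` is hypothetical.

```go
package main

import "fmt"

// simulate replays a token bucket second by second. demand[i] is the number
// of retries requested during second i; the return value holds how many of
// them were allowed. Refill happens at the end of each second.
func simulate(capacity, refillPerSecond int, demand []int) []int {
	tokens := capacity // the bucket starts full
	allowed := make([]int, len(demand))
	for i, want := range demand {
		granted := want
		if granted > tokens {
			granted = tokens // the rest are skipped, as Allow() returns false
		}
		allowed[i] = granted
		tokens -= granted
		tokens += refillPerSecond
		if tokens > capacity {
			tokens = capacity // extra tokens are dropped when the bucket is full
		}
	}
	return allowed
}

func main() {
	// 30 retryable failures per second for three seconds, then a quiet second.
	fmt.Println(simulate(20, 10, []int{30, 30, 30, 5})) // prints [20 10 10 5]
}
```

Under sustained load the budget settles at the refill rate: the initial burst of 20 is spent in the first second, after which only 10 retries per second get through.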

Retry Logic

From services/internal/gateway/service.go:103-126:
result, err := s.breaker.Execute(func() (invokeResult, error) {
    return s.callCustom(ctx, name)
})
if err != nil && shouldRetry(err) && s.retryBudget.Allow() {
    result, err = s.breaker.Execute(func() (invokeResult, error) {
        return s.callCustom(ctx, name)
    })
}
Retries are only attempted if:
  1. The initial request failed
  2. The error is retryable (HTTP 429, 502, 503, 504, or a network timeout that is not a context deadline expiry)
  3. The retry budget allows it

Retryable Errors

From services/internal/gateway/service.go:181-199:
func shouldRetry(err error) bool {
    var se *statusError
    if errors.As(err, &se) {
        switch se.status {
        case http.StatusTooManyRequests,        // 429
             http.StatusBadGateway,              // 502
             http.StatusServiceUnavailable,      // 503
             http.StatusGatewayTimeout:          // 504
            return true
        default:
            return false
        }
    }
    if errors.Is(err, context.DeadlineExceeded) {
        return false  // Don't retry on timeout
    }
    var netErr net.Error
    if errors.As(err, &netErr) {
        return netErr.Timeout()
    }
    return false
}
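The order of the checks matters: a wrapped context.DeadlineExceeded is rejected before the net.Error check runs. The sketch below repeats the decision logic and runs it against a few sample errors. The `statusError` definition is an assumption (its source is not shown above), and `dialTimeout` and `examples` are hypothetical helpers for the demonstration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
)

// statusError is assumed to look roughly like this; its definition is not shown above.
type statusError struct {
	status int
	body   string
}

func (e *statusError) Error() string { return fmt.Sprintf("status %d: %s", e.status, e.body) }

// shouldRetry repeats the decision logic from the excerpt above.
func shouldRetry(err error) bool {
	var se *statusError
	if errors.As(err, &se) {
		switch se.status {
		case http.StatusTooManyRequests, http.StatusBadGateway,
			http.StatusServiceUnavailable, http.StatusGatewayTimeout:
			return true
		default:
			return false
		}
	}
	if errors.Is(err, context.DeadlineExceeded) {
		return false
	}
	var netErr net.Error
	if errors.As(err, &netErr) {
		return netErr.Timeout()
	}
	return false
}

// dialTimeout is a hypothetical net.Error that reports a timeout.
type dialTimeout struct{}

func (dialTimeout) Error() string   { return "dial tcp: i/o timeout" }
func (dialTimeout) Timeout() bool   { return true }
func (dialTimeout) Temporary() bool { return false }

type example struct {
	label string
	err   error
}

// examples returns sample errors in a fixed order.
func examples() []example {
	return []example{
		{"503 from downstream", &statusError{status: http.StatusServiceUnavailable}},
		{"404 from downstream", &statusError{status: http.StatusNotFound}},
		{"context deadline", fmt.Errorf("post /invoke: %w", context.DeadlineExceeded)},
		{"dial timeout", fmt.Errorf("post /invoke: %w", dialTimeout{})},
	}
}

func main() {
	for _, ex := range examples() {
		fmt.Printf("%-20s retry=%v\n", ex.label, shouldRetry(ex.err))
	}
}
```

The four cases evaluate to true, false, false, and true. In particular, an attempt that runs out its context deadline is not retried, while a lower-level network timeout is.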

Implementation Details

Service Structure

From services/internal/gateway/service.go:24-31:
type Service struct {
    httpClient  *http.Client
    baseURL     string
    timeout     time.Duration
    breaker     *gobreaker.CircuitBreaker[invokeResult]
    retryBudget *RetryBudget
    pool        *pgxpool.Pool
}

Request Flow

  1. Receive Request: Accept name parameter (defaults to “World”)
  2. Circuit Breaker Check: Verify circuit is not open
  3. Execute Request: Call custom language service via circuit breaker
  4. Retry Logic: Attempt retry if error is retryable and budget allows
  5. Database Write: Synchronously record success or failure
  6. Return Response: Send result or error to client
From services/internal/gateway/service.go:103-137:
func (s *Service) InvokeCustom(
    ctx context.Context, 
    req *connect.Request[gatewayv1.InvokeCustomRequest],
) (*connect.Response[gatewayv1.InvokeCustomResponse], error) {
    name := req.Msg.GetName()
    if name == "" {
        name = "World"
    }

    result, err := s.breaker.Execute(func() (invokeResult, error) {
        return s.callCustom(ctx, name)
    })
    if err != nil && shouldRetry(err) && s.retryBudget.Allow() {
        result, err = s.breaker.Execute(func() (invokeResult, error) {
            return s.callCustom(ctx, name)
        })
    }
    if err != nil {
        // Synchronous DB write for failures
        if s.pool != nil {
            _, dbErr := s.pool.Exec(ctx, 
                "INSERT INTO invocations (name, result_message, success) VALUES ($1, $2, $3)", 
                name, err.Error(), false)
            if dbErr != nil {
                slog.Error("failed to insert invocation", "error", dbErr)
            }
        }
        return nil, mapError(err)
    }

    // Synchronous DB write for success
    if s.pool != nil {
        _, dbErr := s.pool.Exec(ctx, 
            "INSERT INTO invocations (name, result_message, success) VALUES ($1, $2, $3)", 
            name, result.message, true)
        if dbErr != nil {
            slog.Error("failed to insert invocation", "error", dbErr)
        }
    }

    return connect.NewResponse(&gatewayv1.InvokeCustomResponse{Message: result.message}), nil
}

HTTP Call Implementation

From services/internal/gateway/service.go:139-179:
func (s *Service) callCustom(ctx context.Context, name string) (invokeResult, error) {
    payload, err := json.Marshal(map[string]string{"name": name})
    if err != nil {
        return invokeResult{}, err
    }

    rpcCtx, cancel := context.WithTimeout(ctx, s.timeout)
    defer cancel()

    httpReq, err := http.NewRequestWithContext(
        rpcCtx, http.MethodPost, s.baseURL+"/invoke", bytes.NewReader(payload))
    if err != nil {
        return invokeResult{}, err
    }
    httpReq.Header.Set("content-type", "application/json")

    httpResp, err := s.httpClient.Do(httpReq)
    if err != nil {
        return invokeResult{}, err
    }
    defer httpResp.Body.Close()

    body, err := io.ReadAll(httpResp.Body)
    if err != nil {
        return invokeResult{}, err
    }

    if httpResp.StatusCode >= 400 {
        return invokeResult{}, &statusError{
            status: httpResp.StatusCode, 
            body:   string(body),
        }
    }

    var resp struct {
        Message string `json:"message"`
    }
    if err := json.Unmarshal(body, &resp); err != nil {
        return invokeResult{}, err
    }
    if resp.Message == "" {
        resp.Message = "custom-lang-service returned an empty message"
    }
    return invokeResult{message: resp.Message}, nil
}

Timeout Configuration

  • HTTP Call Timeout: 1 second per request
  • HTTP Client Timeout: 2 seconds total
  • Server Read Timeout: 5 seconds
  • Server Write Timeout: 30 seconds

Database Schema

The service uses an invocations table to track all requests:
CREATE TABLE invocations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    result_message TEXT,
    success BOOLEAN NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

Database Pattern

The Gateway service uses a synchronous write pattern for both successes and failures:
  • Attempts to record every invocation; insert errors are logged rather than returned to the caller
  • Records failure details for debugging
  • Adds ~1-5ms latency to responses
  • Helps track circuit breaker behavior

Error Handling

Error Mapping

From services/internal/gateway/service.go:201-217:
func mapError(err error) error {
    if errors.Is(err, gobreaker.ErrOpenState) {
        return connect.NewError(connect.CodeUnavailable, err)
    }
    if errors.Is(err, context.DeadlineExceeded) {
        return connect.NewError(connect.CodeDeadlineExceeded, err)
    }
    var netErr net.Error
    if errors.As(err, &netErr) && netErr.Timeout() {
        return connect.NewError(connect.CodeDeadlineExceeded, err)
    }
    var se *statusError
    if errors.As(err, &se) {
        return connect.NewError(MapHTTPStatusToConnectCode(se.status), err)
    }
    return connect.NewError(connect.CodeInternal, err)
}

HTTP Status to gRPC Code Mapping

From services/internal/gateway/service.go:219-243:
func MapHTTPStatusToConnectCode(status int) connect.Code {
    switch status {
    case http.StatusBadRequest:           // 400
        return connect.CodeInvalidArgument
    case http.StatusUnauthorized:         // 401
        return connect.CodeUnauthenticated
    case http.StatusForbidden:            // 403
        return connect.CodePermissionDenied
    case http.StatusNotFound:             // 404
        return connect.CodeNotFound
    case http.StatusConflict:             // 409
        return connect.CodeAlreadyExists
    case http.StatusTooManyRequests:      // 429
        return connect.CodeResourceExhausted
    case http.StatusBadGateway,           // 502
         http.StatusServiceUnavailable:   // 503
        return connect.CodeUnavailable
    case http.StatusGatewayTimeout:       // 504
        return connect.CodeDeadlineExceeded
    default:
        if status >= 500 {
            return connect.CodeInternal
        }
        return connect.CodeUnknown
    }
}
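Putting the mapping together with the special `name` values from the API Reference gives a quick lookup of what a client should see for each one. To stay free of third-party imports, the sketch below returns each Connect code's wire name as a plain string instead of a connect.Code. The names `codeName` and `specialNames` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// codeName mirrors MapHTTPStatusToConnectCode above, returning the Connect
// code's wire name as a string (a stand-in for connect.Code).
func codeName(status int) string {
	switch status {
	case http.StatusBadRequest:
		return "invalid_argument"
	case http.StatusUnauthorized:
		return "unauthenticated"
	case http.StatusForbidden:
		return "permission_denied"
	case http.StatusNotFound:
		return "not_found"
	case http.StatusConflict:
		return "already_exists"
	case http.StatusTooManyRequests:
		return "resource_exhausted"
	case http.StatusBadGateway, http.StatusServiceUnavailable:
		return "unavailable"
	case http.StatusGatewayTimeout:
		return "deadline_exceeded"
	default:
		if status >= 500 {
			return "internal"
		}
		return "unknown"
	}
}

// specialNames lists the request names that trigger error responses,
// with the HTTP status documented for each.
var specialNames = []struct {
	name   string
	status int
}{
	{"unauthorized", http.StatusUnauthorized},
	{"forbidden", http.StatusForbidden},
	{"notfound", http.StatusNotFound},
	{"conflict", http.StatusConflict},
	{"ratelimit", http.StatusTooManyRequests},
	{"unavailable", http.StatusServiceUnavailable},
}

func main() {
	for _, s := range specialNames {
		fmt.Printf("%-12s HTTP %d -> %s\n", s.name, s.status, codeName(s.status))
	}
}
```

Of these six, only "ratelimit" (429) and "unavailable" (503) are also retryable, so they are the ones that consume retry budget and count twice toward the circuit breaker when a retry is attempted.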

Service Dependencies

Upstream Dependencies

  • Custom Language Service: Required for invocation logic
  • PostgreSQL: Optional, service runs without DB but won’t persist logs

Downstream Consumers

  • Frontend: Via Traefik gateway
  • Direct gRPC Clients: Any Connect-compatible client

Testing

Test the service using grpcurl:
# Normal request
grpcurl -plaintext -d '{"name": "Alice"}' \
  localhost:8082 \
  gateway.v1.GatewayService/InvokeCustom

# Test error handling
grpcurl -plaintext -d '{"name": "unavailable"}' \
  localhost:8082 \
  gateway.v1.GatewayService/InvokeCustom

# Trigger circuit breaker (run 5+ times rapidly)
for i in {1..10}; do
  grpcurl -plaintext -d '{"name": "unavailable"}' \
    localhost:8082 \
    gateway.v1.GatewayService/InvokeCustom
done

# Health check
curl http://localhost:8082/healthz

Observability

Structured Logging

The service logs circuit breaker state changes and retry attempts. Database write failures are logged as well, for example:
slog.Error("failed to insert invocation", "error", dbErr)

Distributed Tracing

OpenTelemetry captures:
  • Circuit breaker executions
  • Retry attempts
  • HTTP client requests with otelhttp instrumentation
  • Database operations

Metrics

Circuit breaker metrics:
  • State transitions (closed → open → half-open)
  • Success/failure counts
  • Request counts per state

Performance Characteristics

  • Latency: ~5-15ms (with custom-lang-service)
  • Latency (circuit open): Less than 1ms (fail fast)
  • Retry throughput: Capped by the retry budget (burst of 20, refilled at 10 tokens/sec); first attempts do not consume tokens
  • Database Write: Synchronous, adds ~1-5ms
  • Timeout Budget: 1 second per attempt

Common Issues

Circuit Breaker Open

Error: Code.UNAVAILABLE: circuit breaker is open
Cause: 5 consecutive failures to Custom Language Service
Solution:
  • Wait 30 seconds for circuit to enter half-open state
  • Fix Custom Language Service if it’s down
  • Check error logs for underlying failure cause

Retry Budget Exhausted

Symptom: No retry attempts even for retryable errors
Cause: Token bucket is empty because retries have outpaced the budget (a burst of more than 20, or a sustained rate above 10 per second)
Solution: Wait for tokens to refill (10 per second) or reduce error rate

Custom Language Service Unavailable

Error: Code.UNAVAILABLE: call failed
Solution: Ensure Custom Language Service is running at CUSTOM_LANG_BASE_URL
