Recovery Middleware

Overview

The recovery middleware catches panics that occur during request processing and converts them into proper error responses. This prevents the entire service from crashing when a panic occurs in a handler.

Installation

go get github.com/go-kratos/kratos/v2/middleware/recovery

Basic Usage

The Recovery function creates a panic recovery middleware:

func Recovery(opts ...Option) middleware.Middleware

Server Example

import (
    "log"

    "github.com/go-kratos/kratos/v2"
    "github.com/go-kratos/kratos/v2/middleware/recovery"
    "github.com/go-kratos/kratos/v2/transport/http"
    "github.com/go-kratos/kratos/v2/transport/grpc"
)

func main() {
    // Create HTTP server with recovery
    httpSrv := http.NewServer(
        http.Address(":8000"),
        http.Middleware(
            recovery.Recovery(),
        ),
    )
    
    // Create gRPC server with recovery
    grpcSrv := grpc.NewServer(
        grpc.Address(":9000"),
        grpc.Middleware(
            recovery.Recovery(),
        ),
    )
    
    app := kratos.New(
        kratos.Server(httpSrv, grpcSrv),
    )
    
    if err := app.Run(); err != nil {
        log.Fatal(err)
    }
}

Always place the recovery middleware first in the middleware chain to ensure it catches panics from all other middleware and handlers.

http.Middleware(
    recovery.Recovery(),    // Must be first!
    tracing.Server(),
    logging.Server(logger),
    metrics.Server(),
)
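Why the position matters can be seen in a small standard-library sketch. It assumes that the first middleware listed becomes the outermost wrapper; `chain`, `recoverMW`, `buggyMW`, and `run` are hypothetical helpers, not Kratos functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Handler and Middleware are hypothetical types mirroring the usual middleware shape.
type Handler func(req string) (string, error)
type Middleware func(Handler) Handler

// chain composes middleware so that the first argument is the outermost wrapper.
func chain(ms ...Middleware) Middleware {
	return func(next Handler) Handler {
		for i := len(ms) - 1; i >= 0; i-- {
			next = ms[i](next)
		}
		return next
	}
}

// recoverMW converts a panic raised anywhere inside it into an error.
func recoverMW(next Handler) Handler {
	return func(req string) (reply string, err error) {
		defer func() {
			if r := recover(); r != nil {
				reply, err = "", errors.New("recovered")
			}
		}()
		return next(req)
	}
}

// buggyMW stands in for a middleware that itself panics.
func buggyMW(next Handler) Handler {
	return func(req string) (string, error) {
		panic("bug in middleware")
	}
}

// run invokes h and reports whether a panic escaped it.
func run(h Handler) (escaped bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			escaped = true
		}
	}()
	_, err = h("req")
	return false, err
}

func main() {
	ok := func(req string) (string, error) { return "ok", nil }
	first := chain(recoverMW, buggyMW)(ok) // recovery is outermost
	last := chain(buggyMW, recoverMW)(ok)  // recovery sits inside the buggy middleware
	fmt.Println(run(first)) // false recovered
	fmt.Println(run(last))  // true <nil>
}
```

In the second ordering the panic is raised outside the recovery wrapper, so nothing in the chain can intercept it.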

Default Behavior

When a panic occurs, the middleware:

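1. Recovers from the panic.
2. Logs the panic value, the request, and a stack trace.
3. Stores the elapsed request time in the context under the Latency key.
4. Calls the recovery handler, which by default returns ErrUnknownRequest.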

Default Error

var ErrUnknownRequest = errors.InternalServer("UNKNOWN", "unknown request error")

This returns:

HTTP status: 500 Internal Server Error
gRPC code: INTERNAL

Custom Recovery Handler

You can provide a custom handler to process panics:

type HandlerFunc func(ctx context.Context, req, err any) error

WithHandler Option

func WithHandler(h HandlerFunc) Option
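Options of this shape usually follow the functional options pattern. The sketch below shows that pattern in isolation using only the standard library; `panicHandler`, `config`, `option`, `withHandler`, and `newConfig` are hypothetical names, and the code is not taken from the Kratos source.

```go
package main

import (
	"errors"
	"fmt"
)

// panicHandler converts a recovered panic value into an error (hypothetical type).
type panicHandler func(panicValue any) error

// config holds the settings an option may change.
type config struct {
	handler panicHandler
}

// option mutates config; each withX function returns one.
type option func(*config)

// withHandler overrides the default panic handler.
func withHandler(h panicHandler) option {
	return func(c *config) { c.handler = h }
}

// errDefault is what the default handler returns in this sketch.
var errDefault = errors.New("unknown request error")

// newConfig starts from defaults and applies the options in order.
func newConfig(opts ...option) config {
	c := config{handler: func(any) error { return errDefault }}
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	def := newConfig()
	custom := newConfig(withHandler(func(v any) error {
		return fmt.Errorf("handled panic: %v", v)
	}))
	fmt.Println(def.handler("boom"))    // unknown request error
	fmt.Println(custom.handler("boom")) // handled panic: boom
}
```

Callers that pass no options get the default behavior, and new options can be added later without changing the constructor's signature.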

Usage Example

import (
    "context"
    "fmt"
    
    "github.com/go-kratos/kratos/v2/errors"
    "github.com/go-kratos/kratos/v2/log"
    "github.com/go-kratos/kratos/v2/middleware/recovery"
)

// Custom recovery handler
func customRecoveryHandler(ctx context.Context, req, err any) error {
    // Log the panic
    log.Context(ctx).Errorf("Panic recovered: %v", err)
    
    // Get latency from context
    if latency, ok := ctx.Value(recovery.Latency{}).(float64); ok {
        log.Context(ctx).Infof("Request latency: %.3fs", latency)
    }
    
    // Return custom error based on panic type.
    // Echoing the panic text is for illustration only; see "Don't Expose Stack Traces".
    switch e := err.(type) {
    case string:
        if e == "not found" {
            return errors.NotFound("RESOURCE_NOT_FOUND", "resource not found")
        }
        return errors.InternalServer("PANIC", fmt.Sprintf("panic: %s", e))
    case error:
        return errors.InternalServer("PANIC", e.Error())
    default:
        return errors.InternalServer("PANIC", fmt.Sprintf("%v", e))
    }
}

// Use custom handler
recovery.Recovery(
    recovery.WithHandler(customRecoveryHandler),
)

Latency Context

When a panic occurs, the middleware stores the request latency in the context as a float64 number of seconds, keyed by the Latency type:

type Latency struct{}

You can retrieve the latency in your custom handler:

func customHandler(ctx context.Context, req, err any) error {
    // Get latency
    if latency, ok := ctx.Value(recovery.Latency{}).(float64); ok {
        fmt.Printf("Request took %.3f seconds before panic\n", latency)
    }
    
    return errors.InternalServer("PANIC", "panic occurred")
}

Stack Trace Logging

The middleware automatically logs the panic value, the request, and the stack trace of the panicking goroutine:

log.Context(ctx).Errorf("%v: %+v\n%s\n", rerr, req, buf)

Example Log Output

ERROR: runtime error: index out of range [5] with length 3: {"id":"123"}
goroutine 123 [running]:
github.com/go-kratos/kratos/v2/middleware/recovery.Recovery.func1.1()
    /go/pkg/mod/github.com/go-kratos/kratos/v2/middleware/recovery/recovery.go:54
panic({0x1234567, 0xc0001a4000})
    /usr/local/go/src/runtime/panic.go:890
main.(*service).GetUser(...)
    /app/service/user.go:42
...

Complete Example

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "github.com/go-kratos/kratos/v2"
    "github.com/go-kratos/kratos/v2/errors"
    kratoslog "github.com/go-kratos/kratos/v2/log"
    "github.com/go-kratos/kratos/v2/middleware/logging"
    "github.com/go-kratos/kratos/v2/middleware/recovery"
    "github.com/go-kratos/kratos/v2/transport/http"
)

// Custom panic handler with detailed logging
func panicHandler(logger kratoslog.Logger) recovery.HandlerFunc {
    return func(ctx context.Context, req, err any) error {
        // Log panic details
        helper := kratoslog.NewHelper(logger)
        
        // Get latency
        var latency float64
        if v, ok := ctx.Value(recovery.Latency{}).(float64); ok {
            latency = v
        }
        
        // Log with context
        helper.Errorw(
            "kind", "panic",
            "panic", fmt.Sprintf("%v", err),
            "request", fmt.Sprintf("%+v", req),
            "latency", latency,
        )
        
        // Send alert (e.g., to Sentry, PagerDuty)
        // sendAlert(ctx, err)
        
        // Return appropriate error
        return errors.InternalServer(
            "INTERNAL_ERROR",
            "An internal error occurred. Please try again later.",
        )
    }
}

func main() {
    // Create logger
    logger := kratoslog.NewStdLogger(os.Stdout)
    
    // Create HTTP server with recovery
    httpSrv := http.NewServer(
        http.Address(":8000"),
        http.Middleware(
            recovery.Recovery(
                recovery.WithHandler(panicHandler(logger)),
            ),
            logging.Server(logger),
        ),
    )
    
    app := kratos.New(
        kratos.Name("recovery-example"),
        kratos.Logger(logger),
        kratos.Server(httpSrv),
    )
    
    if err := app.Run(); err != nil {
        log.Fatal(err)
    }
}

Testing Recovery Middleware

package main

import (
    "context"
    "testing"

    "github.com/go-kratos/kratos/v2/errors"
    "github.com/go-kratos/kratos/v2/middleware/recovery"
)

func TestRecovery(t *testing.T) {
    // Handler that panics
    panicHandler := func(ctx context.Context, req any) (any, error) {
        panic("test panic")
    }
    
    // Apply recovery middleware
    handler := recovery.Recovery()(panicHandler)
    
    // Call handler
    reply, err := handler(context.Background(), "test")
    
    // Should not panic, should return error
    if err == nil {
        t.Error("expected error, got nil")
    }
    if reply != nil {
        t.Errorf("expected nil reply, got %v", reply)
    }
}

func TestCustomHandler(t *testing.T) {
    customErr := errors.BadRequest("CUSTOM", "custom error")
    
    // Custom handler
    customHandler := func(ctx context.Context, req, err any) error {
        return customErr
    }
    
    // Handler that panics
    panicHandler := func(ctx context.Context, req any) (any, error) {
        panic("test panic")
    }
    
    // Apply recovery with custom handler
    handler := recovery.Recovery(
        recovery.WithHandler(customHandler),
    )(panicHandler)
    
    // Call handler
    _, err := handler(context.Background(), "test")
    
    // Should return custom error
    if err != customErr {
        t.Errorf("expected custom error, got %v", err)
    }
}

Best Practices

Always Use Recovery Middleware

Every production service should use recovery middleware to prevent crashes. Place it first in the middleware chain.

http.Middleware(
    recovery.Recovery(), // First!
    // ... other middleware
)

Monitor and Alert on Panics

Implement a custom handler that sends alerts when panics occur. This helps you identify and fix bugs quickly.

// sendToSentry and notifyPagerDuty are placeholders for your own alerting code.
func alertingHandler(ctx context.Context, req, err any) error {
    sendToSentry(err)
    notifyPagerDuty("panic occurred")
    return errors.InternalServer("PANIC", "internal error")
}

Log Full Context

Always log the request, panic value, stack trace, and any relevant context when a panic occurs.

Don't Expose Stack Traces

Never return stack traces or internal details to clients. Use generic error messages.

// Good
return errors.InternalServer("INTERNAL", "an error occurred")

// Bad
return errors.InternalServer("PANIC", fmt.Sprintf("%+v", panicValue))

Track Metrics

Track panic frequency and rate to identify problematic code paths.

// panicCounter is a placeholder for a counter from your metrics library.
func metricsHandler(ctx context.Context, req, err any) error {
    panicCounter.Inc()
    return errors.InternalServer("PANIC", "internal error")
}
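If no metrics library is wired in yet, an atomic counter is enough to experiment with the idea. This is a standard-library sketch; `panicCount` and `guarded` are hypothetical names, and a production service would more likely export the value through its metrics system.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// panicCount is a hypothetical process-wide counter of recovered panics.
var panicCount atomic.Int64

// guarded runs fn and increments panicCount if fn panics.
func guarded(fn func()) {
	defer func() {
		if r := recover(); r != nil {
			panicCount.Add(1)
		}
	}()
	fn()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			guarded(func() {
				if i%2 == 0 {
					panic("simulated failure")
				}
			})
		}(i)
	}
	wg.Wait()
	fmt.Println("panics recovered:", panicCount.Load()) // 5 of the 10 calls panic
}
```

The atomic type makes the counter safe to update from concurrent requests without a mutex.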

Test Panic Scenarios

Write tests that verify your service handles panics gracefully.

Common Panic Scenarios

The recovery middleware helps protect against common panic scenarios:

Nil Pointer Dereference

func (s *service) GetUser(ctx context.Context, req *pb.GetUserRequest) (*pb.User, error) {
    user := s.findUser(req.Id)            // May return nil
    return &pb.User{Name: user.Name}, nil // Panics if user is nil
}

Index Out of Range

func (s *service) GetItem(ctx context.Context, req *pb.GetItemRequest) (*pb.Item, error) {
    items := s.getItems()
    return items[req.Index], nil // Panics if Index < 0 or Index >= len(items)
}

Type Assertion Failure

func (s *service) Process(ctx context.Context, req any) (any, error) {
    data := req.(string) // Panics if req is not a string
    return process(data), nil
}

Channel Operations

func (s *service) Send(ctx context.Context, msg string) error {
    s.ch <- msg // Panics if channel is closed
    return nil
}
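Sends on a closed channel can also be avoided by guarding the channel with a mutex and a closed flag. This is one possible standard-library sketch, not a recommendation from the Kratos documentation; `mailbox`, `errClosed`, and `errFull` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errClosed = errors.New("mailbox closed")
	errFull   = errors.New("mailbox full")
)

// mailbox guards a channel so that Send never runs after Close (hypothetical type).
type mailbox struct {
	mu     sync.Mutex
	ch     chan string
	closed bool
}

func newMailbox(size int) *mailbox {
	return &mailbox{ch: make(chan string, size)}
}

// Send returns an error instead of panicking when the mailbox is closed.
// It is non-blocking so the lock is never held while waiting for a receiver.
func (m *mailbox) Send(msg string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.closed {
		return errClosed
	}
	select {
	case m.ch <- msg:
		return nil
	default:
		return errFull
	}
}

// Close is idempotent; closing a channel twice would itself panic.
func (m *mailbox) Close() {
	m.mu.Lock()
	defer m.mu.Unlock()
	if !m.closed {
		m.closed = true
		close(m.ch)
	}
}

func main() {
	m := newMailbox(1)
	fmt.Println(m.Send("a")) // <nil>
	fmt.Println(m.Send("b")) // mailbox full
	m.Close()
	m.Close()                // safe to call twice
	fmt.Println(m.Send("c")) // mailbox closed
}
```

The trade-off is that a full buffer is reported as an error rather than applying back-pressure, which may or may not suit a given service.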

Source Reference

The recovery middleware implementation can be found in:

middleware/recovery/recovery.go:36 - Recovery middleware
middleware/recovery/recovery.go:19 - HandlerFunc type
middleware/recovery/recovery.go:13 - Latency context key
middleware/recovery/recovery.go:16 - ErrUnknownRequest

Next Steps

Logging

Add logging to track panic details

Metrics

Monitor service reliability
