
Overview

Permission Mongo is optimized for 50K+ QPS workloads through careful tuning of connection pools, caching strategies, and lock-free data structures. This guide covers performance optimization techniques and benchmarking.

Connection Pooling

MongoDB Connection Pool

The MongoDB driver maintains a connection pool for efficient resource utilization:
```go
// Default MongoDB pool configuration
opts := options.Client().
    ApplyURI(uri).
    SetMaxPoolSize(100).
    SetMinPoolSize(10).
    SetServerSelectionTimeout(5 * time.Second).
    SetConnectTimeout(10 * time.Second)
```
Key Parameters:
| Parameter | Default | Description |
| --- | --- | --- |
| `MaxPoolSize` | 100 | Maximum concurrent connections |
| `MinPoolSize` | 10 | Minimum idle connections |
| `ServerSelectionTimeout` | 5s | Timeout for server selection |
| `ConnectTimeout` | 10s | Initial connection timeout |
Tuning Guidelines:
  • High read workload: Increase MaxPoolSize to 200-300
  • Low latency requirement: Keep MinPoolSize high (50-100) to avoid connection warm-up
  • Distributed deployment: Set ServerSelectionTimeout to 2-3s for faster failover
```yaml
# config.yaml
mongodb:
  uri: "mongodb://localhost:27017"
  database: "permission_mongo"
  max_pool_size: 200
  min_pool_size: 50
```

Redis Connection Pool

Redis caching layer is configured for high throughput:
```go
// Default Redis pool settings (pkg/cache/redis.go:33-42)
const (
    DefaultPoolSize     = 500
    DefaultMinIdleConns = 50
    DefaultPoolTimeout  = 2 * time.Second
    DefaultReadTimeout  = 1 * time.Second
    DefaultWriteTimeout = 1 * time.Second
    DefaultDialTimeout  = 2 * time.Second
)
```
Pool Configuration:
```go
opts := &redis.Options{
    Addr:         "localhost:6379",
    PoolSize:     500, // Max active connections
    MinIdleConns: 50,  // Keep-alive connections
    PoolTimeout:  2 * time.Second,
    ReadTimeout:  1 * time.Second,
    WriteTimeout: 1 * time.Second,
}
```
Tuning for Scale:
  • 50K+ QPS: Use PoolSize: 500-1000
  • Low latency: Set ReadTimeout and WriteTimeout to 500ms
  • High write volume: Increase WriteTimeout to 2-3s
```yaml
# config.yaml
redis:
  url: "localhost:6379"
  pool_size: 1000
  min_idle_conns: 100
  read_timeout: 500ms
  write_timeout: 1s
```

Lock-Free Optimizations

AST Caching with sync.Map

RBAC expression compilation uses a thread-safe cache to avoid re-parsing:
```go
// pkg/rbac/compiler.go:36-38
type Compiler struct {
    astCache sync.Map // map[string]Expression - thread-safe
}
```
Benefits:
  • Zero lock contention for reads
  • Amortized parsing cost across requests
  • Memory efficient - caches only unique expressions
Cache Statistics:
```go
// Monitor cache effectiveness
compiler := rbac.NewCompiler()
size := compiler.CacheSize() // Number of cached expressions
compiler.ClearCache()        // Clear on policy reload
```
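The pattern behind such a cache can be sketched with `sync.Map`'s `LoadOrStore`: a fast-path `Load` for the common cache-hit case, and a racy-but-safe publish on miss. The `Expression` type and `parse` function below are simplified stand-ins for the real `pkg/rbac` types, not the actual implementation:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Expression is a stand-in for the compiled AST type in pkg/rbac.
type Expression struct{ tokens []string }

// parse is a hypothetical, trivially cheap parser used for illustration.
func parse(src string) Expression {
	return Expression{tokens: strings.Fields(src)}
}

// Compiler caches compiled expressions keyed by their source text.
type Compiler struct {
	astCache sync.Map // map[string]Expression
}

// Compile returns the cached AST when present; otherwise it parses and
// publishes the result with LoadOrStore, so concurrent callers never
// block on a mutex. Two goroutines may parse the same expression once
// each, but only one result is kept.
func (c *Compiler) Compile(src string) Expression {
	if v, ok := c.astCache.Load(src); ok {
		return v.(Expression)
	}
	v, _ := c.astCache.LoadOrStore(src, parse(src))
	return v.(Expression)
}

// CacheSize counts cached entries by ranging (sync.Map has no Len).
func (c *Compiler) CacheSize() int {
	n := 0
	c.astCache.Range(func(_, _ any) bool { n++; return true })
	return n
}

func main() {
	c := &Compiler{}
	c.Compile(`user.role == "admin"`)
	c.Compile(`user.role == "admin"`) // cache hit, no re-parse
	fmt.Println(c.CacheSize()) // 1
}
```

The duplicate-parse race on a cold key is the usual trade-off for avoiding a mutex: parsing twice occasionally is cheaper than contending on every lookup.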

Atomic Operations for Metrics

Prometheus metrics use atomic counters, keeping instrumentation overhead negligible on hot request paths:
```go
// Lock-free metric updates
metrics.MongoOperationsTotal.WithLabelValues(collection, "insert").Inc()
metrics.CacheHits.WithLabelValues("policy").Inc()
```
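Under the hood, lock-free counters reduce to `sync/atomic` operations: increments are a single atomic add rather than a mutex acquire, so concurrent request handlers never serialize on instrumentation. A minimal sketch (the `counter` type here is illustrative, not the Prometheus client implementation):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counter is a minimal lock-free metric: Inc is one atomic add, so
// goroutines on the hot path never block on each other.
type counter struct {
	n atomic.Int64
}

func (c *counter) Inc()         { c.n.Add(1) }
func (c *counter) Value() int64 { return c.n.Load() }

func main() {
	var hits counter
	var wg sync.WaitGroup
	// 8 goroutines increment concurrently; no increments are lost.
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				hits.Inc()
			}
		}()
	}
	wg.Wait()
	fmt.Println(hits.Value()) // 8000
}
```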

Async Audit Logging

Audit logs are batched and written asynchronously so they never block the request path.

Architecture:

```text
Request → Handler → [Async Channel] → Batch Writer → MongoDB
                      (non-blocking)
```
Configuration:
```yaml
audit:
  enabled: true
  collection: "audit_logs"
  batch_size: 100        # Batch writes
  flush_interval: 5s     # Max wait time
  buffer_size: 10000     # Channel capacity
```
Tuning:
  • High throughput: Increase batch_size to 500-1000
  • Low latency requirement: Reduce flush_interval to 1-2s
  • Memory constrained: Reduce buffer_size to 5000

HTTP Server Tuning

Permission Mongo uses optimized HTTP timeouts:
```yaml
server:
  host: "0.0.0.0"
  port: 8080
  read_timeout: 30s      # Request read timeout
  write_timeout: 30s     # Response write timeout
  idle_timeout: 60s      # Keep-alive timeout
  max_header_bytes: 1048576  # 1MB header limit
```
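Assuming the standard `net/http` stack, these settings map directly onto `http.Server` fields. A sketch of the wiring (the `newServer` helper is illustrative, not the service's actual startup code):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer translates the server: config block into http.Server fields.
func newServer(host string, port int) *http.Server {
	return &http.Server{
		Addr:           fmt.Sprintf("%s:%d", host, port),
		ReadTimeout:    30 * time.Second, // read_timeout
		WriteTimeout:   30 * time.Second, // write_timeout
		IdleTimeout:    60 * time.Second, // idle_timeout
		MaxHeaderBytes: 1 << 20,          // max_header_bytes: 1MB
	}
}

func main() {
	srv := newServer("0.0.0.0", 8080)
	fmt.Println(srv.Addr, srv.MaxHeaderBytes) // 0.0.0.0:8080 1048576
	// In production: log.Fatal(srv.ListenAndServe())
}
```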
Fasthttp Alternative (mentioned in README:213):

For ultra-high throughput (100K+ QPS), consider fasthttp:
```go
// Fasthttp tuned for 256K concurrent connections
server := &fasthttp.Server{
    Concurrency:  256000,
    ReadTimeout:  10 * time.Second,
    WriteTimeout: 10 * time.Second,
}
```

Benchmarks

Throughput Targets

| Metric | Target | Notes |
| --- | --- | --- |
| QPS | 50,000+ | With caching enabled |
| P50 latency | < 5ms | Cache-hit scenario |
| P99 latency | < 50ms | Including RBAC evaluation |
| Cache hit rate | > 90% | Policy and hierarchy |

Load Testing

Setup:
```bash
# Use vegeta for load testing
echo "GET http://localhost:8080/orders" | \
  vegeta attack -rate=50000 -duration=60s | \
  vegeta report
```
Expected Results (with optimized config):
```text
Requests      [total, rate, throughput]  3000000, 50000.00, 49800.00
Duration      [total, attack, wait]      60.2s, 60s, 200ms
Latencies     [mean, 50, 95, 99, max]    4.2ms, 3.8ms, 12ms, 45ms, 150ms
Success       [ratio]                    99.95%
```

Monitoring Performance

Key Metrics

Track these Prometheus metrics:
```promql
# Request rate
rate(permission_mongo_http_requests_total[5m])

# Cache hit ratio
sum(rate(permission_mongo_cache_hits_total[5m])) /
  (sum(rate(permission_mongo_cache_hits_total[5m])) +
   sum(rate(permission_mongo_cache_misses_total[5m])))

# MongoDB connection pool utilization
mongodb_connection_pool_in_use / mongodb_connection_pool_max

# P99 request latency
histogram_quantile(0.99,
  rate(permission_mongo_http_request_duration_seconds_bucket[5m]))
```

Grafana Dashboard

Included dashboard panels:
  • HTTP Request Rate - Requests/sec by endpoint
  • Cache Hit Ratio - Policy, hierarchy, schema cache effectiveness
  • MongoDB Pool Usage - Active connections vs max pool size
  • RBAC Evaluation Time - Expression compilation and evaluation latency
  • Audit Log Queue Depth - Async batch writer backlog

Performance Checklist

  • Enable Redis caching for policies and hierarchy
  • Tune MongoDB pool size based on workload (100-300 connections)
  • Configure Redis pool for high concurrency (500-1000 connections)
  • Enable async audit logging with batching
  • Set appropriate HTTP timeouts (read/write: 30s)
  • Monitor cache hit ratios (target >90%)
  • Create MongoDB indexes on frequently queried fields
  • Use connection keep-alive (MinPoolSize) for consistent latency

Troubleshooting

High Latency

Symptoms: P99 latency >100ms

Solutions:
  1. Check cache hit rate - should be >90%
  2. Verify MongoDB indexes exist on query fields
  3. Increase connection pool sizes
  4. Reduce RBAC expression complexity

Connection Pool Exhaustion

Symptoms: Timeout errors, connection refused

Solutions:
```yaml
# Increase pool limits
mongodb:
  max_pool_size: 300

redis:
  pool_size: 1000
  pool_timeout: 5s
```

Memory Pressure

Symptoms: High memory usage, OOM errors

Solutions:
  1. Reduce audit log buffer size
  2. Clear AST cache on policy reload
  3. Limit batch sizes for bulk operations
```go
// Clear AST cache periodically
compiler.ClearCache()
```

Next Steps

  • Caching Strategy: learn about Redis caching patterns
  • Expression Language: optimize RBAC expressions for performance
