## Core Concepts

### Service Registry Architecture

A service registry maintains a catalog of available services and their instances.

**Key information stored:**
- IP addresses and ports
- Service names and versions
- Health status
- Metadata (region, tags, capabilities)
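The stored fields above can be sketched as a plain data record. This is illustrative only, assuming nothing about any registry's actual schema; the names (`ServiceInstance`, `healthStatus`) are invented for the example:

```java
import java.util.Map;

// Hypothetical sketch of what a registry stores per service instance.
public record ServiceInstance(
        String serviceName,
        String version,
        String ip,
        int port,
        String healthStatus,          // e.g. "UP", "DOWN"
        Map<String, String> metadata  // region, tags, capabilities
) {
    public static void main(String[] args) {
        ServiceInstance inst = new ServiceInstance(
                "order-service", "1.2.0", "10.0.1.17", 8080,
                "UP", Map.of("region", "us-east-1"));
        System.out.println(inst.serviceName() + "@" + inst.ip() + ":" + inst.port());
    }
}
```

A real registry keys such records by service name so a lookup returns every live instance of that service.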
## Implementation Comparison
### Spring Cloud Eureka
**Origin:** Netflix (now maintained by the Spring Cloud community)
**CAP model:** AP (Availability + Partition Tolerance)
**Architecture:** client-server, with peer-to-peer replication between servers

#### Server Capabilities
- **Service Registration**: stores service metadata in a unified registry; clients register on startup.
- **Registry Table**: serves the service list to clients, which cache it locally and refresh every 30 seconds.
- **Service Eviction**: removes instances that haven't sent a heartbeat for 90 seconds (unless self-preservation is active).
- **Self-Preservation**: protects the registry during network instability; if fewer than 85% of expected heartbeats arrive within 15 minutes, Eureka stops evicting instances.
#### Client Operations
| Operation | Interval | Description |
|---|---|---|
| Register | On startup | Sends service info (IP, port, metadata) |
| Renew (Heartbeat) | Every 30s | HTTP request to confirm health |
| Fetch Registry | Every 30s | Updates local service cache |
| Cancel | On shutdown | Gracefully deregisters service |
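The four operations in the table can be sketched as a client lifecycle. This is a toy simulation, not Eureka's client: intervals are shortened from 30s to 100ms so the demo finishes quickly, and the registry calls are stubbed out with prints and a counter:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy sketch of the register / renew / fetch / cancel lifecycle.
public class EurekaClientLifecycle {
    static final AtomicInteger heartbeats = new AtomicInteger();

    static void register()      { System.out.println("REGISTER: send IP, port, metadata"); }
    static void renew()         { heartbeats.incrementAndGet(); }  // heartbeat HTTP call
    static void fetchRegistry() { /* refresh the local service cache */ }
    static void cancel()        { System.out.println("CANCEL: deregister gracefully"); }

    public static void main(String[] args) throws Exception {
        register();                                               // on startup
        ScheduledExecutorService timer = Executors.newScheduledThreadPool(1);
        timer.scheduleAtFixedRate(EurekaClientLifecycle::renew, 0, 100, TimeUnit.MILLISECONDS);
        timer.scheduleAtFixedRate(EurekaClientLifecycle::fetchRegistry, 0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(350);                                        // let a few heartbeats fire
        timer.shutdown();
        cancel();                                                 // on shutdown
        System.out.println("multiple heartbeats sent: " + (heartbeats.get() >= 2));
    }
}
```

In a real client the renew and fetch calls go over HTTP to the Eureka server, and the scheduler runs for the life of the process.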
#### Health Monitoring Workflow

The server checks whether an instance has missed heartbeats for 90 seconds:
- If fewer than 85% of instances have renewed in the last 15 minutes → enter self-preservation mode (no evictions)
- If 85% or more have renewed → evict the unhealthy instance
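The eviction decision above reduces to a threshold check: a low overall renewal rate suggests a network problem rather than many simultaneous instance failures. This is a sketch using the 85%/15-minute figures from the text, not Eureka's actual implementation:

```java
// Sketch of the eviction decision: compare renewals received in the window
// against the expected count; below the 85% threshold, suspend eviction.
public class EvictionPolicy {
    static final double RENEWAL_THRESHOLD = 0.85;

    // true => self-preservation: keep all instances, even stale ones
    static boolean selfPreservationActive(int expectedRenewals, int actualRenewals) {
        return actualRenewals < expectedRenewals * RENEWAL_THRESHOLD;
    }

    public static void main(String[] args) {
        // 100 instances expected to renew, only 80 did: likely a network
        // problem, so evicting would wrongly drop healthy instances.
        System.out.println(selfPreservationActive(100, 80));
        // 98 of 100 renewed: the 2 silent instances are individually
        // unhealthy, so evicting them is safe.
        System.out.println(selfPreservationActive(100, 98));
    }
}
```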
#### High Availability Cluster

**Cluster architecture: peer-to-peer replication**
- No master/slave distinction; all nodes are equal
- Asynchronous data replication; eventually consistent (AP model)

**When a node fails:**
- Clients keep working from their cached service lists
- The remaining nodes handle incoming requests
- The failed node syncs the latest registry data when it recovers
## Decision Matrix
### When to Choose Eureka

✅ **Best for:**
- Spring Cloud microservices
- Small to medium service counts (fewer than 10,000 instances)
- High availability requirements
- Simple setup requirements

❌ **Avoid when:**
- You need strong consistency guarantees
- You run at massive scale (>10,000 instances)
- You also require configuration management
### When to Choose ZooKeeper

✅ **Best for:**
- Distributed coordination (leader election, locks)
- Configuration management
- Hadoop/HBase ecosystems
- Strong consistency requirements

❌ **Avoid when:**
- Availability is critical
- You cannot tolerate 30-120s of downtime during leader elections
- Your primary use case is service discovery
### When to Choose Nacos

✅ **Best for:**
- Large-scale deployments (10,000+ instances)
- Needing both service discovery AND config management
- A flexible CAP model (switch between AP and CP)
- Kubernetes environments
- Dubbo or Spring Cloud ecosystems

❌ **Avoid when:**
- Your team is unfamiliar with the Alibaba ecosystem
- Use cases are simple (the overhead isn't justified)
### When to Choose Consul

✅ **Best for:**
- Service mesh architectures
- Multi-datacenter deployments
- Built-in health checks
- HashiCorp ecosystem integration

❌ **Avoid when:**
- Your stack is Java-only (Consul is Go-based, making in-house debugging harder)
- Your team lacks Go expertise
## Comparison Table
| Feature | Eureka | ZooKeeper | Nacos | Consul |
|---|---|---|---|---|
| CAP Model | AP | CP | AP & CP | CP |
| Language | Java | Java/C | Java | Go |
| Health Check | Client heartbeat | Socket keep-alive | HTTP heartbeat | Multiple options |
| Watch Support | Polling (periodic fetch) | Push | Push/Pull | Long polling |
| Scale Limit | ~10K instances | Medium | 100K+ instances | Large |
| UI Dashboard | Basic | None | Rich | Rich |
| Spring Cloud | Native | Supported | Supported | Supported |
| Config Center | No | No | Yes | Yes |
| K8s Integration | Limited | Limited | Excellent | Excellent |
| Operational Complexity | Low | Medium | Medium | Medium-High |
## Implementation Example
### Eureka Client
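A typical Spring Cloud Eureka client is configured in `application.yml`. The property names below are the standard spring-cloud-netflix keys; the service name and registry address are placeholders for the example:

```yaml
# Minimal Eureka client configuration (application.yml).
spring:
  application:
    name: order-service                # placeholder service name
eureka:
  client:
    service-url:
      defaultZone: http://localhost:8761/eureka/   # registry address (placeholder)
    registry-fetch-interval-seconds: 30            # local cache refresh
  instance:
    lease-renewal-interval-in-seconds: 30          # heartbeat interval
    lease-expiration-duration-in-seconds: 90       # eviction timeout
```

The 30s/30s/90s values mirror the defaults described in the client operations table above; with the `spring-cloud-starter-netflix-eureka-client` dependency on the classpath, registration happens automatically on startup.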
### Nacos Client
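With Spring Cloud Alibaba, Nacos service discovery is likewise driven by `application.yml`. The keys below are the standard `spring.cloud.nacos.discovery` properties; server address and namespace are placeholders:

```yaml
# Minimal Nacos discovery configuration (application.yml).
spring:
  application:
    name: order-service                # placeholder service name
  cloud:
    nacos:
      discovery:
        server-addr: 127.0.0.1:8848    # Nacos server (placeholder)
        namespace: dev                 # optional environment isolation (placeholder)
        ephemeral: true                # true = AP-mode ephemeral instance;
                                       # false = CP-mode persistent instance
```

The `ephemeral` flag is what lets Nacos switch between the AP and CP behavior noted in the decision matrix on a per-instance basis.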
## Design Considerations
### Client-Side Caching

**Why it matters:**
- Reduces registry load
- Improves performance
- Provides a fallback during registry outages

**Best practices:**
- Cache service lists locally
- Refresh periodically (30s is typical)
- Handle cache invalidation properly
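The caching pattern above can be sketched as a small TTL cache. `ServiceCache` and its members are invented for this example; a real client would also refresh in the background and keep serving stale entries if the registry is down:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a client-side service cache: entries older than the TTL are
// re-fetched from the registry; fresh entries are served locally.
public class ServiceCache {
    private record Entry(List<String> instances, long fetchedAt) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> fetcher;  // calls the registry
    private final long ttlMillis;

    ServiceCache(Function<String, List<String>> fetcher, long ttlMillis) {
        this.fetcher = fetcher;
        this.ttlMillis = ttlMillis;
    }

    List<String> instancesOf(String service) {
        Entry e = cache.get(service);
        if (e == null || System.currentTimeMillis() - e.fetchedAt > ttlMillis) {
            e = new Entry(fetcher.apply(service), System.currentTimeMillis());
            cache.put(service, e);
        }
        return e.instances();
    }

    public static void main(String[] args) {
        int[] registryCalls = {0};
        ServiceCache cache = new ServiceCache(
                svc -> { registryCalls[0]++; return List.of("10.0.0.1:8080"); },
                30_000);
        cache.instancesOf("order-service");
        cache.instancesOf("order-service");  // within TTL: served from cache
        System.out.println("registry calls: " + registryCalls[0]);
    }
}
```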
### Network Partitions

**Scenarios to handle:**
- Registry server unreachable
- Service instance unreachable
- Split-brain in clustered registries

**Mitigations:**
- Client-side circuit breakers
- Retries with exponential backoff
- Active health checks
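Exponential backoff, one of the mitigations above, is a short pure function. The base and cap values here are illustrative; a production client would also add random jitter and a retry budget:

```java
// Sketch of exponential backoff for retrying registry calls:
// delay = base * 2^attempt, capped so retries never wait unboundedly.
public class Backoff {
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis << Math.min(attempt, 16);  // clamp the shift
        return Math.min(delay, capMillis);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 5; attempt++)
            System.out.println("attempt " + attempt + " -> wait "
                    + delayMillis(attempt, 100, 5_000) + "ms");
    }
}
```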
### Multi-Datacenter

**Considerations:**
- Cross-region latency
- Data consistency across DCs
- Failover strategies

**Strategies:**
- Region-aware load balancing
- Prefer local services
- One registry cluster per datacenter
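The "prefer local services" strategy can be sketched as a simple filter-with-fallback over the discovered instances. The class and record names are invented for the example:

```java
import java.util.List;

// Sketch of region-aware selection: prefer instances in the caller's region,
// falling back to the full list only when no local instance exists.
public class RegionAwareSelector {
    record Instance(String addr, String region) {}

    static List<Instance> candidates(List<Instance> all, String localRegion) {
        List<Instance> local = all.stream()
                .filter(i -> i.region().equals(localRegion))
                .toList();
        return local.isEmpty() ? all : local;  // failover to remote regions
    }

    public static void main(String[] args) {
        List<Instance> all = List.of(
                new Instance("10.0.0.1:8080", "us-east-1"),
                new Instance("10.1.0.1:8080", "eu-west-1"));
        System.out.println(candidates(all, "us-east-1").size());  // local match only
        System.out.println(candidates(all, "ap-south-1").size()); // no local: use all
    }
}
```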
### Security

**Protect your registry:**
- Enable authentication/authorization
- Use TLS for all communication
- Implement rate limiting
- Apply network segmentation
## Related Topics

- **Load Balancing**: how to distribute traffic across discovered services
- **Distributed Systems**: broader distributed systems concepts