Infrastructure overview
Ably’s platform is built primarily on Amazon’s cloud infrastructure. The platform is distributed across more than 15 physical datacenters within the AWS network, with 700+ edge locations globally through AWS CloudFront, ensuring there isn’t a single point of failure or congestion across the service.
Key infrastructure characteristics
Multi-region deployment
Servers distributed across 15+ physical datacenters within the AWS network
Global edge network
700+ edge locations globally through AWS CloudFront
Physical isolation
Each datacenter is physically isolated from others to prevent cascading failures
Independent scaling
Each datacenter scales independently to meet regional load
Datacenter distribution
Ably operates in multiple regions around the world, with each datacenter operating independently. This global distribution provides several benefits:
Geographic proximity
Clients are automatically connected to the nearest datacenter to reduce latency. This keeps round-trip times low regardless of user location.
Fault isolation
Each datacenter is physically isolated from the others, ensuring that a failure in one datacenter has no effect on any other datacenter. This isolation is critical for maintaining service availability during regional outages.
Data residency
The global distribution enables Ably to comply with data residency requirements by keeping data within specific geographic regions when required. This is important for customers operating in regulated industries or jurisdictions with strict data sovereignty laws.
Intelligent routing
Ably is designed to route messages using the fewest network hops, minimizing latency and maximizing performance for clients regardless of their location.
DNS-based routing
Ably uses DNS-based latency routing to direct clients to the nearest available datacenter. When a client performs a DNS lookup, the DNS service resolves to the datacenter closest to the client’s location.
Primary endpoint: main.realtime.ably.net
Ably’s DNS configuration uses a TTL of 60 seconds, allowing for relatively quick rerouting of traffic if a datacenter becomes unhealthy.
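As a rough illustration of the latency-routing idea described above, the following Python sketch picks the lowest-latency healthy datacenter from a table of measurements. The datacenter names, the latency figures, and the `pick_datacenter` helper are all hypothetical; this is only a sketch of the selection rule, not Ably's or any DNS provider's actual implementation.

```python
from typing import Optional


def pick_datacenter(latency_ms: dict[str, float], healthy: set[str]) -> Optional[str]:
    """Return the healthy datacenter with the lowest measured latency, or None."""
    # Ignore datacenters that are currently marked unhealthy.
    candidates = {dc: ms for dc, ms in latency_ms.items() if dc in healthy}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical latency measurements from one client's vantage point.
latency_ms = {"eu-west": 18.0, "us-east": 92.0, "ap-southeast": 210.0}

# All datacenters healthy: the nearest one is chosen.
print(pick_datacenter(latency_ms, {"eu-west", "us-east", "ap-southeast"}))  # eu-west

# Nearest datacenter unhealthy: the next-nearest is chosen instead.
print(pick_datacenter(latency_ms, {"us-east", "ap-southeast"}))  # us-east
```

Because a resolver may cache an answer for up to the 60-second TTL, a client can keep receiving the previous answer for roughly that long after the record changes.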
Fallback mechanisms
To address DNS limitations in failure scenarios, Ably implements a fallback mechanism in all client libraries:
- If a client cannot connect to the primary endpoint, it automatically attempts to connect using alternative endpoints
- Fallback endpoints include direct connections to specific datacenters
- A completely segregated secondary domain (ably-realtime.com) uses a different DNS provider
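The fallback behavior in the list above can be sketched as a simple ordered retry. This is a minimal illustration, not the logic of any Ably client library: the `connect_with_fallback` helper, the injected `try_connect` callback, and the `example.com` fallback hostnames are hypothetical placeholders, and real client libraries may order, time out, or retry hosts differently.

```python
from typing import Callable, Iterable


def connect_with_fallback(
    primary: str,
    fallbacks: Iterable[str],
    try_connect: Callable[[str], bool],
) -> str:
    """Try the primary host first, then each fallback in order.

    Returns the first host that accepts a connection and raises
    ConnectionError if every host fails.
    """
    for host in [primary, *fallbacks]:
        if try_connect(host):
            return host
    raise ConnectionError("all endpoints unreachable")


PRIMARY = "main.realtime.ably.net"
# Hypothetical placeholder hostnames, not real Ably fallback endpoints.
FALLBACKS = ["fallback-a.example.com", "fallback-b.example.com"]

# Simulate an outage of the primary endpoint only.
unreachable = {PRIMARY}
chosen = connect_with_fallback(PRIMARY, FALLBACKS, lambda host: host not in unreachable)
print(chosen)  # fallback-a.example.com
```

Passing the connection attempt in as a callback keeps the sketch free of network access, so the ordering logic can be exercised on its own.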
CloudFront and load balancing
AWS CloudFront
Client connections to Ably are handled through AWS CloudFront for global edge distribution. When a client attempts to connect to Ably, the request is first routed to the nearest of CloudFront’s more than 700 edge locations. This reduces the public internet transit time, as clients connect to a nearby edge node rather than traversing the entire distance to an Ably datacenter.
Network Load Balancers
Behind CloudFront, each Ably region employs AWS Network Load Balancers (NLBs) to distribute traffic to the application servers. NLBs:
- Operate at the transport layer
- Handle millions of requests per second
- Maintain ultra-low latencies
- Distribute traffic to frontend servers for establishing and maintaining client connections
Auto-healing and auto-scaling
Dynamic load assignment
Load is dynamically assigned and reassigned across servers in realtime. The service auto-heals and routes around network failures automatically.
Independent regional scaling
Each datacenter scales independently to meet the load within that region. Ably continuously monitors CPU, memory, and other key metrics, triggering autoscaling based on aggregated performance indicators.
Capacity management
All Ably infrastructure scales on demand to handle ambient traffic levels. The infrastructure is typically provisioned with significant headroom above current demand, ensuring that sudden increases in traffic can be accommodated without impacting service quality.
Infrastructure redundancy
Beyond DNS and client-side fallbacks, Ably’s infrastructure includes multiple layers of redundancy:
Datacenter redundancy
Each datacenter contains redundant servers, network paths, and storage systems to eliminate single points of failure.
Multi-region redundancy
The failure of an entire datacenter does not impact the availability of the service as a whole. Clients can continue to connect via other datacenters.
Edge redundancy
CloudFront is designed to be highly available, with redundant capacity across multiple edge locations.
Message persistence
Messages are persisted in multiple locations:
- Every message is stored in RAM in two or more physically isolated datacenters within the receiving region
- Every message is additionally stored in RAM in at least one other region
- For persisted messages, storage across three regions is required before the message is deemed successfully stored
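The replication rules in the list above can be expressed as two small checks. The following Python sketch is illustrative only: the `Replica` type, the function names, and the region and datacenter labels are hypothetical, and it models just the counting rules stated here, not how Ably actually tracks replicas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    region: str
    datacenter: str


def ram_rule_met(replicas: set[Replica], receiving_region: str) -> bool:
    """In-memory rule: at least two datacenters in the receiving region
    and at least one copy in some other region."""
    local_dcs = {r.datacenter for r in replicas if r.region == receiving_region}
    other_regions = {r.region for r in replicas if r.region != receiving_region}
    return len(local_dcs) >= 2 and len(other_regions) >= 1


def persisted_rule_met(regions_with_copy: set[str], required: int = 3) -> bool:
    """Persisted rule: stored in at least `required` distinct regions."""
    return len(regions_with_copy) >= required


# Hypothetical region and datacenter labels.
replicas = {
    Replica("eu-west", "dc-a"),
    Replica("eu-west", "dc-b"),
    Replica("us-east", "dc-c"),
}
print(ram_rule_met(replicas, receiving_region="eu-west"))   # True
print(persisted_rule_met({"eu-west", "us-east"}))            # False (only two regions)
```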
Network information
For detailed information about Ably’s global infrastructure:
- Network map: View Ably’s datacenters and global points of presence
- Status monitoring: Check the status of datacenters by region
- Latency metrics: See global round-trip latency statistics measured externally by Uptrends
Next steps
Edge network
Learn about Ably’s edge network architecture and DDoS protection
Fault tolerance
Understand how Ably maintains high availability
Performance
Explore Ably’s performance characteristics
Scalability
Discover how Ably achieves unlimited scalability
