Designing a Chat Application

Overview

Chat applications such as WhatsApp, Facebook Messenger, and Discord deliver billions of messages every day. This case study explores two approaches to building one: a simplified 1-to-1 chat using Redis pub/sub, and a more comprehensive production-grade architecture.

Understanding both simple and complex chat architectures helps you choose the right approach based on scale requirements.

Approach 1: Simple Chat with Redis

Redis-based Chat Application Architecture

A simple chat application can leverage Redis pub/sub functionality for real-time messaging.

Stage 1: Connection Initialization

Let’s walk through how Bob connects to the chat application:

Client Connection

Steps 1-2: Bob opens the chat application. A WebSocket connection is established between the client and the server for bidirectional, real-time communication.

Redis Setup

Steps 3-4: The pub-sub server establishes multiple connections to Redis:

One connection to update Redis data models and publish messages to topics
Multiple connections to subscribe and listen for updates on different topics

Initial Data Load

Steps 5-6: Bob’s client requests:

Chat member list (who’s available)
Historical message list (previous conversations)

This information is retrieved from Redis and sent to the client.

Presence Update

Steps 7-8: Since Bob is a new member joining the chat, a message is published to the member_add topic. Other participants’ clients receive this update and can see Bob is now online.
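
The subscribe-then-publish pattern behind the presence update can be sketched with an in-memory stand-in for Redis pub/sub. The `InMemoryPubSub` class is a hypothetical name used only for illustration; it is not part of Redis or any client library, and a real server would issue SUBSCRIBE and PUBLISH over its Redis connections instead.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryPubSub:
    """Hypothetical in-process stand-in for Redis pub/sub (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to the current subscribers; return how many received it."""
        for callback in self._subscribers[topic]:
            callback(payload)
        return len(self._subscribers[topic])


# Alice is already connected and listening for membership changes.
broker = InMemoryPubSub()
alice_inbox: List[str] = []
broker.subscribe("member_add", alice_inbox.append)

# Bob joins: the server announces him on the member_add topic.
receivers = broker.publish("member_add", "Bob")
print(alice_inbox, receivers)
```

Only clients subscribed at the moment of publishing receive the update, which is why a newly connected client still needs the initial member list from Steps 5-6.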

Stage 2: Message Handling

When Bob sends a message to Alice:

Send Message

Step 1: Bob sends a message to Alice through the WebSocket connection.

Persist & Publish

Step 2: The server performs two operations:

Adds the message to a Redis SortedSet using ZADD (sorted by timestamp)
Publishes the message to the messages topic for subscribers

Receive Message

Step 3: Alice’s client, subscribed to the messages topic, receives the chat message in real-time.

Key Redis Data Structures:

# Store messages in a sorted set (sorted by timestamp)
ZADD chat:room:123 1678901234 '{"user":"Bob","msg":"Hello Alice!"}'

# Publish message to subscribers
PUBLISH messages '{"room":"123","user":"Bob","msg":"Hello Alice!"}'

# Store online members
SADD chat:room:123:members "Bob" "Alice"
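
To show how a timestamp-scored history supports range reads, here is a minimal in-memory model. `MessageHistory` and its method names are hypothetical; `add` plays the role of ZADD and `between` plays the role of a score-range query on the sorted set. Unlike a real sorted set, this model does not deduplicate identical members.

```python
import bisect
from typing import List, Tuple


class MessageHistory:
    """Hypothetical in-memory model of a timestamp-scored sorted set."""

    def __init__(self) -> None:
        # Kept sorted by (timestamp, payload) at all times.
        self._entries: List[Tuple[int, str]] = []

    def add(self, timestamp: int, payload: str) -> None:
        # Insert while preserving timestamp order.
        bisect.insort(self._entries, (timestamp, payload))

    def between(self, start: int, end: int) -> List[str]:
        # Return payloads with start <= timestamp <= end, oldest first.
        lo = bisect.bisect_left(self._entries, (start, ""))
        hi = bisect.bisect_left(self._entries, (end + 1, ""))
        return [payload for _, payload in self._entries[lo:hi]]


history = MessageHistory()
history.add(1678901300, "Hi Bob!")
history.add(1678901234, "Hello Alice!")   # arrives out of order
history.add(1678901400, "How are you?")
recent = history.between(1678901234, 1678901300)
print(recent)
```

Messages come back in timestamp order regardless of insertion order, which is the property the historical message list relies on.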

Limitations of Redis Approach

While simple, this Redis-based approach has limitations:

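No delivery guarantee: Redis pub/sub is fire-and-forget, so a client that is offline when a message is published never receives it through its subscription
History lives outside pub/sub: Pub/sub keeps no record of past messages, so a reconnecting client must re-read the SortedSet, and that history is only as durable as the Redis persistence configuration
Single point of failure: With a single Redis instance, one outage takes down both real-time delivery and stored history
Limited scalability: A single Redis instance caps both the number of client connections and the publish throughput the system can sustain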

Approach 2: Production-Grade Architecture

Production Chat Application Architecture

A scalable chat application for millions of users requires a more sophisticated architecture.

User Login Flow

Establish Connection

Step 1: Alice logs in and establishes a WebSocket connection with the server. The connection is stateful and persistent.

Update Presence

Steps 2-4:

The presence service receives Alice’s connection notification
Updates Alice’s status to “online” in the presence database
Notifies Alice’s friends about her online status

Messaging Flow

When Alice sends a message to Bob:

Send Message

Steps 1-2: Alice sends a chat message to Bob. The message is routed to Chat Service A (the service instance handling Alice’s connection).

Generate ID & Persist

Steps 3-4:

Message is sent to the Sequencing Service, which generates a globally unique, ordered message ID
Message is persisted in the Message Store (database) for durability and history

Queue for Delivery

Step 5: The message is sent to the Message Sync Queue to be synchronized to Bob’s chat service.

Check Recipient Status

Step 6: The Message Sync Service checks Bob’s presence:

If Bob is online:

Message is sent to Chat Service B (handling Bob’s connection)
Delivered via WebSocket in real-time

If Bob is offline:

Message is sent to the Push Notification Server
Push notification sent to Bob’s device

Deliver to Recipient

Steps 7-8: If Bob is online, the message is pushed to Bob’s client through the WebSocket connection.
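
The presence check in Step 6 can be sketched as a small routing decision. This is an illustration under stated assumptions: `MessageSyncService`, `websocket_outbox`, and `push_outbox` are hypothetical names, and the two outboxes are in-memory stand-ins for Chat Service B and the Push Notification Server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MessageSyncService:
    """Hypothetical router: real-time delivery if online, push notification otherwise."""

    online_users: Set[str] = field(default_factory=set)
    websocket_outbox: Dict[str, List[str]] = field(default_factory=dict)
    push_outbox: Dict[str, List[str]] = field(default_factory=dict)

    def deliver(self, recipient: str, message: str) -> str:
        if recipient in self.online_users:
            # Stand-in for forwarding to the chat service that holds the socket.
            self.websocket_outbox.setdefault(recipient, []).append(message)
            return "websocket"
        # Stand-in for handing the message to the push notification server.
        self.push_outbox.setdefault(recipient, []).append(message)
        return "push"


sync = MessageSyncService(online_users={"bob"})
first = sync.deliver("bob", "Hello Bob!")        # Bob is online
sync.online_users.discard("bob")                 # Bob disconnects
second = sync.deliver("bob", "Are you there?")   # Bob is now offline
print(first, second)
```

Because the message was already persisted in Steps 3-4, either branch can fail and be retried without losing the message itself.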

Key Components Explained

Chat Service (Stateful)

Purpose: Maintain WebSocket connections with clients

Characteristics:

Each instance handles thousands of concurrent WebSocket connections
Routes incoming messages to appropriate services
Delivers outgoing messages to connected clients
Must be horizontally scalable

Challenges:

Session affinity: Same client should reconnect to same instance
Connection state management
Graceful handling of disconnections

Presence Service

Purpose: Track online/offline status of users

Features:

Real-time status updates
Last seen timestamps
User availability (online, away, busy, offline)
Heartbeat mechanism to detect disconnections

Implementation:

# Redis example
HSET user:alice:presence status "online" last_seen "1678901234"
# Auto-expire the key if no heartbeat refreshes it within 300 seconds
EXPIRE user:alice:presence 300
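
The heartbeat-plus-expiry idea can be modelled without Redis. The `PresenceTracker` class below is a hypothetical sketch: each heartbeat refreshes a timestamp, and a user counts as online only while the latest heartbeat is younger than the TTL. The injected `clock` parameter is an assumption added so the example is deterministic.

```python
import time
from typing import Callable, Dict


class PresenceTracker:
    """Hypothetical in-memory presence table with heartbeat expiry."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_heartbeat: Dict[str, float] = {}

    def heartbeat(self, user: str) -> None:
        # Each heartbeat restarts the expiry window, like re-running EXPIRE.
        self._last_heartbeat[user] = self._clock()

    def is_online(self, user: str) -> bool:
        seen = self._last_heartbeat.get(user)
        return seen is not None and self._clock() - seen < self._ttl


# A fake clock keeps the example deterministic.
now = [0.0]
presence = PresenceTracker(ttl_seconds=300.0, clock=lambda: now[0])
presence.heartbeat("alice")
now[0] = 299.0
online_before = presence.is_online("alice")   # still inside the window
now[0] = 300.0
online_after = presence.is_online("alice")    # window has elapsed
print(online_before, online_after)
```

The TTL should be a few multiples of the client heartbeat interval so that one dropped heartbeat does not flip a user to offline.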

Sequencing Service

Purpose: Generate globally unique, ordered message IDs

Why needed:

Ensure messages appear in correct order across all clients
Provide unique identifier for each message
Enable efficient message synchronization

Approaches:

Snowflake ID: Timestamp + machine ID + sequence number
Database sequence: Use database auto-increment (limited scalability)
Distributed ID generator: A dedicated ID service built on a scheme such as Twitter Snowflake or Instagram’s ID generator
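
A Snowflake-style generator can be sketched in a few lines. The bit layout below (41-bit timestamp, 10-bit machine ID, 12-bit sequence) follows the commonly described Snowflake format; the `SnowflakeGenerator` class name, the `EPOCH_MS` value, and the injectable `clock_ms` parameter are assumptions made for this illustration.

```python
import threading
import time
from typing import Callable


class SnowflakeGenerator:
    """Sketch of a Snowflake-style ID: 41-bit timestamp | 10-bit machine | 12-bit sequence."""

    EPOCH_MS = 1_600_000_000_000  # assumed custom epoch, not taken from the source
    MACHINE_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, machine_id: int,
                 clock_ms: Callable[[], int] = lambda: time.time_ns() // 1_000_000) -> None:
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id must fit in 10 bits")
        self._machine_id = machine_id
        self._clock_ms = clock_ms
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - self.EPOCH_MS) << (self.MACHINE_BITS + self.SEQUENCE_BITS))
                    | (self._machine_id << self.SEQUENCE_BITS)
                    | self._sequence)


generator = SnowflakeGenerator(machine_id=7)
ids = [generator.next_id() for _ in range(1000)]
print(ids[0], ids[-1])
```

IDs from a single generator are strictly increasing. Across machines they are only roughly time-ordered, because ordering then depends on how closely the machine clocks agree.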

Message Store

Purpose: Persist all messages for history and recovery

Requirements:

High write throughput (millions of messages per second)
Fast retrieval of recent messages
Long-term storage of message history
Support for pagination

Database choices:

Cassandra: High write throughput, good for time-series data
MongoDB: Flexible schema, good query capabilities
HBase: Scalable, column-oriented storage
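
Pagination over message history is often done with a cursor rather than an offset. The sketch below assumes message IDs increase over time, as the Sequencing Service is meant to guarantee; `fetch_page` and the in-memory `conversation` list are hypothetical stand-ins for a query against the real store.

```python
from typing import List, Optional, Tuple

Message = Tuple[int, str]  # (message_id, body); IDs assumed to increase over time


def fetch_page(messages: List[Message], before_id: Optional[int], limit: int) -> List[Message]:
    """Return up to `limit` messages older than `before_id`, newest first."""
    older = [m for m in messages if before_id is None or m[0] < before_id]
    return sorted(older, key=lambda m: m[0], reverse=True)[:limit]


conversation = [(101, "hi"), (102, "hello"), (103, "how are you?"), (104, "fine"), (105, "bye")]

# First page: the newest messages. Next page: everything older than the last ID seen.
page1 = fetch_page(conversation, before_id=None, limit=2)
page2 = fetch_page(conversation, before_id=page1[-1][0], limit=2)
print(page1, page2)
```

A cursor based on the last seen ID stays stable while new messages arrive, whereas an offset would shift and cause duplicated or skipped rows.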

Message Sync Queue

Purpose: Decouple message sending from delivery

Benefits:

Handle bursts of messages
Retry failed deliveries
Support offline delivery
Enable message ordering guarantees

Technologies: Kafka, RabbitMQ, AWS SQS
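
Retrying failed deliveries can be illustrated with a plain in-memory queue. This is a sketch, not how Kafka, RabbitMQ, or SQS implement retries: `drain_with_retries`, the `send` callback, and the returned dead-letter list are hypothetical names introduced here.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def drain_with_retries(queue: Deque[Tuple[str, int]],
                       send: Callable[[str], bool],
                       max_attempts: int = 3) -> List[str]:
    """Process queued messages; re-enqueue failures until max_attempts is reached.

    Returns the messages that were given up on (a stand-in for a dead-letter queue).
    """
    dead_letters: List[str] = []
    while queue:
        message, attempts = queue.popleft()
        if send(message):
            continue
        if attempts + 1 < max_attempts:
            queue.append((message, attempts + 1))
        else:
            dead_letters.append(message)
    return dead_letters


attempt_log: List[str] = []


def flaky_send(message: str) -> bool:
    # "m1" succeeds on its second attempt; "m2" never succeeds.
    attempt_log.append(message)
    if message == "m2":
        return False
    return attempt_log.count(message) > 1


pending = deque([("m1", 0), ("m2", 0)])
dead = drain_with_retries(pending, flaky_send)
print(dead, attempt_log)
```

Re-enqueueing at the tail is simple but reorders messages, so a design that also needs ordering guarantees would have to retry in place or block the affected conversation until the retry succeeds.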

Design Tradeoffs

WebSocket vs. HTTP Polling

WebSocket (Chosen):

✅ True real-time, bidirectional communication
✅ Lower latency
✅ Less bandwidth overhead
❌ More complex to scale (stateful connections)
❌ Requires connection state management

HTTP Polling:

✅ Simple to implement
✅ Stateless, easier to scale
❌ Higher latency
❌ Wasteful (polling empty results)

Push vs. Pull for Message Delivery

Push (Chosen):

✅ Instant delivery when recipient is online
✅ Better user experience
❌ Requires maintaining connections

Pull:

✅ Simpler architecture
✅ Client controls polling frequency
❌ Higher latency
❌ Increased server load from constant polling

Synchronous vs. Asynchronous Message Processing

Asynchronous (Chosen):

✅ Better scalability
✅ Handles traffic spikes
✅ Decouples components
❌ More complex architecture
❌ Eventual consistency

Synchronous:

✅ Simpler to reason about
✅ Immediate consistency
❌ Tight coupling
❌ Harder to scale

Scalability Considerations

Horizontal Scaling

Chat Services

Run multiple instances behind a load balancer. Use consistent hashing for session affinity.
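
Consistent hashing for session affinity can be sketched as a hash ring with virtual nodes. `ConsistentHashRing`, the `replicas` count, and the instance names are assumptions for illustration; production load balancers ship their own implementations.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical hash ring mapping user IDs to chat service instances."""

    def __init__(self, nodes: List[str], replicas: int = 100) -> None:
        self._replicas = replicas
        self._ring: Dict[int, str] = {}
        self._keys: List[int] = []
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._replicas):
            point = _hash(f"{node}#{i}")
            if point in self._ring:
                continue  # extremely unlikely collision; skip the point
            self._ring[point] = node
            bisect.insort(self._keys, point)

    def remove_node(self, node: str) -> None:
        for i in range(self._replicas):
            point = _hash(f"{node}#{i}")
            if self._ring.get(point) == node:
                del self._ring[point]
                self._keys.remove(point)

    def node_for(self, key: str) -> str:
        if not self._keys:
            raise LookupError("ring is empty")
        # First ring point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._keys, _hash(key)) % len(self._keys)
        return self._ring[self._keys[index]]


ring = ConsistentHashRing(["chat-a", "chat-b", "chat-c"])
users = [f"user-{i}" for i in range(1000)]
before = {u: ring.node_for(u) for u in users}
ring.remove_node("chat-c")
after = {u: ring.node_for(u) for u in users}
moved = [u for u in users if before[u] != after[u]]
print(len(moved))
```

Only users that were mapped to the removed instance are reassigned; everyone else keeps their instance, which is the property that makes this scheme attractive for stateful WebSocket servers.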

Message Store

Shard by user ID or conversation ID. Use replication for high availability.

Message Queue

Partition by conversation ID. Scale consumers independently of producers.
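
Partitioning by conversation ID keeps every message of a conversation on the same partition, so a single consumer sees them in order. The `partition_for` helper below is a hypothetical stand-in for a queue's own partitioner and only shows the stable key-to-partition mapping.

```python
import hashlib


def partition_for(conversation_id: str, num_partitions: int) -> int:
    """Map a conversation ID to a partition index in [0, num_partitions)."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Every message in room-123 lands on the same partition.
first = partition_for("room-123", 8)
second = partition_for("room-123", 8)
print(first == second)
```

Changing the number of partitions remaps most conversations under this modulo scheme, so the partition count is usually chosen with headroom up front.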

Presence Service

Use distributed cache (Redis Cluster) for high-speed reads/writes.

Handling Group Chats

Group chats introduce additional complexity.

Challenges:

Message must be delivered to N recipients (fan-out)
Large groups (thousands of members)
Read receipts and typing indicators

Solutions:

Message Fan-out

When a message is sent to a group:

Persist once in message store
Create N queue entries (one per recipient)
Each recipient’s chat service pulls their messages
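
The three fan-out steps above can be sketched as follows. `GroupFanOut` and its in-memory dictionaries are hypothetical stand-ins for the message store and per-recipient queues. Skipping the sender during fan-out is an assumption of this sketch, not something the steps above specify.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, Dict, List


class GroupFanOut:
    """Hypothetical fan-out on write: store the body once, queue an ID per recipient."""

    def __init__(self) -> None:
        self.message_store: Dict[int, str] = {}
        self.recipient_queues: DefaultDict[str, Deque[int]] = defaultdict(deque)

    def send(self, message_id: int, sender: str, body: str, members: List[str]) -> int:
        self.message_store[message_id] = body                  # persist once
        recipients = [m for m in members if m != sender]       # assumption: skip the sender
        for member in recipients:
            self.recipient_queues[member].append(message_id)   # one queue entry each
        return len(recipients)

    def pull(self, member: str) -> List[str]:
        """Drain a member's queue and resolve IDs to message bodies."""
        queue = self.recipient_queues[member]
        bodies: List[str] = []
        while queue:
            bodies.append(self.message_store[queue.popleft()])
        return bodies


fanout = GroupFanOut()
queued = fanout.send(1, "alice", "Lunch?", ["alice", "bob", "carol"])
bob_messages = fanout.pull("bob")
print(queued, bob_messages)
```

Queueing only the message ID keeps each queue entry small, so the cost of a large group grows with the member count rather than with the message size.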

Optimize Large Groups

For large groups (>100 members):

Disable read receipts
Batch presence updates
Use message pagination aggressively
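
Batching presence updates means coalescing many status changes into one periodic broadcast. The `PresenceBatcher` class is a hypothetical sketch of that idea; how often `flush` runs is left to the caller.

```python
from typing import Dict, List, Tuple


class PresenceBatcher:
    """Hypothetical coalescing buffer for presence changes in a large group."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}

    def record(self, user: str, status: str) -> None:
        # Only the latest status per user is kept until the next flush.
        self._pending[user] = status

    def flush(self) -> List[Tuple[str, str]]:
        """Return one combined update and reset the buffer."""
        batch = sorted(self._pending.items())
        self._pending.clear()
        return batch


batcher = PresenceBatcher()
batcher.record("bob", "online")
batcher.record("carol", "away")
batcher.record("bob", "offline")   # supersedes Bob's earlier update
update = batcher.flush()
print(update)
```

Three recorded changes collapse into two entries in a single broadcast, trading a short delay in presence accuracy for far fewer messages to each group member.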

Key Technologies

WebSocket

Real-time, bidirectional communication between client and server

Redis

Presence service, caching, pub/sub for simple implementations

Cassandra/HBase

Message storage with high write throughput

Kafka

Message queue for async processing and delivery

Snowflake IDs

Globally unique, time-ordered message identifiers

Push Notification

Deliver messages to offline users (APNs, FCM)

Summary

Building a chat application requires careful consideration of:

Real-time Communication

Use WebSockets for persistent, bidirectional connections

Message Persistence

Store messages durably with globally unique, ordered IDs

Presence Management

Track user online/offline status in real-time

Async Processing

Use message queues to decouple and scale message delivery

Offline Support

Integrate push notifications for offline users

Start simple with Redis pub/sub for prototypes or low-scale applications. As you grow, migrate to a distributed architecture with dedicated services for chat, presence, and message delivery.
