Microservices architecture
The Distributed Notification System is built on a microservices architecture with five independent services that communicate through REST APIs and message queues. Each service is containerized, independently scalable, and maintains its own data store.
Service responsibilities
API Gateway (Port 8000)
The API Gateway is the single entry point for all client requests. It orchestrates the notification workflow by coordinating with the other services.
Key responsibilities:
- Receives and validates notification requests from clients
- Authenticates incoming requests
- Fetches user information from User Service (REST)
- Retrieves template content from Template Service (REST)
- Writes the initial `pending` status to the PostgreSQL shared store
- Publishes notification messages to the appropriate RabbitMQ queues
- Provides status endpoints for clients to query notification delivery status
- Logs all requests with correlation IDs for traceability
Error responses:
- 400: Invalid payload
- 401: Unauthorized requests
- 500: Internal server errors
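The validation and authentication steps above can be sketched as a single check that maps each failure to its status code. This is a minimal sketch, not the gateway's actual code: the function name `check_request`, the API-key scheme, the set of required fields, and the `202` success code are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional, Set, Tuple

# Assumption: the payload carries the fields named in the lifecycle section
# (type, user_id, template, variables) plus the request_id used for idempotency.
REQUIRED_FIELDS = ("request_id", "type", "user_id", "template", "variables")
ALLOWED_TYPES = {"email", "push"}


def check_request(payload: Dict[str, Any], api_key: Optional[str],
                  valid_keys: Set[str]) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for an incoming request."""
    # Authentication failures map to 401.
    if api_key is None or api_key not in valid_keys:
        return 401, "unauthorized"
    # Missing or malformed fields map to 400.
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return 400, "missing fields: " + ", ".join(missing)
    if payload["type"] not in ALLOWED_TYPES:
        return 400, "unsupported type: {}".format(payload["type"])
    if not isinstance(payload["variables"], dict):
        return 400, "variables must be an object"
    # Assumption: accepted requests are acknowledged before delivery happens.
    return 202, "accepted"
```

Unexpected exceptions raised after this check would be the natural source of the 500 responses.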
User Service (Port 8001)
Manages all user-related data, including contact information and notification preferences.
Key responsibilities:
- Store and retrieve user information (name, email, push tokens)
- Manage notification preferences (email enabled, push enabled)
- Maintain the PostgreSQL database `users_db`
- Cache user preferences in Redis for fast lookups
- Expose REST API for user CRUD operations
Template Service (Port 8002)
Centralized template management with support for versioning and variable substitution.
Key responsibilities:
- Store notification templates in PostgreSQL
- Support variable substitution (e.g., `{{name}}`, `{{link}}`)
- Maintain template version history
- Support multiple languages/locales
- Cache frequently accessed templates in Redis
- Expose REST API for template retrieval
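Variable substitution of `{{name}}`-style placeholders can be illustrated with a small regular-expression renderer. This is a sketch under assumptions: the function name `render` and the choice to raise on missing variables are illustrative, and the Template Service may well use a full template engine instead.

```python
import re

# Matches {{name}} and tolerates inner whitespace such as {{ name }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value from `variables`."""
    missing = {name for name in _PLACEHOLDER.findall(template)
               if name not in variables}
    if missing:
        # Fail loudly rather than send a message with unfilled placeholders.
        raise KeyError("missing template variables: {}".format(sorted(missing)))
    return _PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), template)
```

For example, `render("Hi {{name}}, open {{link}}", {"name": "Ada", "link": "https://example.com"})` fills both placeholders, while omitting `link` raises `KeyError`.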
Email Service (Port 8003)
Background worker that consumes email notification requests from RabbitMQ.
Key responsibilities:
- Consume messages from `email.queue` in RabbitMQ
- Check user preferences via Redis cache (skip if email disabled)
- Verify notification status is `pending` in PostgreSQL
- Send emails via SMTP or API (SendGrid, Mailgun, Gmail)
- Update notification status to `delivered` or `failed`
- Implement retry logic with exponential backoff
- Move permanently failed messages to `failed.queue` (dead letter queue)
SMTP configuration:
- Host: MailHog (for testing) or external SMTP provider
- Port: 1025 (MailHog) or 587/465 (production)
- From address: [email protected]
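As an illustration of the SMTP path, the sketch below builds a message and hands it to a local MailHog instance using only the Python standard library. The host `localhost`, the sender `noreply@example.com`, and both function names are placeholders chosen for this example, not values taken from the project.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "localhost"               # assumption: MailHog runs locally
SMTP_PORT = 1025                      # MailHog SMTP port listed above
FROM_ADDRESS = "noreply@example.com"  # placeholder sender, not the real address


def build_email(to_address: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text email ready to be sent."""
    msg = EmailMessage()
    msg["From"] = FROM_ADDRESS
    msg["To"] = to_address
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_email(msg: EmailMessage, host: str = SMTP_HOST, port: int = SMTP_PORT) -> None:
    """Send over plain SMTP, which is what MailHog accepts."""
    # A production provider on port 587 would typically also need
    # smtp.starttls() and smtp.login(...) before send_message().
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)
```

Keeping message construction separate from sending makes the worker easier to test without a running SMTP server.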
Push Service (Port 8004)
Background worker that consumes push notification requests from RabbitMQ.
Key responsibilities:
- Consume messages from `push.queue` in RabbitMQ
- Check user preferences via Redis cache (skip if push disabled)
- Verify notification status is `pending` in PostgreSQL
- Validate push tokens before sending
- Send push notifications via FCM, OneSignal, or Web Push
- Update notification status to `delivered` or `failed`
- Implement retry logic with exponential backoff
- Handle invalid device tokens gracefully
Communication patterns
Synchronous communication (REST)
Used for real-time data retrieval and queries where an immediate response is required.
Use cases:
- API Gateway → User Service: Fetch user data and preferences
- API Gateway → Template Service: Retrieve template content
- Client → API Gateway: Query notification status
Benefits:
- Immediate response
- Simple request/response pattern
- Easy to debug and monitor
Asynchronous communication (RabbitMQ)
Used for notification delivery to decouple request acceptance from processing.
Use cases:
- API Gateway → Email/Push Services: Deliver notifications
- Retry handling for failed deliveries
- Status updates after notification sent
Benefits:
- High throughput
- Fault tolerance with retries
- Scalability (workers can be added/removed)
- Prevents request timeouts for slow operations
RabbitMQ queue structure
The system uses a direct exchange pattern for routing messages to specific queues.
Exchange name: `notifications.direct`
Queues:
| Queue Name | Consumer | Purpose |
|---|---|---|
| email.queue | Email Service | Email notification requests |
| push.queue | Push Service | Push notification requests |
| failed.queue | Dead Letter Queue | Permanently failed messages |
Message flow:
- API Gateway publishes a message to the `notifications.direct` exchange
- The message is routed to `email.queue` or `push.queue` based on `notification_type`
- Worker services consume messages from their respective queues
- Failed messages (after max retries) are moved to `failed.queue`
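The routing step can be sketched as a pure function that turns a notification into the arguments a publisher would need. It assumes the routing key is the same string as the queue name it is bound to, which the source does not state; the function name `build_publish_args` is likewise hypothetical.

```python
import json

EXCHANGE = "notifications.direct"
# Assumption: each routing key equals the name of the queue bound to it.
ROUTING_KEYS = {"email": "email.queue", "push": "push.queue"}


def build_publish_args(notification: dict) -> dict:
    """Pick the routing key from notification_type and serialize the body."""
    notification_type = notification["notification_type"]
    if notification_type not in ROUTING_KEYS:
        raise ValueError("unknown notification_type: {!r}".format(notification_type))
    return {
        "exchange": EXCHANGE,
        "routing_key": ROUTING_KEYS[notification_type],
        "body": json.dumps(notification).encode("utf-8"),
    }
```

With a client such as pika, the result could be passed along as `channel.basic_publish(**build_publish_args(notification))`, since that method takes `exchange`, `routing_key`, and `body` arguments.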
Retry policy:
- Initial retry: 1 second delay
- Exponential backoff: 1s → 2s → 4s → 8s → 16s
- Max retries: 5 attempts
- After max retries: Move to `failed.queue`
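The retry schedule follows from doubling a 1-second initial delay. The sketch below reads the policy as one initial attempt followed by up to five retries, which is an interpretation rather than something the source spells out. The `send`, `dead_letter`, and `sleep` parameters are hypothetical callables injected so the loop can run without a broker.

```python
import time

INITIAL_DELAY = 1.0  # seconds before the first retry
MAX_RETRIES = 5


def backoff_delays(initial=INITIAL_DELAY, retries=MAX_RETRIES):
    """Delay before each retry: the initial delay doubled once per attempt."""
    return [initial * 2 ** attempt for attempt in range(retries)]


def deliver_with_retry(send, message, dead_letter, sleep=time.sleep):
    """Try once, retry with backoff, then dead-letter the message."""
    delays = backoff_delays()
    for attempt in range(len(delays) + 1):
        try:
            send(message)
            return True
        except Exception:
            if attempt == len(delays):
                break  # retries exhausted
            sleep(delays[attempt])
    # Stands in for moving the message to failed.queue.
    dead_letter(message)
    return False
```

In a real worker the delay would more likely be handled by the broker (for example with per-message TTLs) than by blocking the consumer in `sleep`.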
The RabbitMQ management UI is available at http://localhost:15673 (default credentials: guest/guest).
Data storage strategy
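Each service maintains its own database, following the microservices pattern of data ownership. The exception is the shared notification status store, which the API Gateway writes and the worker services read and update.
PostgreSQL databases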
| Service | Database Name | Schema |
|---|---|---|
| User Service | users_db | Users table with preferences |
| Template Service | templates_db | Templates with version history |
| Shared Store | notification_db | Notification status tracking |
Redis caching
Current usage:
- User preferences cache (key: `user:preferences:{user_id}`)
  - TTL: 1 hour
  - Updates on user preference changes
- Notification status caching for fast reads
- Rate limiting counters
- Template caching
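The preference cache described above behaves like a cache-aside lookup with a one-hour TTL. The sketch below uses a small in-memory stand-in so it runs without Redis; the stand-in mirrors only the `get` and `setex` calls of a Redis client. The JSON encoding, the function names, and the choice to refresh the cached entry on update (rather than delete it) are assumptions.

```python
import json
import time

PREFS_TTL_SECONDS = 3600  # 1 hour, as listed above


def prefs_key(user_id):
    return "user:preferences:{}".format(user_id)


class InMemoryCache:
    """Test stand-in exposing the get/setex subset of a Redis client."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)  # expired or absent
            return None
        return entry[0]

    def setex(self, key, ttl, value):
        self._data[key] = (value, self._clock() + ttl)


def get_preferences(cache, user_id, load_from_db):
    """Return cached preferences, falling back to the database on a miss."""
    cached = cache.get(prefs_key(user_id))
    if cached is not None:
        return json.loads(cached)
    prefs = load_from_db(user_id)
    cache.setex(prefs_key(user_id), PREFS_TTL_SECONDS, json.dumps(prefs))
    return prefs


def update_preferences(cache, user_id, prefs, save_to_db):
    """Persist first, then refresh the cache so workers see the change."""
    save_to_db(user_id, prefs)
    cache.setex(prefs_key(user_id), PREFS_TTL_SECONDS, json.dumps(prefs))
```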
Notification lifecycle flow
Client submits notification request
Client sends a POST request to the API Gateway at `/api/v1/notifications/` with the notification details (type, user_id, template, variables).
API Gateway orchestrates data collection
Gateway makes parallel REST calls to:
- User Service: Fetch user email/push token and preferences
- Template Service: Retrieve template content and required variables
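The two lookups are independent, so they can run concurrently. Below is a minimal sketch using a thread pool; `fetch_user` and `fetch_template` are hypothetical stubs standing in for the real REST calls, and the fields they return are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_user(user_id):
    # Stub for a GET request to the User Service (port 8001).
    return {"user_id": user_id, "email": "user@example.com", "email_enabled": True}


def fetch_template(template_name):
    # Stub for a GET request to the Template Service (port 8002).
    return {"name": template_name, "body": "Hi {{name}}", "required": ["name"]}


def collect(user_id, template_name, get_user=fetch_user, get_template=fetch_template):
    """Run both lookups in parallel and return (user, template)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        user_future = pool.submit(get_user, user_id)
        template_future = pool.submit(get_template, template_name)
        # result() re-raises any exception from the worker thread, so a
        # failed lookup surfaces here instead of being silently dropped.
        return user_future.result(), template_future.result()
```

An async gateway would reach the same effect with `asyncio.gather`; the thread pool is used here only because it works with plain blocking calls.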
Status written to shared store
Gateway writes a notification record with status `pending` to PostgreSQL, using the `request_id` for idempotency.
Message published to RabbitMQ
Gateway publishes the message to the `notifications.direct` exchange, which routes it to `email.queue` or `push.queue` based on the notification type.
Worker consumes message
Email or Push Service consumes message from queue and performs validation:
- Check user preferences in Redis (skip if disabled)
- Verify status is `pending` in PostgreSQL (prevent duplicates)
Notification sent
Worker sends notification via:
- Email Service: SMTP/API (SendGrid, Mailgun)
- Push Service: FCM, OneSignal, Web Push
Ports summary
| Service | Port(s) | Description |
|---|---|---|
| API Gateway | 8000 | HTTP server for client requests |
| User Service | 8001 | HTTP server for user management |
| Template Service | 8002 | HTTP server for template retrieval |
| Email Service | 8003 | Internal worker (no external access) |
| Push Service | 8004 | Internal worker (no external access) |
| RabbitMQ | 5672, 15672 | AMQP port, Management UI |
| PostgreSQL | 5432 | Database server |
| Redis | 6379 | Cache server |
| MailHog | 1025, 8025 | SMTP server (testing), Web UI |
Key design concepts
Idempotency
Every notification request includes a unique `request_id`. Before processing, workers check the PostgreSQL shared store:
- If status exists and is not `pending`: Skip (already processed)
- If status is `pending`: Process normally
- If no status found: Log error (should have been created by Gateway)
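The three cases above reduce to a small decision function. In this sketch `status` is whatever the worker read from the shared store for the message's `request_id`, with `None` meaning no row was found; the `Action` enum and the function name `decide` are illustrative.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    PROCESS = "process"
    SKIP = "skip"
    ERROR = "error"


def decide(status: Optional[str]) -> Action:
    """Map the stored notification status to what the worker should do."""
    if status is None:
        # The gateway should have written a row before publishing.
        return Action.ERROR
    if status == "pending":
        return Action.PROCESS
    # delivered, failed, or any other terminal state: already handled.
    return Action.SKIP
```

A check like this on its own does not stop two workers from both reading `pending` at the same moment; a conditional update that claims the row would likely be needed for that, which is beyond this sketch.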
Circuit breaker
Worker services implement the circuit breaker pattern for external dependencies (SMTP, FCM):
- Closed: Normal operation
- Open: Too many failures, stop attempting and fail fast
- Half-Open: Periodically test if service recovered
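The three states can be sketched as a small wrapper around calls to an external dependency. The failure threshold, the reset timeout, and the class and exception names are assumptions chosen for this example; the source does not give the workers' actual settings.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # enough time has passed to allow a trial call
        return "open"

    def call(self, func, *args, **kwargs):
        state = self.state
        if state == "open":
            raise CircuitOpenError("dependency unavailable, failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures while closed,
            # opens (or re-opens) the circuit.
            if state == "half-open" or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A worker would wrap its send call, for example `breaker.call(send_email, msg)`, and treat `CircuitOpenError` as a signal to requeue rather than retry immediately.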
Health checks
All services expose a `/health` endpoint.
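The response format of the health endpoint is not shown here, so the following is only a hypothetical shape: a handler could aggregate per-dependency checks into a status code and a JSON-serializable body. The field names and the use of `503` for an unhealthy service are assumptions.

```python
def health_response(service, checks):
    """Build a (status_code, body) pair from per-dependency check results.

    `checks` maps a dependency name to True (reachable) or False.
    """
    healthy = all(checks.values())
    body = {
        "service": service,
        "status": "healthy" if healthy else "unhealthy",
        "checks": {name: ("ok" if ok else "failed") for name, ok in checks.items()},
    }
    # Assumption: orchestrators treat any non-2xx status as unhealthy.
    return (200 if healthy else 503), body
```

For an Email Service instance, `checks` might cover RabbitMQ, PostgreSQL, and Redis connectivity.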
Correlation IDs
Every request is assigned a correlation ID that flows through all services:
- Logged at each step (Gateway → User Service → Template Service → Worker)
- Enables full lifecycle tracking in logs
- Simplifies debugging across distributed services
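One way to attach a correlation ID to every log line is a context variable plus a logging filter, sketched below with the standard library. How the ID travels between services (for instance an HTTP header or a message property) is not specified in this document, so `start_request` simply accepts an incoming ID or generates one; all names here are illustrative.

```python
import contextvars
import logging
import uuid

# Holds the ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def start_request(incoming_id=None):
    """Reuse the caller's ID if one was passed along, else create one."""
    cid = incoming_id or str(uuid.uuid4())
    correlation_id.set(cid)
    return cid


def make_logger(name):
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s [%(correlation_id)s] %(message)s"))
    handler.addFilter(CorrelationIdFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

After `start_request(...)`, every `logger.info(...)` call made while handling that request carries the same ID, so searching the logs for it reconstructs the full lifecycle.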
Monitoring and observability
Metrics to track:
- Queue message rates (messages/second)
- Service response times (p50, p95, p99)
- Error rates by service
- Queue lengths (detect backlog)
- Cache hit/miss ratios
Logs should include:
- Correlation IDs for request tracing
- Timestamps for latency analysis
- Error messages with stack traces
- User IDs for debugging specific issues
Scalability
Each service can scale independently:
- API Gateway: Add more containers behind a load balancer
- User Service: Read replicas for PostgreSQL, multiple service instances
- Template Service: Heavy caching, read-only replicas
- Email/Push Services: Add more worker containers to consume queue faster
- RabbitMQ: Cluster mode for high availability
- Redis: Sentinel or Cluster mode
- PostgreSQL: Primary-replica setup with connection pooling
Docker Compose is suitable for local development and testing. For production, consider Kubernetes or Docker Swarm for orchestration and auto-scaling.