Architecture Overview
An Infinitic deployment consists of:
- Pulsar Cluster - Message transport layer
- Storage Backend - Persistent state storage (Redis/PostgreSQL/MySQL)
- Infinitic Workers - Execute tasks and workflows
- Client Applications - Trigger workflows and tasks
Deployment Patterns
Single-Tenant Architecture
All components in one namespace/environment:
- Single application or team
- Simplified operations
- Lower infrastructure costs
Multi-Tenant Architecture
Separate namespaces per tenant/environment:
- Multiple independent applications
- Different teams or business units
- Isolation requirements
- Different SLAs per tenant
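As an illustration of namespace-level isolation: Pulsar addresses topics as persistent://tenant/namespace/topic, so each tenant or environment can be mapped to its own namespace. The helper below is a minimal sketch under that assumption. The function names, the name pattern, and the example topic are hypothetical and are not Infinitic identifiers.

```python
import re

# Deliberately conservative pattern for this example;
# Pulsar itself may accept a wider character set.
_SAFE_NAME = re.compile(r"^[a-zA-Z0-9_-]+$")


def namespace_for(tenant: str, environment: str) -> str:
    """Return a 'tenant/namespace' pair, using one namespace per environment."""
    for part in (tenant, environment):
        if not _SAFE_NAME.match(part):
            raise ValueError(f"unsupported name: {part!r}")
    return f"{tenant}/{environment}"


def topic_for(tenant: str, environment: str, topic: str) -> str:
    """Build a fully qualified persistent topic name inside that namespace."""
    return f"persistent://{namespace_for(tenant, environment)}/{topic}"


if __name__ == "__main__":
    # Hypothetical tenant, environment, and topic names.
    print(topic_for("billing", "prod", "example-topic"))
```

Keeping the mapping in one function makes it harder for two tenants to end up sharing a namespace by accident.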
Configuration Management
Environment Variables
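One way to consume environment variables is to resolve them once at startup and fail fast when a required secret is absent. The sketch below is only an illustration. The variable names (PULSAR_SERVICE_URL, STORAGE_PASSWORD, WORKER_CONCURRENCY) and the WorkerSettings type are assumptions made for this example, not Infinitic configuration keys.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names, chosen for this example only.
REQUIRED = ("PULSAR_SERVICE_URL", "STORAGE_PASSWORD")


@dataclass(frozen=True)
class WorkerSettings:
    pulsar_service_url: str
    storage_password: str
    worker_concurrency: int


def load_settings(env: Mapping[str, str] = os.environ) -> WorkerSettings:
    """Read settings from the environment, failing fast on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return WorkerSettings(
        pulsar_service_url=env["PULSAR_SERVICE_URL"],
        storage_password=env["STORAGE_PASSWORD"],
        # Non-secret tuning value, so a default is acceptable here.
        worker_concurrency=int(env.get("WORKER_CONCURRENCY", "10")),
    )
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.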
Use environment variables for secrets and environment-specific values.
Secrets Management
Integrate with secrets managers.
Multi-Environment Configuration
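A common way to structure per-environment configuration is a shared base plus small per-environment overrides. The sketch below shows that idea with a recursive merge. Every key, hostname, and value in it is made up for illustration and does not reflect Infinitic's actual configuration schema.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in `override` win, merging nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical settings shared by every environment.
BASE = {"storage": {"type": "redis", "host": "localhost", "port": 6379}, "workers": 1}

# Hypothetical per-environment differences only.
OVERRIDES = {
    "staging": {"storage": {"host": "redis.staging.internal"}},
    "production": {"storage": {"host": "redis.prod.internal"}, "workers": 8},
}


def config_for(environment: str) -> Dict[str, Any]:
    """Resolve the effective configuration for one environment."""
    return deep_merge(BASE, OVERRIDES.get(environment, {}))
```

Because each override lists only what differs, drift between environments stays visible in review.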
Organize configurations by environment.
Docker Deployment
Dockerfile
Docker Compose
Kubernetes Deployment
Worker Deployment
Horizontal Pod Autoscaling
ConfigMap
Scaling Strategies
Vertical Scaling
Increase resources per worker:
- CPU-intensive tasks
- Memory-intensive workflows
- Simple scaling approach
Horizontal Scaling
Increase the number of worker instances:
- High task throughput
- Better fault tolerance
- Easier rollouts/rollbacks
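When the worker count is adjusted automatically, the Kubernetes Horizontal Pod Autoscaler documents its core rule as desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). The sketch below reproduces only that rule plus min/max clamping. The real autoscaler also applies a tolerance band and stabilization windows, which are omitted here. The metric itself (for example, backlog per worker) is an assumption of this example.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Apply the basic HPA scaling rule, clamped to the allowed replica range."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 workers at 90 units each against a target of 60 suggests 6 workers.
    print(desired_replicas(4, 90, 60))
```

Running the numbers by hand like this can help when choosing a target value before enabling autoscaling.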
Task-Specific Workers
Deploy specialized workers for different task types.
Monitoring and Observability
Metrics
Expose and collect these key metrics:
Worker Metrics:
- Active task executions
- Task execution duration (p50, p95, p99)
- Task success/failure rate
- Queue depth/backlog
- Worker CPU/memory usage
Infrastructure Metrics:
- Pulsar message rate
- Pulsar consumer lag
- Storage latency
- Storage connection pool usage
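To make the p50/p95/p99 figures in the list above concrete, the sketch below computes them from raw duration samples with the nearest-rank method. This is one of several percentile definitions, and metrics backends often use histogram approximations instead. The function names are illustrative.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def duration_summary(durations_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize task durations at the percentiles commonly used for alerting."""
    return {f"p{p}": percentile(durations_ms, p) for p in (50, 95, 99)}
```

Tail percentiles such as p99 tend to expose slow tasks that an average would hide.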
Logging
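Structured logging usually means emitting one machine-parseable record per line. Whatever logging framework the workers use, the target shape is the same. The sketch below shows it with Python's standard logging module, purely as an illustration. The field names are an assumption, not a required schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the stack trace inside the same JSON object.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> None:
    """Send all logs to stderr as JSON lines."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

One object per line lets a log collector parse fields without custom multi-line rules.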
Use structured logging (for example, JSON) so that logs can be parsed and centralized.
Health Checks
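A health endpoint can be a small HTTP handler that runs a few dependency probes and returns 200 or 503. The sketch below uses only Python's standard library. The /health path, the port, and the two placeholder probes are assumptions for this example, and real probes would need to test the Pulsar and storage connections.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Placeholder probes: replace each lambda with a real connectivity check.
CHECKS: Dict[str, Callable[[], bool]] = {
    "pulsar": lambda: True,
    "storage": lambda: True,
}


def evaluate(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every probe; a probe that raises counts as a failure."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    healthy = all(results.values())
    body = {"status": "ok" if healthy else "degraded", "checks": results}
    return (200 if healthy else 503), body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = evaluate(CHECKS)
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format: str, *args) -> None:
        pass  # keep probe traffic out of the logs


def serve(port: int = 8080) -> None:
    """Block forever, serving the health endpoint."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Returning 503 rather than 200 with an error body lets load balancers and orchestrators react without parsing the response.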
Implement health check endpoints.
High Availability
Worker Redundancy
Deploy multiple workers across availability zones.
Infrastructure HA
Pulsar:
- Deploy a multi-node Pulsar cluster
- Configure BookKeeper with replication
- Use ZooKeeper for coordination
Redis:
- Use Redis Sentinel or Redis Cluster
- Configure automatic failover
- Set up replication across AZs
PostgreSQL/MySQL:
- Configure streaming replication
- Set up automatic failover (e.g., Patroni for PostgreSQL)
- Use connection pooling (PgBouncer, ProxySQL)
Disaster Recovery
Backup Strategy
Pulsar:
Recovery Procedures
- Restore infrastructure - Bring up Pulsar and storage
- Restore state - Load backup data into storage
- Deploy workers - Start worker deployments
- Verify health - Check all health endpoints
- Resume operations - Enable traffic to client applications
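The steps above are ordered, and each one depends on the previous one succeeding. A runbook script can encode that by stopping at the first failure. The sketch below shows only the control flow. The step callables are placeholders that a real runbook would replace with actual restore and deployment commands.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_recovery(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the completed steps when all of them succeed.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            raise RuntimeError(f"recovery halted at step {name!r}; completed: {completed}")
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions: each lambda stands in for a real command.
    steps: List[Step] = [
        ("restore infrastructure", lambda: True),
        ("restore state", lambda: True),
        ("deploy workers", lambda: True),
        ("verify health", lambda: True),
        ("resume operations", lambda: True),
    ]
    print(run_recovery(steps))
```

Halting before "resume operations" when a health check fails avoids sending client traffic to a partially restored system.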
Security Best Practices
Network Security
- Deploy in private subnets
- Use security groups/network policies
- Enable TLS for all connections
- Implement network segmentation
Access Control
Secrets Management
- Never commit secrets to version control
- Rotate secrets regularly
- Use secrets managers (Vault, AWS Secrets Manager)
- Limit secret access to necessary services only
Troubleshooting
Worker Not Starting
Check the worker logs for these common causes:
- Invalid configuration syntax
- Unable to connect to Pulsar/storage
- Missing authentication credentials
- Insufficient resources
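Two of the causes above, missing credentials and unreachable Pulsar or storage endpoints, can be ruled out before a worker starts. The preflight sketch below checks that required environment variables are set and that TCP connections can be opened. The variable names and endpoints passed in are up to the caller, and nothing here is specific to Infinitic.

```python
import os
import socket
from typing import Iterable, List, Mapping, Tuple


def missing_variables(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(
    required: Iterable[str],
    endpoints: Iterable[Tuple[str, int]],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = [f"missing environment variable {name}" for name in missing_variables(required, env)]
    problems += [f"cannot reach {host}:{port}" for host, port in endpoints if not can_connect(host, port)]
    return problems
```

A successful TCP connection only shows that the port is open. Authentication and TLS problems would still surface later in the worker logs.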
High Latency
Investigate:
- Storage backend performance
- Network latency between components
- Worker resource constraints
- Pulsar message backlog
Possible remedies:
- Scale workers horizontally
- Optimize task implementations
- Increase connection pools
- Enable caching
Message Backlog
Check the backlog on the affected topics, then work through these remedies:
- Increase worker count
- Optimize slow tasks
- Check for stuck workflows
- Review error rates
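Backlog can be read from Pulsar's topic statistics, where each subscription reports a msgBacklog count. The sketch below parses that structure and flags subscriptions above a threshold. It assumes the admin REST endpoint is reachable without authentication. The admin URL, tenant, namespace, and topic are placeholders supplied by the caller, and the threshold is arbitrary.

```python
import json
from typing import Dict
from urllib.request import urlopen


def fetch_topic_stats(admin_url: str, tenant: str, namespace: str, topic: str) -> dict:
    """Fetch statistics for one persistent topic from the Pulsar admin REST API."""
    url = f"{admin_url}/admin/v2/persistent/{tenant}/{namespace}/{topic}/stats"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def backlog_by_subscription(stats: dict) -> Dict[str, int]:
    """Map each subscription name to its reported message backlog."""
    subscriptions = stats.get("subscriptions", {})
    return {name: int(sub.get("msgBacklog", 0)) for name, sub in subscriptions.items()}


def over_threshold(stats: dict, threshold: int) -> Dict[str, int]:
    """Return only the subscriptions whose backlog exceeds the threshold."""
    return {name: n for name, n in backlog_by_subscription(stats).items() if n > threshold}
```

A backlog that keeps growing while workers are healthy usually points to insufficient capacity, whereas a flat but non-zero backlog may indicate a stuck consumer.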
Production Checklist
- TLS enabled for all connections
- Authentication configured
- Secrets stored securely
- High availability configured
- Backups automated and tested
- Monitoring and alerting set up
- Health checks implemented
- Resource limits defined
- Autoscaling configured
- Disaster recovery plan documented
- Network policies implemented
- Logging centralized
- Performance tested under load
- Rollback procedure tested
Next Steps
- Pulsar Transport - Deep dive into Pulsar configuration
- Storage Backends - Configure storage options