Core components
Infrahub server
The Infrahub server is the main API server that handles:
- GraphQL API for data queries and mutations
- REST API for configuration and management
- WebSocket connections for real-time updates
- Schema management and validation
- Authentication and authorization
- Git repository integration
- Port: 8000
- Workers: 4 (configurable via `WEB_CONCURRENCY`)
- Protocol: HTTP/HTTPS
Task workers
Task workers execute background operations asynchronously:
- Schema migrations and updates
- Git repository synchronization
- Data validation and transformations
- Artifact generation
- Long-running queries and operations
- Worker type: `infrahubasync`
- Replicas: 2
- Polling interval: 2 seconds
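The polling behaviour described above can be sketched as a simple loop: fetch a task if one is available, otherwise wait for the polling interval. This is an in-memory illustration, not the actual worker implementation; `fetch_task` and `handle_task` are hypothetical callables standing in for the Prefect queue.

```python
import time

def poll_for_tasks(fetch_task, handle_task, interval: float = 2.0, max_polls: int = 5):
    """Poll the task queue at a fixed interval, executing any task found.

    `fetch_task` returns a task or None; `handle_task` executes it.
    The 2-second default mirrors the polling interval listed above.
    """
    handled = 0
    for _ in range(max_polls):
        task = fetch_task()
        if task is not None:
            handle_task(task)
            handled += 1
        else:
            time.sleep(interval)  # nothing to do; wait before polling again
    return handled

# Usage with an in-memory stand-in for the task queue:
queue = ["migrate-schema", "sync-repo"]
done = []
handled = poll_for_tasks(
    fetch_task=lambda: queue.pop(0) if queue else None,
    handle_task=done.append,
    interval=0,  # no waiting in this demo
    max_polls=4,
)
```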
Neo4j database
Neo4j is the graph database that stores:
- Object data (nodes, relationships, attributes)
- Schema definitions
- Branch and version history
- Temporal data across branches
- Protocol: Bolt
- Port: 7687
- HTTP port: 7474
- Database type: Neo4j 2025.10.1
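The two ports above map to two connection URIs: Bolt for driver queries and HTTP for the admin interface. A minimal sketch, assuming a local deployment with default ports:

```python
def neo4j_uris(host: str = "localhost", bolt_port: int = 7687, http_port: int = 7474):
    """Build the Bolt and HTTP URIs for the ports listed above."""
    return {
        "bolt": f"bolt://{host}:{bolt_port}",  # driver queries
        "http": f"http://{host}:{http_port}",  # admin interface
    }

uris = neo4j_uris()
# A driver would then connect with something like:
# GraphDatabase.driver(uris["bolt"], auth=("neo4j", "<password>"))
```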
Message queue (RabbitMQ)
RabbitMQ provides asynchronous messaging between components:
- Event notifications
- Task distribution
- Inter-service communication
- Webhook triggers
- Port: 5672
- Management UI: 15692
- Virtual host: `/`
- Driver: RabbitMQ (NATS can be used instead by setting `INFRAHUB_BROKER_DRIVER=nats`)
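One detail worth noting when connecting to the broker: the default virtual host `/` must be percent-encoded as `%2F` in an AMQP connection URL. A small sketch, with hypothetical credentials:

```python
from urllib.parse import quote

def amqp_url(user: str, password: str, host: str = "localhost",
             port: int = 5672, vhost: str = "/") -> str:
    """Build an AMQP connection URL; the vhost is percent-encoded
    so the default '/' becomes %2F in the URL path."""
    return f"amqp://{user}:{password}@{host}:{port}/{quote(vhost, safe='')}"

# 'infrahub'/'secret' are placeholder credentials for illustration.
url = amqp_url("infrahub", "secret")
```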
Cache (Redis)
Redis provides distributed caching and locking:
- Query result caching
- Session storage
- Distributed locks for concurrency control
- Temporary data storage
- Port: 6379
- Database: 0
- Driver: Redis (NATS can be used instead by setting `INFRAHUB_CACHE_DRIVER=nats`)
Task manager (Prefect)
Prefect manages workflow orchestration:
- Task scheduling and execution
- Flow run tracking
- Worker pool management
- Execution history and logs
- Prefect server: API and UI (port 4200)
- PostgreSQL database: Flow run state and logs
- Background workers: Task execution and cleanup
- API port: 4200
- Database: PostgreSQL 18
- Worker type: `infrahubasync`
Object storage
Object storage persists artifacts and files:
- Generated artifacts (configurations, scripts)
- Uploaded files
- Git repository clones
- Export data
- Local filesystem (default): `/opt/infrahub/storage`
- S3-compatible storage: AWS S3, MinIO, etc.
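The choice between the two backends can be sketched as a simple driver lookup. The `INFRAHUB_STORAGE_BUCKET_NAME` and `INFRAHUB_STORAGE_LOCAL_PATH` variable names below are assumptions for illustration; check the configuration reference for the exact names your version uses.

```python
def storage_target(env: dict) -> str:
    """Resolve where artifacts are written based on the storage driver.

    The bucket/path variable names are illustrative assumptions,
    not confirmed Infrahub settings.
    """
    driver = env.get("INFRAHUB_STORAGE_DRIVER", "local")
    if driver == "s3":
        return f"s3://{env['INFRAHUB_STORAGE_BUCKET_NAME']}"
    return env.get("INFRAHUB_STORAGE_LOCAL_PATH", "/opt/infrahub/storage")

local = storage_target({})
s3 = storage_target({"INFRAHUB_STORAGE_DRIVER": "s3",
                     "INFRAHUB_STORAGE_BUCKET_NAME": "infrahub-artifacts"})
```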
Deployment architectures
Single-node deployment
Suitable for development, testing, and small production deployments:
- CPU: 4 cores minimum
- RAM: 8 GB minimum
- Storage: 50 GB minimum
High availability deployment
Suitable for production environments requiring resilience:
- Multiple API server replicas with load balancing
- Multiple task worker replicas
- Neo4j cluster with 3+ nodes (Enterprise only)
- RabbitMQ cluster with quorum queues
- Redis Sentinel for cache failover
- PostgreSQL replication for Prefect state
- S3-compatible object storage for shared artifacts
Recommended sizing:
- Small (16 GB RAM): 4 API workers, 2 task workers
- Medium (32 GB RAM): 4 API workers, 4 task workers
- Large (64 GB RAM): 4 API workers, 8 task workers
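The sizing tiers above can be expressed as a simple lookup, useful for scripting deployment defaults. This is a direct transcription of the table, not an official sizing formula:

```python
def sizing(ram_gb: int) -> dict:
    """Return the worker counts from the sizing tiers above."""
    tiers = [
        (64, {"api_workers": 4, "task_workers": 8}),  # Large
        (32, {"api_workers": 4, "task_workers": 4}),  # Medium
        (16, {"api_workers": 4, "task_workers": 2}),  # Small
    ]
    for threshold, counts in tiers:
        if ram_gb >= threshold:
            return counts
    raise ValueError("below the 16 GB minimum for an HA deployment")
```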
Network architecture
Infrahub components communicate over the following ports:

| Component | Port | Protocol | Purpose |
|---|---|---|---|
| Infrahub Server | 8000 | HTTP/HTTPS | API and UI |
| Neo4j | 7687 | Bolt | Database queries |
| Neo4j | 7474 | HTTP | Admin interface |
| RabbitMQ | 5672 | AMQP | Message queue |
| RabbitMQ | 15692 | HTTP | Management UI |
| Redis | 6379 | Redis | Cache |
| Prefect | 4200 | HTTP | Task manager API |
| PostgreSQL | 5432 | PostgreSQL | Prefect database |
- External access: Port 8000 (Infrahub API)
- Internal network: All component ports
- Database ports should not be exposed publicly
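The exposure rules above can be captured in a small helper, handy when generating firewall or security-group rules. The service names are labels chosen for this sketch:

```python
# Port map transcribed from the table above; names are illustrative labels.
PORTS = {
    "infrahub": 8000, "neo4j_bolt": 7687, "neo4j_http": 7474,
    "rabbitmq": 5672, "rabbitmq_mgmt": 15692, "redis": 6379,
    "prefect": 4200, "postgresql": 5432,
}

EXTERNAL = {"infrahub"}  # only the Infrahub API is exposed publicly

def exposure(service: str) -> str:
    """Classify a service port per the recommendations above."""
    return "external" if service in EXTERNAL else "internal"
```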
Data flow
GraphQL query execution
- Client sends GraphQL query to Infrahub server
- Server authenticates and authorizes request
- Server queries Redis cache for cached results
- If not cached, server queries Neo4j database
- Server processes and transforms results
- Server caches results in Redis
- Server returns response to client
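Steps 3 through 6 are the classic cache-aside pattern. A minimal sketch, with a dict standing in for Redis and a fake function standing in for the Neo4j query:

```python
def get_with_cache(key, cache: dict, query_db):
    """Cache-aside read mirroring the flow above: check the cache,
    fall back to the database, then populate the cache."""
    if key in cache:
        return cache[key], "cache"
    result = query_db(key)
    cache[key] = result
    return result, "database"

cache = {}
calls = []
def fake_db(key):
    calls.append(key)  # stands in for a Neo4j query
    return f"result-for-{key}"

first = get_with_cache("q1", cache, fake_db)
second = get_with_cache("q1", cache, fake_db)  # served from cache
```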
Background task execution
1. Server creates task definition
2. Server sends task to Prefect via API
3. Prefect schedules task in worker pool
4. Task worker polls Prefect for available tasks
5. Worker executes task logic
6. Worker updates task state in Prefect
7. Worker sends results to RabbitMQ message queue
8. Server receives notification and processes results
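The lifecycle above can be modeled as a tiny state machine with guarded transitions. The state names below are illustrative and simplified; they are not Prefect's exact state set:

```python
# Allowed transitions for a simplified task lifecycle (illustrative).
ALLOWED = {
    "PENDING": {"SCHEDULED"},
    "SCHEDULED": {"RUNNING"},
    "RUNNING": {"COMPLETED", "FAILED"},
}

class Task:
    """Tiny state machine for the background-task lifecycle above."""

    def __init__(self):
        self.state = "PENDING"

    def advance(self, new_state: str):
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"cannot go {self.state} -> {new_state}")
        self.state = new_state

task = Task()
for state in ("SCHEDULED", "RUNNING", "COMPLETED"):
    task.advance(state)
```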
Git repository synchronization
1. User triggers repository sync via API
2. Server creates background task
3. Task worker clones/pulls repository
4. Worker validates repository contents
5. Worker imports schemas, checks, and generators
6. Worker updates Neo4j database
7. Worker stores repository in object storage
8. Worker sends completion notification
Scaling considerations
Horizontal scaling
API servers:
- Add replicas to handle increased API traffic
- Configure load balancer with sticky sessions for WebSocket
- Set `INFRAHUB_STORAGE_DRIVER=s3` when running multiple replicas
Task workers:
- Add replicas to handle increased background workload
- Workers automatically register with Prefect pool
- Set `INFRAHUB_WORKFLOW_WORKER_POLLING_INTERVAL` to balance load
Database:
- Neo4j Community: Single node only
- Neo4j Enterprise: Cluster with 3+ nodes for high availability
- Add read replicas for read-heavy workloads
Vertical scaling
API servers:
- Increase CPU for faster query processing
- Increase RAM for larger in-memory caches
- Increase `WEB_CONCURRENCY` for more Gunicorn workers
Task workers:
- Increase CPU for faster task execution
- Increase RAM for larger datasets
- Increase `INFRAHUB_BROKER_MAXIMUM_CONCURRENT_MESSAGES` for parallelism
Database:
- Increase RAM for larger page cache (see `NEO4J_dbms_memory_pagecache_size`)
- Increase heap size for query execution (see `NEO4J_dbms_memory_heap_max__size`)
- Add SSD storage for better I/O performance
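When raising RAM, the heap and page cache settings above need to be raised together. The helper below sketches one way to derive both environment variables from the container's total RAM; the 50/30 split is a starting-point heuristic for this example, not an official Neo4j recommendation, so tune it against your actual workload:

```python
def neo4j_memory_env(total_ram_gb: int) -> dict:
    """Rough split of container RAM between heap and page cache.

    The 50% heap / 30% page cache split is an assumption for
    illustration; the remaining ~20% is left for the OS and
    Neo4j's other memory pools.
    """
    heap_gb = max(1, total_ram_gb * 50 // 100)
    pagecache_gb = max(1, total_ram_gb * 30 // 100)
    return {
        "NEO4J_dbms_memory_heap_max__size": f"{heap_gb}G",
        "NEO4J_dbms_memory_pagecache_size": f"{pagecache_gb}G",
    }

env = neo4j_memory_env(16)
```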
Related resources
- Installation guide - Deploy Infrahub using Docker Compose or Kubernetes
- Configuration reference - Complete list of environment variables
- Backup and restore - Protect your data with backups
- Monitoring - Monitor Infrahub health and performance