Decentralized Design
Unlike traditional orchestrators like Kubernetes or Docker Swarm, Uncloud has no central control plane. There are no master nodes, no quorum requirements, and no single point of failure to maintain. Instead, Uncloud uses a peer-to-peer architecture where every machine is equal. Each machine maintains a complete copy of the cluster state and can accept commands from the CLI or web interface. This design favors Availability and Partition tolerance (AP) over strict Consistency in the CAP theorem.
In case of network partitioning, you can continue to interact with each partition separately to manage services running on them. The system automatically reconciles state when the partition heals.
State Synchronization with Corrosion
Uncloud uses Corrosion, a distributed SQLite database built by Fly.io, to share cluster state between machines. Corrosion uses Conflict-Free Replicated Data Types (CRDTs) to enable eventually consistent state synchronization. Here’s how it works:
- Every machine stores the complete cluster state in its local Corrosion database
- Machines can modify their copy independently without coordination
- State changes propagate through the mesh network to other machines
- CRDTs automatically resolve conflicts from concurrent updates
- All machines eventually converge to the same state
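The convergence property described above can be illustrated with a deliberately simplified last-writer-wins map. This is a sketch of the general CRDT idea only, not Corrosion's actual merge algorithm; the `Entry` type, the `merge` function, and the logical timestamps are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    value: str
    timestamp: int   # logical clock of the write (assumed, for illustration)
    machine_id: str  # tie-breaker when two writes share a timestamp


def merge(a: dict[str, Entry], b: dict[str, Entry]) -> dict[str, Entry]:
    """Merge two replicas, keeping the newest entry for every key."""
    merged = dict(a)
    for key, entry in b.items():
        current = merged.get(key)
        if current is None or (entry.timestamp, entry.machine_id) > (
            current.timestamp,
            current.machine_id,
        ):
            merged[key] = entry
    return merged


# Two machines modify their local copies without coordinating.
m1 = {"web": Entry("replicas=2", 5, "m1")}
m2 = {"web": Entry("replicas=3", 6, "m2"), "db": Entry("replicas=1", 4, "m2")}

# The merge order does not matter: both machines end up with the same state.
assert merge(m1, m2) == merge(m2, m1)
```

Because the merge is commutative and idempotent, machines can exchange changes in any order, and any number of times, and still converge.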
Corrosion runs as a systemd service (uncloud-corrosion) on each machine alongside the main daemon.
WireGuard Mesh Networking
Uncloud creates a flat WireGuard overlay network between all machines in your cluster. This mesh network allows containers to communicate directly with each other regardless of which machine they’re running on.
Network Topology
The mesh network uses the 10.210.0.0/16 address space:
| CIDR | Description |
|---|---|
| 10.210.0.0/16 | The entire WireGuard mesh network |
| 10.210.X.0/24 | /24 subnet assigned to machine X |
| 10.210.X.1/32 | Machine X address (first address from subnet) |
| 10.210.X.Y/32 | Container Y address running on machine X |
Each machine gets a /24 subnet for itself and its containers. The machine takes the first IP (.1), and containers get IPs from .2 to .254.
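The addressing scheme in the table can be worked through with Python's standard `ipaddress` module. This is a minimal sketch for illustration; the helper names `machine_subnet`, `machine_address`, and `container_addresses` are hypothetical and not part of Uncloud.

```python
import ipaddress

MESH = ipaddress.ip_network("10.210.0.0/16")


def machine_subnet(x: int) -> ipaddress.IPv4Network:
    """Return 10.210.X.0/24, the subnet for machine index X (0-255)."""
    return list(MESH.subnets(new_prefix=24))[x]


def machine_address(x: int) -> ipaddress.IPv4Address:
    """The machine itself takes the first address (.1) of its subnet."""
    return machine_subnet(x)[1]


def container_addresses(x: int) -> list[ipaddress.IPv4Address]:
    """Containers use the remaining host addresses, .2 through .254."""
    return list(machine_subnet(x).hosts())[1:]


subnet = machine_subnet(3)
containers = container_addresses(3)
print(subnet, machine_address(3), containers[0], containers[-1], len(containers))
```

Under this layout each machine has room for 253 container addresses, and the /16 holds 256 machine subnets.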
Automatic Peer Discovery
When you add a new machine to the cluster:
- A unique WireGuard key pair is generated
- A new /24 subnet is allocated from the cluster address space
- The machine is registered in the shared cluster state
- Existing machines learn about it through state synchronization
- All machines automatically establish WireGuard tunnels
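The subnet allocation step above could look roughly like the following. The first-free policy and the `allocate_subnet` function are assumptions made for this sketch; the source does not say how Uncloud actually chooses the next subnet.

```python
import ipaddress


def allocate_subnet(
    cluster: ipaddress.IPv4Network,
    allocated: set[ipaddress.IPv4Network],
) -> ipaddress.IPv4Network:
    """Return the first /24 in the cluster range that no machine uses yet.

    First-free is an assumed policy, chosen only to keep the example simple.
    """
    for candidate in cluster.subnets(new_prefix=24):
        if candidate not in allocated:
            return candidate
    raise RuntimeError("no free /24 subnets left in the cluster address space")


cluster = ipaddress.ip_network("10.210.0.0/16")
in_use = {
    ipaddress.ip_network("10.210.0.0/24"),
    ipaddress.ip_network("10.210.1.0/24"),
}
new_subnet = allocate_subnet(cluster, in_use)
print(new_subnet)
```

With two machines already registered, the new machine would receive the third /24 in the range.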
Uncloud uses NAT traversal techniques inspired by Talos KubeSpan to establish connections even when machines are behind firewalls.
Keepalive and Connection Maintenance
WireGuard tunnels use a 25-second keepalive interval to maintain connections through firewalls and NAT devices. This interval works reliably with most network infrastructure while minimizing bandwidth overhead.
Core Components
Each machine in an Uncloud cluster runs these components:
uncloudd Daemon
The main daemon (uncloudd) runs on every machine as a systemd service. It handles:
- Container lifecycle management through Docker
- WireGuard mesh configuration
- gRPC API for CLI and machine-to-machine communication
- Request routing between machines using grpc-proxy
You can check daemon logs with:
journalctl -u uncloud -f
Corrosion Database
The distributed state store runs as a separate systemd service (uncloud-corrosion). It provides:
- CRDT-based SQLite database for cluster state
- Automatic state replication between machines
- Eventual consistency guarantees
- Conflict-free concurrent updates
You can check Corrosion logs with:
journalctl -u uncloud-corrosion -f
Caddy Reverse Proxy
Caddy runs as a container in global mode on machines with the appropriate role. It provides:
- Automatic HTTPS with Let’s Encrypt
- Reverse proxy for HTTP/HTTPS services
- Dynamic configuration updates from cluster state
- Health check integration
DNS Server
An embedded DNS server runs inside each uncloudd daemon to provide service discovery. It:
- Resolves service names to container IPs
- Forwards external queries to upstream DNS servers
- Updates records automatically when containers start/stop
- Provides multiple resolution modes (round-robin, nearest)
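The two resolution modes can be sketched as follows. This is a guess at the behavior rather than Uncloud's implementation: it assumes that "nearest" means preferring containers on the querying machine's own /24 subnet, and the `RoundRobin` class and `nearest` function are hypothetical names.

```python
import ipaddress


class RoundRobin:
    """Rotate the answer list by one position on every lookup."""

    def __init__(self, ips: list[str]) -> None:
        self._ips = list(ips)
        self._next = 0

    def resolve(self) -> list[str]:
        start = self._next % len(self._ips)
        self._next += 1
        return self._ips[start:] + self._ips[:start]


def nearest(ips: list[str], local_subnet: str) -> list[str]:
    """Put containers on the local machine's subnet first (assumed semantics)."""
    subnet = ipaddress.ip_network(local_subnet)
    # sorted() is stable, so the relative order within each group is preserved.
    return sorted(ips, key=lambda ip: ipaddress.ip_address(ip) not in subnet)


service_ips = ["10.210.0.2", "10.210.1.2", "10.210.2.2"]
rr = RoundRobin(service_ips)
first, second = rr.resolve(), rr.resolve()
local_first = nearest(service_ips, "10.210.1.0/24")
```

Round-robin spreads requests across all containers, while the locality-based ordering avoids a hop over the mesh when a local container exists.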
Communication Flow
When you run a command like uc ls to list services:
- The CLI connects to any machine via SSH
- It makes a gRPC call to that machine’s uncloudd daemon
- The daemon queries its local Corrosion database for cluster state
- Results are returned to the CLI
When you deploy a service:
- The CLI sends the service spec to the connected machine
- The daemon uses grpc-proxy to route container creation to target machines
- Each target machine starts containers and updates cluster state
- State changes propagate through Corrosion to all machines
- DNS servers and Caddy instances update automatically
You can connect to any machine in the cluster, because every machine holds the complete state and can execute any operation.
Resource Footprint
Uncloud is designed to be lightweight:
- uncloudd daemon: ~30-50 MB RAM
- uncloud-corrosion: ~20-30 MB RAM
- Caddy container: ~40-60 MB RAM
- Total overhead: ~150 MB RAM per machine
Further Reading
- Machines: Learn about machine lifecycle and subnet allocation
- Services: Understand service modes and container orchestration
- Networking: Deep dive into WireGuard mesh and IP addressing
- Design Document: Read the original design philosophy and goals
