Network connectivity problems
Machines can’t connect to each other
Symptoms: Containers on different machines can’t communicate; WireGuard tunnels are not established.
Diagnosis:
- Check firewall rules
- Verify WireGuard endpoints: check that the WIREGUARD ENDPOINTS column shows reachable addresses.
- Update machine endpoints: if a machine’s IP changed, its endpoints need to be updated.
- Check WireGuard interface: the interface should have an IP like 10.210.X.1/24.
- Restart Uncloud daemon
Understanding WireGuard connectivity
Uncloud creates a WireGuard mesh where each machine connects to every other machine. If Machine A can’t reach Machine B, verify these requirements:
- Machine A needs to know Machine B’s endpoint (IP:port)
- Machine B’s firewall must allow UDP port 51820
- NAT routers must allow UDP hole punching (most do)
If pings between the machines fail, check:
- Firewall rules on both machines
- NAT configuration
- WireGuard logs: sudo journalctl -u uncloud -f
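One way to see which tunnels are actually up is to look at WireGuard handshake times. The sketch below is not part of Uncloud: `stale_peers` is a hypothetical helper that reads the output of `wg show all latest-handshakes` and prints peers that have never completed a handshake or whose last handshake is older than a threshold. The 180-second default is an assumption, not an Uncloud setting.

```shell
# stale_peers NOW [MAX_AGE]
# Hypothetical helper, not an Uncloud command.
# Reads `wg show all latest-handshakes` output on stdin, one line per peer:
#   <interface> <public-key> <unix-timestamp>
# A timestamp of 0 means the peer has never completed a handshake.
# Prints "<interface> <public-key>" for every peer that looks stale.
stale_peers() {
  awk -v now="$1" -v max="${2:-180}" \
    '$3 == 0 || now - $3 > max { print $1, $2 }'
}
```

Possible usage on a machine: `sudo wg show all latest-handshakes | stale_peers "$(date +%s)"`. Any peer it prints is a good candidate for the firewall and NAT checks above.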
Containers can’t resolve service names
Symptoms: DNS resolution fails inside containers; curl http://service.internal doesn’t work
Diagnosis:
- Verify the service exists
- Check internal DNS server: the container’s /etc/resolv.conf should list the machine’s WireGuard IP.
- Restart the container
- Check Uncloud daemon logs
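For the internal DNS server check, the comparison can be scripted. This is a minimal sketch: `lists_nameserver` is a hypothetical helper, and the address 10.210.0.1 in the usage line is only a sample that follows the 10.210.X.1 pattern; substitute the machine's real WireGuard IP.

```shell
# lists_nameserver IP
# Hypothetical helper. Reads resolv.conf content on stdin and exits with
# status 0 if IP appears on a "nameserver" line, non-zero otherwise.
lists_nameserver() {
  awk -v ip="$1" \
    '$1 == "nameserver" && $2 == ip { found = 1 } END { exit !found }'
}
```

Possible usage: `docker exec <container> cat /etc/resolv.conf | lists_nameserver 10.210.0.1 && echo "DNS server configured"`.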
Containers can’t reach the internet
Symptoms: curl https://google.com fails inside containers
Diagnosis:
- Check NAT/masquerading: you should see a MASQUERADE rule for the Uncloud network.
- Verify Docker network: check that EnableIPMasquerade is true.
- Check DNS forwarding: if only the Uncloud DNS server is listed, it should forward external queries.
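The NAT/masquerading check can be narrowed to the rules that matter. The sketch below assumes the rules are visible through `iptables -t nat -S` and that the Uncloud network uses addresses starting with 10.210. (taken from the 10.210.X.1/24 pattern earlier in this guide); `masquerade_rules` is a hypothetical helper, not an Uncloud command.

```shell
# masquerade_rules [PREFIX]
# Hypothetical helper. Reads `iptables -t nat -S` output on stdin and prints
# the MASQUERADE rules that mention PREFIX (default "10.210.", an assumption).
# Exits non-zero when no such rule is found.
masquerade_rules() {
  grep -e '-j MASQUERADE' | grep -F -- "${1:-10.210.}"
}
```

Possible usage: `sudo iptables -t nat -S POSTROUTING | masquerade_rules`. No output suggests that outbound traffic from containers is not being NATed.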
Service deployment failures
Deployment hangs or times out
Symptoms: uc deploy or uc run never completes
Diagnosis:
- Image pull failures: pulls can stall if the image is private or large. Solution: use Unregistry for faster local image distribution.
- Container crashes on startup: look for error messages in the logs.
- Resource constraints: check if the machine has enough memory or disk space.
- Machine unreachable: if a machine shows “Down”, see “Machine shows as “Down” but it’s running” under Cluster state issues.
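For the resource constraints step, disk space is easy to check mechanically. A minimal sketch, assuming POSIX `df -P` output; `full_filesystems` is a hypothetical helper and the 90% default threshold is arbitrary. It does not handle mount points that contain spaces.

```shell
# full_filesystems [THRESHOLD]
# Hypothetical helper. Reads `df -P` output on stdin and prints the mount
# point and usage of every filesystem at or above THRESHOLD percent
# (default 90). The header line is skipped.
full_filesystems() {
  awk -v t="${1:-90}" \
    'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= t + 0) print $6, $5 }'
}
```

Possible usage on the target machine: `df -P | full_filesystems 90`. Memory can be inspected alongside it with `free -m`.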
Port already in use
Symptoms: Error like “bind: address already in use”
Diagnosis:
- Stop conflicting services
- Use different ports: instead of port 80, use a different port.
- Remove old containers
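To find out which process holds a port before stopping anything, the listening sockets can be filtered by port number. The sketch below assumes the column layout of `ss -tlnp` (State, Recv-Q, Send-Q, Local Address:Port, ...); `listeners_on` is a hypothetical helper.

```shell
# listeners_on PORT
# Hypothetical helper. Reads `ss -tlnp` output on stdin and prints the
# listening sockets bound to PORT. The port is taken as the text after the
# last ":" in the local address, so IPv6 addresses such as [::]:80 also work.
listeners_on() {
  awk -v port="$1" \
    '$1 == "LISTEN" { n = split($4, a, ":"); if (a[n] == port) print }'
}
```

Possible usage: `sudo ss -tlnp | listeners_on 80`. Running `ss` with sudo lets the last column show the owning process for sockets of other users.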
Replicas not spreading across machines
Symptoms: All replicas scheduled on one machine
Diagnosis:
- Check machine availability: ensure machines are in “Up” state.
- Use placement constraints
- Check machine resources: Uncloud’s scheduler prefers machines with more available resources.
Certificate issues
Let’s Encrypt certificate not obtained
Symptoms: HTTPS doesn’t work; browser shows “Not Secure” warning
Diagnosis:
- Verify DNS resolution: DNS must point to your machine’s public IP.
- Check port 80 accessibility: Let’s Encrypt uses the HTTP-01 challenge on port 80.
- Check firewall rules
- Wait for DNS propagation: DNS changes can take up to 48 hours to propagate, so check what the domain currently resolves to.
- Check rate limits: Let’s Encrypt has rate limits. If they are exceeded, use the staging environment in your Caddyfile.
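For the DNS resolution and propagation checks in this section, the comparison against the machine's public IP can be scripted. A minimal sketch: `resolves_to` is a hypothetical helper, and the domain and address in the usage line are placeholders from the documentation ranges.

```shell
# resolves_to EXPECTED_IP
# Hypothetical helper. Reads `dig +short A <domain>` output on stdin (one
# address per line) and exits with status 0 if EXPECTED_IP is among them.
resolves_to() {
  grep -Fxq -- "$1"
}
```

Possible usage: `dig +short A app.example.com | resolves_to 203.0.113.10 && echo "DNS points at this machine"`. If the check fails right after a DNS change, propagation may simply not have finished yet.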
Certificate expired
Symptoms: Browser shows “Your connection is not private” error
Diagnosis:
- Force renewal
- Check Caddy logs
- Delete old certificate, then restart Caddy
Cluster state issues
Machine shows as “Down” but it’s running
Symptoms: uc machine ls shows a machine as Down, but you can SSH into it
Diagnosis:
- Restart services
- Check Corrosion state: look for replication errors.
- Verify cluster connectivity: check that WireGuard peers are connected.
Services not showing up after deployment
Symptoms: uc service ls doesn’t show a newly deployed service
Diagnosis:
- Wait for state propagation: the cluster uses eventual consistency. Wait 10-30 seconds and retry.
- Check daemon logs
- Verify Corrosion is running
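Because state propagates with eventual consistency, a bounded retry loop is often more convenient than re-running a command by hand. The sketch below is a generic POSIX shell helper, not an Uncloud feature; `retry_until` and the service name `my-service` in the usage line are hypothetical.

```shell
# retry_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Hypothetical helper. Runs COMMAND every INTERVAL seconds until it succeeds
# or until roughly TIMEOUT seconds have been spent waiting.
# Returns 0 on success and 1 on timeout.
retry_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  while ! "$@"; do
    elapsed=$((elapsed + interval))
    if [ "$elapsed" -gt "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}
```

Possible usage: `retry_until 30 5 sh -c 'uc service ls | grep -q my-service'`. If the service still does not appear after the timeout, continue with the daemon logs.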
Debug commands
Useful commands for debugging issues:
Network debugging
Service debugging
Machine debugging
Caddy debugging
Where to get help
GitHub Issues
Report bugs and request features on GitHub: github.com/psviderski/uncloud/issues
When opening an issue, include:
- Uncloud version (uc version)
- Machine OS and version
- Complete error messages
- Steps to reproduce
- Relevant logs
Discord Community
Join the Uncloud Discord for:
- Quick questions
- General discussions
- Community support
- Feature ideas
GitHub Discussions
For longer-form discussions: github.com/psviderski/uncloud/discussions
Good for:
- How-to questions
- Architecture discussions
- Sharing setups and configurations
- Feature proposals
Common error messages
Error: bind: address already in use
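Cause: Another process is using the port
Solution: Stop the conflicting service or use a different port, as described under “Port already in use” above.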
Error: failed to connect to machine
Cause: SSH connection failed
Solutions:
- Check SSH access manually
- Verify SSH key
- Check firewall rules
Error: machine not found
Cause: Machine doesn’t exist in the cluster
Solution: List the machines in the cluster with uc machine ls and check the machine name.
Error: service not found
Cause: Service doesn’t exist or hasn’t propagated yet
Solutions:
- List all services: uc service ls
- Wait for state propagation (10-30 seconds)
- Check service name spelling
Error: no internet-reachable machines
Cause: No machine has a public IP, or the machines are behind firewalls
Solutions:
- Set public IP on a machine
- Open firewall ports (80, 443)
- Configure port forwarding if behind NAT
Error: rate limit exceeded
Cause: Hit the Let’s Encrypt rate limit (50 certificates per registered domain per week)
Solutions:
- Use staging environment
- Wait for quota reset (1 week)
- Use DNS challenge instead of HTTP challenge (requires custom Caddyfile)
Next steps
Machine Management
Learn about machine operations
Monitoring
Set up monitoring and logging
