This guide covers common issues you might encounter with Uncloud and how to resolve them.

Network connectivity problems

Machines can’t connect to each other

Symptoms: Containers on different machines can’t communicate, WireGuard tunnels not established
Diagnosis:
# Check machine status
uc machine ls

# SSH into a machine and check WireGuard
ssh USER@MACHINE_IP
sudo wg show
Solutions:
  1. Check firewall rules
    # Ensure UDP port 51820 is open
    sudo ufw status
    sudo ufw allow 51820/udp
    
  2. Verify WireGuard endpoints
    uc machine ls
    
    Check that the WIREGUARD ENDPOINTS column shows reachable addresses
  3. Update machine endpoints
    If a machine’s IP changed:
    uc machine update machine1 --endpoint NEW_IP:51820
    
  4. Check WireGuard interface
    ssh USER@MACHINE_IP
    ip addr show wg0
    
    The interface should have an IP like 10.210.X.1/24
  5. Restart Uncloud daemon
    sudo systemctl restart uncloud
    
Uncloud creates a WireGuard mesh where each machine connects to every other machine. If Machine A can’t reach Machine B:
  1. Machine A needs to know Machine B’s endpoint (IP:port)
  2. Machine B’s firewall must allow UDP port 51820
  3. NAT routers must allow UDP hole punching (most do)
Check connectivity:
# From Machine A, ping Machine B's WireGuard IP
ping 10.210.1.1
If pings fail, check:
  • Firewall rules on both machines
  • NAT configuration
  • WireGuard logs: sudo journalctl -u uncloud -f
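
The ping check above can be repeated for every peer in the mesh. The following is a minimal sketch, not part of Uncloud: `check_peers` and the `PING_CMD` override are names invented for this example, and the addresses you pass in are whatever WireGuard IPs your machines actually have.

```shell
#!/bin/sh
# Sketch: ping each WireGuard IP given as an argument and report failures.
# check_peers and PING_CMD are hypothetical names, not Uncloud features.
# PING_CMD can be overridden, for example to change the timeout.
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"

check_peers() {
    rc=0
    for ip in "$@"; do
        # $PING_CMD is intentionally unquoted so its options split into words.
        if $PING_CMD "$ip" >/dev/null 2>&1; then
            echo "ok   $ip"
        else
            echo "FAIL $ip"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `check_peers 10.210.0.1 10.210.1.1` prints one line per address and returns non-zero if any peer did not answer.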

Containers can’t resolve service names

Symptoms: DNS resolution fails inside containers, curl http://service.internal doesn’t work
Diagnosis:
# Check DNS configuration
uc service exec myservice cat /etc/resolv.conf

# Test DNS resolution
uc service exec myservice nslookup web-api.internal
Solutions:
  1. Verify the service exists
    uc service ls
    
  2. Check internal DNS server
    The container’s /etc/resolv.conf should list the machine’s WireGuard IP:
    nameserver 10.210.X.1
    
  3. Restart the container
    uc service scale myservice 0
    uc service scale myservice 1
    
  4. Check Uncloud daemon logs
    ssh USER@MACHINE_IP
    sudo journalctl -u uncloud | grep -i dns
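
The resolv.conf check in step 2 can be automated. This is a small sketch under the assumption that your cluster uses the 10.210.X.1 address pattern shown above; `uses_uncloud_dns` is a hypothetical helper name, not an Uncloud command.

```shell
#!/bin/sh
# Sketch: read resolv.conf content on stdin and succeed if any nameserver
# looks like an Uncloud machine address of the form 10.210.X.1.
# uses_uncloud_dns is a hypothetical helper name.
uses_uncloud_dns() {
    awk '$1 == "nameserver" && $2 ~ /^10\.210\.[0-9]+\.1$/ { found = 1 }
         END { exit found ? 0 : 1 }'
}
```

One way to use it is `uc service exec myservice cat /etc/resolv.conf | uses_uncloud_dns || echo "unexpected nameserver"`. If your cluster uses a different network range, the pattern needs adjusting.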
    

Containers can’t reach the internet

Symptoms: curl https://google.com fails inside containers
Diagnosis:
# Test from inside a container
uc service exec myservice ping 8.8.8.8
uc service exec myservice curl https://google.com
Solutions:
  1. Check NAT/masquerading
    ssh USER@MACHINE_IP
    sudo iptables -t nat -L POSTROUTING -v -n
    
    You should see a MASQUERADE rule for the Uncloud network
  2. Verify Docker network
    docker network inspect uncloud
    
    Check that EnableIPMasquerade is true
  3. Check DNS forwarding
    uc service exec myservice cat /etc/resolv.conf
    
    If only the Uncloud DNS server is listed, it should forward external queries
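
The two diagnosis commands above test different things: the ping targets an IP address directly, while curl also needs name resolution. The sketch below shows one way to read their exit codes together. `classify_egress` is a hypothetical helper, and it assumes the exit code of the command run inside the container reaches your shell unchanged, which this guide does not confirm.

```shell
#!/bin/sh
# Sketch: interpret the exit codes of the ping and curl diagnosis commands.
# classify_egress is a hypothetical helper name.
# curl exit code 6 means "could not resolve host".
classify_egress() {
    ping_rc=$1
    curl_rc=$2
    if [ "$curl_rc" -eq 0 ]; then
        echo "egress ok"
    elif [ "$ping_rc" -ne 0 ]; then
        echo "no route out: check NAT/masquerading"
    elif [ "$curl_rc" -eq 6 ]; then
        echo "DNS failure: check DNS forwarding"
    else
        echo "IP reachable but HTTPS failed (curl exit $curl_rc)"
    fi
}
```

Capture `$?` after each diagnosis command and pass both values in, for example `classify_egress "$ping_rc" "$curl_rc"`. A successful curl is treated as healthy even when ping fails, since some networks drop ICMP.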

Service deployment failures

Deployment hangs or times out

Symptoms: uc deploy or uc run never completes
Diagnosis:
# Check service status
uc service ls

# Check container state
uc service inspect myservice

# View logs
uc service logs myservice
Solutions:
  1. Image pull failures
    If the image is private or large:
    # Check Docker logs
    ssh USER@MACHINE_IP
    sudo journalctl -u docker -f
    
    Solution: Use Unregistry for faster local image distribution:
    uc build -t myapp .
    uc push myapp
    uc run myapp
    
  2. Container crashes on startup
    uc service logs myservice
    
    Look for error messages in the logs
  3. Resource constraints
    ssh USER@MACHINE_IP
    free -h
    df -h
    
    Check if the machine has enough memory or disk space
  4. Machine unreachable
    uc machine ls
    
    If a machine shows “Down”, try:
    ssh USER@MACHINE_IP sudo systemctl restart uncloud
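
For the resource check in step 3, scanning df output by eye gets tedious on machines with many mounts. The following sketch filters the POSIX-format output of `df -P` for file systems above a usage threshold. `full_mounts` is a hypothetical helper name, and the sketch assumes mount points without spaces.

```shell
#!/bin/sh
# Sketch: read `df -P` output on stdin and print mount points whose usage
# is at or above a threshold percentage. full_mounts is a hypothetical name.
full_mounts() {
    threshold=$1
    awk -v t="$threshold" 'NR > 1 {
        # Column 5 is Capacity (e.g. 95%), column 6 is the mount point.
        pct = $5
        sub(/%$/, "", pct)
        if (pct + 0 >= t + 0) print $6 " " $5
    }'
}
```

Running `df -P | full_mounts 90` on the machine lists only the file systems that are at least 90% full.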
    

Port already in use

Symptoms: Error like “bind: address already in use”
Diagnosis:
ssh USER@MACHINE_IP
sudo netstat -tulpn | grep :80
sudo netstat -tulpn | grep :443
Solutions:
  1. Stop conflicting services
    # If nginx is running
    sudo systemctl stop nginx
    sudo systemctl disable nginx
    
  2. Use different ports
    Instead of port 80, use a different port:
    uc run -p 8080:80 myapp
    
  3. Remove old containers
    docker ps -a
    docker rm -f CONTAINER_ID
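
Grepping netstat output for `:80` also matches ports such as 8080, so it can help to match the port exactly. This is a sketch only: `port_pids` is a hypothetical helper name, and it assumes the usual `netstat -tulpn` column layout where the fourth field is the local address.

```shell
#!/bin/sh
# Sketch: read `netstat -tulpn` output on stdin and print the unique PIDs
# bound to the given port. port_pids is a hypothetical helper name.
port_pids() {
    awk -v port="$1" '
        # Field 4 is the local address, e.g. 0.0.0.0:80 or :::80.
        $4 ~ (":" port "$") {
            # The PID/program column position differs between TCP and UDP
            # rows, so scan for the first field shaped like "1234/name".
            for (i = 5; i <= NF; i++) {
                if ($i ~ /^[0-9]+\//) {
                    split($i, parts, "/")
                    print parts[1]
                    break
                }
            }
        }' | sort -u
}
```

For example, `sudo netstat -tulpn | port_pids 80` prints the PID of each process listening on port 80, once per process even when it listens on both IPv4 and IPv6.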
    

Replicas not spreading across machines

Symptoms: All replicas scheduled on one machine
Diagnosis:
uc service inspect myservice
Solutions:
  1. Check machine availability
    uc machine ls
    
    Ensure machines are in “Up” state
  2. Use placement constraints
    services:
      web:
        image: myapp
        deploy:
          mode: replicated
          replicas: 3
          placement:
            machines:
              - machine1
              - machine2
              - machine3
    
  3. Check machine resources
    Uncloud’s scheduler prefers machines with more available resources

Certificate issues

Let’s Encrypt certificate not obtained

Symptoms: HTTPS doesn’t work, browser shows “Not Secure” warning
Diagnosis:
# Check Caddy logs
uc service logs caddy | grep -i acme

# View Caddy config
uc caddy config
Solutions:
  1. Verify DNS resolution
    dig app.example.com
    
    DNS must point to your machine’s public IP
  2. Check port 80 accessibility
    Let’s Encrypt uses HTTP-01 challenge on port 80:
    curl http://app.example.com/.well-known/acme-challenge/test
    
  3. Check firewall rules
    ssh USER@MACHINE_IP
    sudo ufw status
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp
    
  4. Wait for DNS propagation
    DNS changes can take up to 48 hours. Check with:
    dig app.example.com @8.8.8.8
    
  5. Check rate limits
    Let’s Encrypt has rate limits. If exceeded:
    uc caddy deploy --caddyfile Caddyfile.staging
    
    Use the staging environment in your Caddyfile:
    {
      acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
    }
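
The DNS checks in steps 1 and 4 come down to one question: does the name resolve to the machine’s public IP? A minimal sketch of that comparison follows. `resolves_to` is a hypothetical helper name, and 203.0.113.10 in the usage line is a documentation placeholder for your own address.

```shell
#!/bin/sh
# Sketch: read resolved addresses on stdin (one per line, as printed by
# `dig +short`) and succeed only if the expected IP is among them.
# resolves_to is a hypothetical helper name.
resolves_to() {
    # -x matches whole lines, -F treats the IP as a literal string.
    grep -qxF "$1"
}
```

For example, `dig +short app.example.com @8.8.8.8 | resolves_to 203.0.113.10 || echo "DNS does not point at this machine yet"`.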
    

Certificate expired

Symptoms: Browser shows “Your connection is not private” error
Diagnosis:
# List stored certificate files (timestamps show when each was last written)
uc service exec caddy ls -la /data/caddy/certificates/
Solutions:
  1. Force renewal
    # Restart Caddy to trigger renewal
    uc service scale caddy 0
    uc service scale caddy 1
    
  2. Check Caddy logs
    uc service logs caddy | grep -i renew
    
  3. Delete old certificate
    uc service exec caddy rm -rf /data/caddy/certificates/acme-v02.api.letsencrypt.org-directory/app.example.com
    
    Then restart Caddy
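
Listing the certificate directory does not show the expiry date itself. One way to read it from the certificate the server actually presents is with openssl, sketched below. `cert_not_after` and `strip_label` are hypothetical helper names, and the sketch assumes openssl is installed where you run it and that the host is reachable on port 443.

```shell
#!/bin/sh
# Sketch: print the expiry date of the certificate a host actually serves.
# cert_not_after and strip_label are hypothetical helper names.

# Turn "notAfter=Jun  1 12:00:00 2025 GMT" into just the date.
strip_label() {
    sed -n 's/^notAfter=//p'
}

cert_not_after() {
    host=$1
    # s_client fetches the served certificate; x509 -enddate prints notAfter.
    openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null |
        openssl x509 -noout -enddate |
        strip_label
}
```

For example, `cert_not_after app.example.com` prints the date after which browsers will reject the certificate.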

Cluster state issues

Machine shows as “Down” but it’s running

Symptoms: uc machine ls shows a machine as Down, but you can SSH into it
Diagnosis:
# Check daemon status on the machine
ssh USER@MACHINE_IP
sudo systemctl status uncloud
sudo systemctl status uncloud-corrosion
Solutions:
  1. Restart services
    sudo systemctl restart uncloud
    sudo systemctl restart uncloud-corrosion
    
  2. Check Corrosion state
    sudo journalctl -u uncloud-corrosion -f
    
    Look for replication errors
  3. Verify cluster connectivity
    sudo wg show
    
    Check that WireGuard peers are connected
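
A common signal for whether a WireGuard peer is connected is the age of its last handshake. The sketch below flags peers with old or missing handshakes. `stale_peers` is a hypothetical helper name, the 180-second cutoff is an assumption, and the interface name wg0 is taken from the earlier example in this guide.

```shell
#!/bin/sh
# Sketch: read `wg show wg0 latest-handshakes` output on stdin
# (public key, then Unix timestamp) and print peers whose last handshake
# is older than max_age seconds, or that never completed one (timestamp 0).
# stale_peers is a hypothetical helper name; 180 seconds is an assumed cutoff.
stale_peers() {
    now=$1
    max_age=${2:-180}
    awk -v now="$now" -v max="$max_age" '
        NF >= 2 {
            ts = $NF
            key = $(NF - 1)
            if (ts == 0) print key " never"
            else if (now - ts > max) print key " " (now - ts) "s ago"
        }'
}
```

For example, `sudo wg show wg0 latest-handshakes | stale_peers "$(date +%s)"`. An idle peer can legitimately show an old handshake, so treat the output as a hint and confirm with a ping.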

Services not showing up after deployment

Symptoms: uc service ls doesn’t show a newly deployed service
Diagnosis:
# Check deployment status
uc service ls
uc service inspect SERVICE_NAME
Solutions:
  1. Wait for state propagation
    The cluster uses eventual consistency. Wait 10-30 seconds and retry:
    uc service ls
    
  2. Check daemon logs
    ssh USER@MACHINE_IP
    sudo journalctl -u uncloud | tail -50
    
  3. Verify Corrosion is running
    sudo systemctl status uncloud-corrosion
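
Because state propagation takes a few seconds, the "wait and retry" advice in step 1 can be scripted. The following is a generic retry loop, not an Uncloud feature: `retry` is a hypothetical helper name, and using it with uc assumes the uc command exits non-zero when the service is not found, which this guide does not confirm.

```shell
#!/bin/sh
# Sketch: run a command up to `attempts` times, sleeping `delay` seconds
# between tries, and stop at the first success.
# retry is a hypothetical helper name, not an Uncloud command.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        [ "$n" -lt "$attempts" ] && sleep "$delay"
        n=$((n + 1))
    done
    return 1
}
```

For example, `retry 6 5 uc service inspect SERVICE_NAME` tries six times with five-second pauses, which covers most of the 10-30 second propagation window.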
    

Debug commands

Useful commands for debugging issues:

Network debugging

# Ping another machine's WireGuard IP
ping 10.210.1.1

# Test DNS resolution
nslookup service.internal
dig service.internal

# Check routing table
ip route show

# Check WireGuard status
sudo wg show

# Test container connectivity
uc service exec myservice ping 10.210.1.5
uc service exec myservice curl http://other-service.internal:8000

Service debugging

# View service details
uc service inspect myservice

# Check container logs
uc service logs myservice
uc service logs -f myservice  # Follow
uc service logs --since 1h myservice  # Last hour

# Execute commands in container
uc service exec myservice ps aux
uc service exec myservice env
uc service exec myservice cat /etc/resolv.conf

# Check port bindings
uc service exec myservice netstat -tulpn

Machine debugging

# Check machine status
uc machine ls

# View daemon logs
ssh USER@MACHINE_IP sudo journalctl -u uncloud -f

# Check Corrosion logs
ssh USER@MACHINE_IP sudo journalctl -u uncloud-corrosion -f

# Check Docker logs
ssh USER@MACHINE_IP sudo journalctl -u docker -f

# View system resources
ssh USER@MACHINE_IP free -h
ssh USER@MACHINE_IP df -h
ssh -t USER@MACHINE_IP top  # -t allocates a terminal, which top needs

Caddy debugging

# View Caddy config
uc caddy config

# Check Caddy logs
uc service logs caddy
uc service logs caddy | grep -i error

# Check certificate files
uc service exec caddy ls -la /data/caddy/certificates/

# Test Caddy admin API
uc service exec caddy curl --unix-socket /run/caddy/admin.sock http://localhost/config/

Where to get help

GitHub Issues

Report bugs and request features on GitHub: github.com/psviderski/uncloud/issues
When opening an issue, include:
  • Uncloud version (uc version)
  • Machine OS and version
  • Complete error messages
  • Steps to reproduce
  • Relevant logs

Discord Community

Join the Uncloud Discord for:
  • Quick questions
  • General discussions
  • Community support
  • Feature ideas
Discord: discord.gg/eR35KQJhPu

GitHub Discussions

For longer-form discussions, use GitHub Discussions: github.com/psviderski/uncloud/discussions
Good for:
  • How-to questions
  • Architecture discussions
  • Sharing setups and configurations
  • Feature proposals

Common error messages

Cause: Another process is using the port
Solution:
# Find the process
sudo netstat -tulpn | grep :PORT

# Stop it
sudo kill PROCESS_ID

# Or use a different port
uc run -p 8080:80 myapp
Cause: SSH connection failed
Solutions:
  1. Check SSH access manually by connecting to the machine with ssh
  2. Verify SSH key:
    uc machine add USER@MACHINE_IP --ssh-key ~/.ssh/id_rsa
    
  3. Check firewall rules:
    sudo ufw allow 22/tcp
    
Cause: Machine doesn’t exist in the cluster
Solution:
# List all machines
uc machine ls

# Add the machine
uc machine add USER@MACHINE_IP --name machine1
Cause: Service doesn’t exist or hasn’t propagated yet
Solutions:
  1. List all services:
    uc service ls
    
  2. Wait for state propagation (10-30 seconds)
  3. Check service name spelling
Cause: No machines have public IPs or are behind firewalls
Solutions:
  1. Set public IP on a machine:
    uc machine update machine1 --public-ip 203.0.113.10
    
  2. Open firewall ports (80, 443):
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp
    
  3. Configure port forwarding if behind NAT
Cause: Hit a Let’s Encrypt rate limit (50 certificates per registered domain per week)
Solutions:
  1. Use staging environment:
    {
      acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
    }
    
  2. Wait for quota reset (1 week)
  3. Use DNS challenge instead of HTTP challenge (requires custom Caddyfile)

Next steps

Machine Management

Learn about machine operations

Monitoring

Set up monitoring and logging
