Troubleshooting Guide

Overview

This guide helps you diagnose and fix common issues with your Headscale deployment. Use the quick diagnostics section first, then refer to specific problem categories.

Quick Diagnostics

Run these commands to quickly assess system health:

# Check service status
docker compose ps

# Test health endpoint
curl http://localhost:8000/health

# View recent logs
docker compose logs --tail 50 headscale

# Check database connectivity
docker exec headscale-db pg_isready -U headscale

# Verify nodes
docker exec headscale headscale nodes list

A healthy instance answers the health check with:

{
  "status": "pass"
}
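
The health check can also be scripted so that it yields an exit code instead of JSON to read by eye. The following is a minimal sketch, not part of the deployment itself: `health_ok` is a hypothetical helper name, and the pattern assumes the response contains a `"status": "pass"` field as shown in this guide.

```bash
# health_ok: read a /health response on stdin and succeed only when it
# reports "status": "pass" (whitespace around the colon is tolerated).
health_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"pass"'
}

# Example (assumes the port mapping used in this guide):
#   curl -fsS http://localhost:8000/health | health_ok && echo healthy
```

Because the function only inspects stdin, the same check works against the public URL or from inside another container.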

Service Issues

Headscale Won’t Start

Database Connection Failed

Symptoms:

ERROR failed to connect to database
ERROR dial tcp 172.19.0.2:5432: connection refused

Diagnosis:

# Check PostgreSQL status
docker compose ps postgres

# Check database logs
docker compose logs postgres

# Test connection
docker exec headscale-db pg_isready -U headscale

Solutions:

Verify credentials match:

# Check .env
grep POSTGRES_PASSWORD .env

# Check config.yaml
grep "pass:" config/config.yaml
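
The two `grep` commands print the password to the terminal and leave the comparison to the reader. As a rough alternative, the sketch below compares the values without displaying them. The function names are hypothetical. It assumes a plain `POSTGRES_PASSWORD=value` line in `.env` and a single `pass:` key in `config.yaml`, and it strips quote characters from both values, so it is not suitable for passwords that themselves contain quotes.

```bash
# Print the value of POSTGRES_PASSWORD from a .env-style file.
env_password() {
  sed -n 's/^POSTGRES_PASSWORD=//p' "$1" | head -n 1 | tr -d "\"'"
}

# Print the value of the first "pass:" key in a YAML file.
config_password() {
  sed -n 's/^[[:space:]]*pass:[[:space:]]*//p' "$1" | head -n 1 | tr -d "\"'"
}

# Succeed only when both files hold the same non-empty password.
passwords_match() {
  a=$(env_password "$1")
  b=$(config_password "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example:
#   passwords_match .env config/config.yaml && echo "credentials match"
```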

Restart database:

docker compose restart postgres
# Wait for health check
sleep 10
docker compose restart headscale
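
A fixed `sleep 10` can be too short on a slow host and is wasted time on a fast one. One possible alternative is to poll until the database reports ready. This is a sketch only: `wait_for` is a hypothetical helper, and the one-second polling interval is an arbitrary choice.

```bash
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-run COMMAND about once per second until it succeeds or the
# attempt budget (TIMEOUT_SECONDS tries) is used up.
wait_for() {
  remaining=$1
  shift
  until "$@" >/dev/null 2>&1; do
    remaining=$((remaining - 1))
    [ "$remaining" -le 0 ] && return 1
    sleep 1
  done
}

# Example (container name as used in this guide):
#   docker compose restart postgres
#   wait_for 30 docker exec headscale-db pg_isready -U headscale \
#     && docker compose restart headscale
```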

Check database initialization:

# View PostgreSQL logs during startup
docker compose logs postgres | grep "database system is ready"

Port Already in Use

Symptoms:

Error starting userland proxy: listen tcp 0.0.0.0:80: bind: address already in use

Diagnosis:

# Check what's using the port
sudo lsof -i :80
sudo lsof -i :443
sudo lsof -i :9090

# Or with netstat
sudo netstat -tulpn | grep :80

Solutions:

Stop conflicting service:

# Apache
sudo systemctl stop apache2

# nginx (system)
sudo systemctl stop nginx

# Other Headscale instance
docker ps -a | grep headscale
docker compose -f /other/path/docker-compose.yml down

Change ports in docker-compose.yml:

nginx:
  ports:
    - "8000:80"  # Use port 8000 instead
    - "8443:443"

Volume Mount Permissions

Symptoms:

ERROR failed to open database file: permission denied
ERROR cannot write to /var/lib/headscale

Diagnosis:

# Check ownership
ls -la data/ config/

# Check container user
docker exec headscale id

Solutions:

# Fix ownership
sudo chown -R $USER:$USER data/ config/

# Fix permissions
chmod 700 data/
chmod 600 config/config.yaml

# Restart
docker compose restart headscale

Configuration Syntax Error

Symptoms:

ERROR failed to parse configuration
ERROR yaml: unmarshal errors

Diagnosis:

# Validate YAML syntax
python3 -c "import yaml; yaml.safe_load(open('config/config.yaml'))"

# Or use yq
yq eval config/config.yaml
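
Parser errors do not always point at the offending line. Since tab characters are a frequent cause, a quick scan for them can help. The helper below is an illustrative sketch and `find_tabs` is a hypothetical name.

```bash
# find_tabs FILE: print "line-number:content" for every line containing a tab.
# Exits 0 when tabs were found and 1 when the file is clean (grep semantics).
find_tabs() {
  tab=$(printf '\t')
  grep -n "$tab" "$1"
}

# Example:
#   find_tabs config/config.yaml || echo "no tabs found"
```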

Solutions:

Check for common YAML errors:
- Incorrect indentation (use spaces, not tabs)
- Missing colons
- Unquoted special characters
- Mismatched brackets

Compare with working example:

diff config/config.yaml config/config.yaml.example

Restore from backup:

cp config/config.yaml.bak config/config.yaml
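
Restoring only works if a `.bak` copy exists. The sketch below shows one way to take that copy before each edit. `snapshot_file` is a hypothetical helper, and it overwrites any earlier `.bak` file.

```bash
# snapshot_file FILE: copy FILE to FILE.bak, preserving mode and timestamps,
# so there is a known-good copy to restore from.
snapshot_file() {
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  cp -p "$1" "$1.bak"
}

# Example:
#   snapshot_file config/config.yaml && nano config/config.yaml
```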

nginx Issues

502 Bad Gateway

Symptoms: Browser shows “502 Bad Gateway” when accessing Headscale.

Diagnosis:

# Check if Headscale is running
docker compose ps headscale

# Check nginx logs
docker compose logs nginx | grep error

# Test backend connectivity
docker exec nginx wget -qO- http://headscale:8080/health

Solutions:

Verify Headscale is healthy:

docker compose logs headscale | tail -20
docker exec headscale headscale health

Check network connectivity:

# Verify containers are on same network
docker network inspect headscale_headscale-network

# Test DNS resolution
docker exec nginx nslookup headscale

Restart services:

docker compose restart headscale nginx

SSL Certificate Errors

Symptoms:

SSL certificate problem: unable to get local issuer certificate
nginx: [emerg] cannot load certificate

Diagnosis:

# Check certificate files
ls -la certbot/conf/live/*/

# Verify certificate validity
openssl x509 -in certbot/conf/live/headscale.example.com/fullchain.pem -noout -dates

# Test SSL configuration
docker exec nginx nginx -t

Solutions:

Obtain certificate:

# Initialize Let's Encrypt
docker compose run --rm certbot certonly \
  --webroot \
  --webroot-path=/var/www/certbot \
  --email YOUR_EMAIL \
  --agree-tos \
  -d headscale.example.com

Fix permissions:

chmod 644 certbot/conf/live/*/fullchain.pem
chmod 600 certbot/conf/live/*/privkey.pem

Use HTTP for development:

# Create development override
cp docker-compose.override.example.yml docker-compose.override.yml
docker compose up -d

404 Not Found

Symptoms: All requests return 404, especially Headplane at /admin.

Solutions:

Verify URL includes trailing slash:

✅ http://localhost:3001/admin/
❌ http://localhost:3001/admin

Check nginx location blocks:

# Verify proxy_pass configuration
location /admin/ {
    proxy_pass http://headplane:3000/admin/;
}

Access Headplane directly:

http://localhost:3001/admin/

Headplane Issues

Headplane Won't Load

Symptoms: Headplane shows a blank page or a connection error.

Diagnosis:

# Check container status
docker compose ps headplane

# Check logs
docker compose logs headplane --tail 50

# Test health endpoint
curl http://localhost:3001/health

Solutions:

Verify API key configuration:

# Check if API key is set
grep api_key headplane/config.yaml

# Generate new API key if needed
docker exec headscale headscale apikeys create --expiration 999d

# Update headplane/config.yaml
nano headplane/config.yaml

Check cookie secret length:

headplane/config.yaml

server:
  cookie_secret: "exactly32characterslong123456789"  # Must be exactly 32 characters
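
Counting 32 characters by eye is error-prone, so a length check may be worth running before restarting Headplane. The sketch below is illustrative: `secret_len_ok` is a hypothetical helper, and the `openssl` command in the comment is just one way to produce a value of the right length.

```bash
# secret_len_ok SECRET: succeed only when SECRET is exactly 32 characters.
secret_len_ok() {
  [ "${#1}" -eq 32 ]
}

# One way to generate a 32-character value (16 random bytes as hex):
#   openssl rand -hex 16
#
# Example:
#   secret_len_ok "$(openssl rand -hex 16)" && echo "length ok"
```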

Verify Headscale connectivity:

# Test from Headplane container
docker exec headplane wget -qO- http://headscale:8080/health

Restart Headplane:

docker compose restart headplane

Favicon 404 Errors

Symptoms: Logs show repeated 404 errors for favicon.ico

Error: No route matches URL "/admin/favicon.ico"

Solution: These errors are cosmetic and can be safely ignored. To silence them, mount a custom favicon:

docker-compose.yml

headplane:
  volumes:
    - ./headplane:/etc/headplane
    - ./favicon.ico:/app/public/favicon.ico:ro  # Add custom favicon

API Authentication Failed

Symptoms:

Authentication failed: invalid API key
Unauthorized: bearer token missing

Solutions:

Verify API key is valid:

# List API keys
docker exec headscale headscale apikeys list

# Test API key
curl -H "Authorization: Bearer YOUR_API_KEY" \
  http://localhost:8000/api/v1/user

Regenerate and update:

# Generate new key
new_key=$(docker exec headscale headscale apikeys create --expiration 999d | grep -o 'hs_[a-zA-Z0-9_-]*')

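# Update the existing api_key line in the Headplane config
# (appending a second api_key entry would leave the old value in place)
sed -i "s|api_key:.*|api_key: \"$new_key\"|" headplane/config.yaml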

# Restart
docker compose restart headplane

Node Connection Issues

Nodes Won’t Connect

Authentication Failed

Symptoms:

Authentication failed: invalid key
Node registration rejected

Solutions:

Verify pre-auth key:

# List keys for user
docker exec headscale headscale preauthkeys list --user myuser

# Check expiration
# Create new key if expired
docker exec headscale headscale preauthkeys create \
  --user myuser \
  --reusable \
  --expiration 24h

Verify server URL:

# On client
tailscale status

# Should show your server URL
# If not, reconnect:
tailscale up --login-server https://headscale.example.com --authkey KEY

Cannot Reach Server

Symptoms:

failed to connect to control server
dial tcp: lookup headscale.example.com: no such host

Diagnosis:

# Check DNS resolution
nslookup headscale.example.com
dig headscale.example.com

# Test connectivity
curl https://headscale.example.com/health

# Check firewall
sudo iptables -L -n | grep -E "80|443"

Solutions:

Verify DNS records:

# Should return your server IP
dig +short headscale.example.com

Check firewall:

# Allow HTTP/HTTPS
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw status

Test from client:

# Should return {"status":"pass"}
curl -k https://headscale.example.com/health

Node Registered but Offline

Symptoms: Node appears in list but shows as offline

docker exec headscale headscale nodes list
# ID | Hostname  | Status
# 1  | laptop    | offline

Solutions:

Check node status on client:

tailscale status
tailscale netcheck

Verify routes:

# On server
docker exec headscale headscale routes list

# Enable if needed
docker exec headscale headscale routes enable --route-id ID

Force reconnection on client:

tailscale down
tailscale up --login-server https://headscale.example.com

Network Connectivity

Nodes Can't Ping Each Other

Diagnosis:

# From one node, try to ping another
ping 100.64.0.2

# Check routes
tailscale status

# Check ACL policies
cat config/policy.json

Solutions:

Verify ACL allows traffic:

config/policy.json

{
  "acls": [
    {
      "action": "accept",
      "src": ["*"],
      "dst": ["*:*"]
    }
  ]
}

Check firewall on nodes:

# Allow Tailscale interface
sudo ufw allow in on tailscale0

Verify IP addresses:

# On server
docker exec headscale headscale nodes list

# On client
ip addr show tailscale0

MagicDNS Not Working

Symptoms:

ping hostname.headscale.net
# ping: unknown host

Diagnosis:

# Check DNS configuration
grep -A5 "dns:" config/config.yaml

# On client
tailscale status
resolvectl status tailscale0  # Linux
scutil --dns | grep tailscale  # macOS

Solutions:

Enable MagicDNS:

config/config.yaml

dns:
  magic_dns: true
  base_domain: headscale.net

Restart Headscale:

docker compose restart headscale

Reconnect clients:

tailscale down && tailscale up --accept-dns

DERP Relay Issues

Symptoms:

failed to connect to DERP server
no DERP home; connections may be slow

Diagnosis:

# Check DERP configuration
grep -A10 "derp:" config/config.yaml

# Test from client
tailscale netcheck

Solutions:

Use Tailscale’s DERP servers:

config/config.yaml

derp:
  urls:
    - https://controlplane.tailscale.com/derpmap/default
  auto_update_enabled: true
  update_frequency: 24h

Enable embedded DERP:

config/config.yaml

derp:
  server:
    enabled: true
    region_id: 999
    region_code: "custom"
    stun_listen_addr: "0.0.0.0:3478"

Restart and test:

docker compose restart headscale
# On client
tailscale netcheck

Database Issues

Database Connection Pool Exhausted

Symptoms:

ERROR database connection pool exhausted
ERROR too many clients already

Solutions:

Increase connection pool:

config/config.yaml

database:
  postgres:
    max_open_conns: 20  # Increase from 10
    max_idle_conns: 10
    conn_max_idle_time_secs: 3600

Check for connection leaks:

# Count active connections
docker exec headscale-db psql -U headscale -c \
  "SELECT count(*) FROM pg_stat_activity;"

# Kill idle connections
docker exec headscale-db psql -U headscale -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND state_change < current_timestamp - INTERVAL '5 minutes';"

Restart services:

docker compose restart headscale postgres

Database Corruption

Symptoms:

ERROR database disk image is malformed
ERROR invalid page header

For SQLite:

# Check integrity
docker exec headscale sqlite3 /var/lib/headscale/db.sqlite \
  "PRAGMA integrity_check;"

# Rebuild the database file (VACUUM cannot repair real corruption and may fail on a malformed file)
docker exec headscale sqlite3 /var/lib/headscale/db.sqlite \
  "PRAGMA integrity_check; VACUUM;"

# Restore from backup if repair fails
tar -xzf backup-YYYYMMDD.tar.gz data/db.sqlite

For PostgreSQL:

# Inspect database statistics (checksum_failures is populated only when data checksums are enabled)
docker exec headscale-db psql -U headscale -c \
  "SELECT * FROM pg_stat_database WHERE datname = 'headscale';"

# Restore from backup
docker exec -i headscale-db psql -U headscale < backup.sql

Slow Database Queries

Diagnosis:

# Check slow queries (PostgreSQL)
docker exec headscale-db psql -U headscale -c \
  "SELECT query, mean_exec_time FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;"

# Check database size
docker exec headscale-db psql -U headscale -c \
  "SELECT pg_size_pretty(pg_database_size('headscale'));"

Solutions:

Vacuum database:

docker exec headscale-db psql -U headscale -c "VACUUM ANALYZE;"

Clean up old data:

# Expire offline nodes
docker exec headscale headscale nodes expire --all-offline

# List pre-auth keys to identify expired ones
docker exec headscale headscale preauthkeys list --user myuser

Performance Issues

High Resource Usage

High CPU Usage

Diagnosis:

# Monitor CPU
docker stats --no-stream

# Check for runaway processes
docker exec headscale top -bn1

Solutions:

Check for excessive logging:

config/config.yaml

log:
  level: info  # Change from debug

Reduce update frequency:

config/config.yaml

node_update_check_interval: 30s  # Increase from 10s

Restart service:

docker compose restart headscale

High Memory Usage

Diagnosis:

# Check memory usage
docker stats headscale --no-stream

# Check for memory leaks
watch -n 5 'docker stats headscale --no-stream'
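
Watching the numbers scroll by makes a slow leak hard to spot. Logging samples in a single unit makes the trend easier to compare. The sketch below assumes `docker stats` reports memory as values such as `45.3MiB` or `1.2GiB`. `to_mib` is a hypothetical helper, and other suffixes are rejected rather than guessed.

```bash
# to_mib VALUE: convert a docker-stats size such as "45.3MiB" or "1.2GiB"
# to MiB with one decimal place. Returns non-zero for other formats.
to_mib() {
  printf '%s\n' "$1" | awk '
    /GiB$/ { sub(/GiB$/, ""); printf "%.1f\n", $0 * 1024; next }
    /MiB$/ { sub(/MiB$/, ""); printf "%.1f\n", $0 + 0;    next }
    /KiB$/ { sub(/KiB$/, ""); printf "%.1f\n", $0 / 1024; next }
    { exit 1 }'
}

# Sample once a minute (assumes a container named "headscale"):
#   while sleep 60; do
#     used=$(docker stats --no-stream --format '{{.MemUsage}}' headscale | cut -d' ' -f1)
#     echo "$(date +%T) $(to_mib "$used")"
#   done
```

A figure that keeps climbing over hours under steady load is a stronger hint of a leak than any single reading.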

Solutions:

Set memory limits:

docker-compose.yml

headscale:
  deploy:
    resources:
      limits:
        memory: 512M

Reduce connection pool:

config/config.yaml

database:
  postgres:
    max_open_conns: 5
    max_idle_conns: 2

Restart if memory leak detected:

docker compose restart headscale

Disk Space Full

Symptoms:

no space left on device
database write failed

Diagnosis:

# Check disk usage
df -h
du -sh data/ logs/ backups/

# Check Docker disk usage
docker system df -v

Solutions:

Clean up Docker:

# Remove unused images
docker image prune -a

# Remove unused volumes (deletes data in any volume not attached to a container)
docker volume prune

# Full cleanup, including unused volumes; back up first
docker system prune -a --volumes

Clean up logs:

# Truncate nginx logs
: > logs/nginx/access.log
: > logs/nginx/error.log

# Configure log rotation
sudo nano /etc/docker/daemon.json
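
As a starting point for the rotation settings, the sketch below writes a `daemon.json` fragment to a scratch file. `log-driver`, `max-size`, and `max-file` are standard options of Docker's `json-file` logging driver. The values shown (10 MB, three files) and the helper name `write_log_rotation` are illustrative assumptions, not settings taken from this deployment.

```bash
# write_log_rotation FILE: write a daemon.json fragment that caps each
# container log at 10 MB and keeps three rotated files.
write_log_rotation() {
  cat > "$1" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
}

# Example: write to a scratch file, then merge it into the real config.
#   write_log_rotation /tmp/daemon.json
```

Merge the fragment into any existing /etc/docker/daemon.json rather than overwriting it. The settings take effect after the Docker daemon restarts, and only for containers created afterwards.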

Clean up old backups:

find backups/ -name "*.sql" -mtime +7 -delete
find backups/ -name "*.tar.gz" -mtime +7 -delete
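
Because `-delete` is irreversible, it can be worth listing the matches first. The sketch below performs the same selection as a dry run. `old_backups` is a hypothetical helper, and the seven-day threshold in the example mirrors the commands above.

```bash
# old_backups DIR DAYS: list *.sql and *.tar.gz files in DIR last modified
# more than DAYS days ago, without deleting anything.
old_backups() {
  find "$1" -type f \( -name '*.sql' -o -name '*.tar.gz' \) -mtime +"$2" -print
}

# Example:
#   old_backups backups/ 7
```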

Debug Mode

Enable detailed logging for troubleshooting:

config/config.yaml

log:
  format: text
  level: debug  # trace, debug, info, warn, error

# Restart with debug logging
docker compose restart headscale

# Watch debug logs
docker compose logs -f headscale

Debug mode generates large log files. Disable after troubleshooting by changing level back to info.

Getting Help

If issues persist:

Collect diagnostic information:

# System info
docker compose version
docker version
uname -a

# Service status
docker compose ps
docker compose logs --tail 100 > logs.txt

# Configuration (redact passwords)
grep -v pass config/config.yaml > config-sanitized.yaml
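
Filtering on "pass" alone leaves other secrets, such as API keys and cookie secrets, in the output. A slightly broader approach masks the values and keeps the keys, so the structure of the file stays readable. This is a sketch under assumptions: `redact_config` is a hypothetical helper, and the list of key fragments (`pass`, `secret`, `api_key`) is a guess at what a typical config contains, so review the result before sharing it.

```bash
# redact_config: copy stdin to stdout, replacing the value of any key whose
# name contains "pass", "secret", or "api_key" with REDACTED.
redact_config() {
  sed -E 's/^([[:space:]]*[A-Za-z0-9_]*(pass|secret|api_key)[A-Za-z0-9_]*[[:space:]]*:).*$/\1 REDACTED/'
}

# Example:
#   redact_config < config/config.yaml > config-sanitized.yaml
```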

Check documentation:
- Headscale Documentation
- Tailscale Knowledge Base

Community support:
- Headscale GitHub Discussions
- Tailscale Community Forum
- Discord communities

Related Resources

- Monitoring: Set up monitoring to catch issues early.
- Security: Security best practices and hardening.
- Backup & Restore: Restore from backup if needed.
- Updates: Keep services updated.
