Connection Lifecycle
1. Peer Registration
When a NetBird client starts, it goes through the following registration process:
- Authentication: Client authenticates with Management Service using SSO or setup key
- Interface Creation: Local WireGuard interface is created with assigned IP address
- System Info: Client sends system metadata (OS, version, hostname, etc.)
- Network Map: Management Service returns initial network map with:
  - List of peers to connect to
  - STUN/TURN server addresses
  - Access policies and routes
  - DNS configuration
2. Peer Discovery with ICE
NetBird uses the WebRTC ICE (Interactive Connectivity Establishment) protocol, implemented by pion/ice, to discover connection paths.
ICE Candidate Types
Each peer gathers multiple connection candidates:
- Host Candidates: direct IP addresses from local network interfaces
- Server Reflexive (srflx): the peer's public address and port as observed by a STUN server
- Relay Candidates: an address allocated on a TURN relay server
Host candidates in more detail:
- Most preferred when both peers are on the same LAN
- No relay or NAT traversal overhead
- Example: 192.168.1.100:51820
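The preference among candidate types can be made concrete with the candidate priority formula from RFC 8445. The sketch below uses the type preference values that the RFC recommends; the values that pion/ice or NetBird actually apply are not stated in this document, so the constants are assumptions for illustration.

```python
# Type preferences recommended by RFC 8445 (assumed here; an implementation may choose others).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, local_pref: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


for cand_type in ("host", "srflx", "relay"):
    print(cand_type, candidate_priority(cand_type))
```

With these defaults a host candidate scores 2130706431, a server reflexive candidate 1694498815, and a relay candidate 16777215, which is why host candidates are tried first.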
[Diagram: Candidate Discovery Process]
3. Signaling Through Signal Service
Once peers have their candidates, they exchange them through the Signal Service using an offer/answer pattern.
Key Points:
- All signaling messages are encrypted before being sent through Signal Service
- Signal Service only forwards messages; it cannot decrypt them
- Candidates are exchanged incrementally as they’re discovered (trickle ICE)
4. NAT Traversal and Connectivity Checks
After candidate exchange, both peers perform connectivity checks to find the best path:
ICE Connectivity Check Process
- Candidate Pairing: Each peer creates pairs of local and remote candidates
- Priority Calculation: Pairs are prioritized (host > srflx > relay)
- STUN Binding Requests: Peers send STUN binding requests from the local candidate to the remote candidate of each pair
- Connectivity Established: First successful pair becomes the selected connection
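The pairing and prioritization steps above can be sketched with the pair priority formula from RFC 8445. This is a simplified illustration, not NetBird's or pion/ice's code: it pairs every local candidate with every remote one, assumes the local side is the controlling agent, and uses illustrative candidate priorities. A real ICE agent also filters pairs by address family and component and prunes redundant pairs.

```python
from itertools import product


def pair_priority(controlling: int, controlled: int) -> int:
    """RFC 8445 pair priority: 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling, controlled
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local, remote):
    """Pair every local candidate with every remote one, highest priority first.

    `local` and `remote` are lists of (label, priority); the local side is
    assumed to be the controlling agent.
    """
    pairs = [
        ((loc, rem), pair_priority(loc_prio, rem_prio))
        for (loc, loc_prio), (rem, rem_prio) in product(local, remote)
    ]
    return [pair for pair, _ in sorted(pairs, key=lambda item: item[1], reverse=True)]


# Illustrative candidate priorities (host > srflx > relay).
candidates = [("host", 2130706431), ("srflx", 1694498815), ("relay", 16777215)]
check_order = ordered_pairs(candidates, candidates)
print(check_order[0], check_order[-1])
```

Connectivity checks then run in this order, so a host-to-host pair is tried before any pair that involves a relay.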
NAT Traversal Scenarios
Both peers behind different NATs (most common)
Solution: STUN + UDP hole punching
- Both peers discover their public endpoints via STUN
- Exchange srflx candidates through Signal Service
- Simultaneously send packets to each other’s public endpoint
- Each NAT’s outbound mapping then lets the other peer’s packets through, allowing bidirectional traffic
One or both peers behind symmetric NAT
Solution: TURN relay
Symmetric NATs assign different public ports for different destinations, breaking hole punching.
- Peers connect through TURN relay server
- Relay forwards packets between peers
- Still maintains end-to-end WireGuard encryption
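To see why a symmetric NAT defeats hole punching, consider a toy model of NAT port mapping. The `Nat` class and `hole_punch_possible` helper are hypothetical names for illustration only. The model covers port allocation alone and ignores inbound filtering rules, and the addresses come from documentation ranges.

```python
class Nat:
    """Toy NAT that only models how public ports are assigned to outbound flows."""

    def __init__(self, public_ip, symmetric, first_port=40000):
        self.public_ip = public_ip
        self.symmetric = symmetric
        self._next_port = first_port
        self._mappings = {}

    def map_outbound(self, src, dst):
        """Return the public (ip, port) the NAT uses for a packet from src to dst."""
        # A symmetric NAT keys the mapping on the destination as well as the source.
        key = (src, dst) if self.symmetric else src
        if key not in self._mappings:
            self._mappings[key] = self._next_port
            self._next_port += 1
        return (self.public_ip, self._mappings[key])


def hole_punch_possible(nat, src, stun_server, peer):
    """The srflx candidate is usable only if the peer sees the same mapping as the STUN server."""
    advertised = nat.map_outbound(src, stun_server)  # what STUN reports back
    actual = nat.map_outbound(src, peer)             # what the remote peer would see
    return advertised == actual


src = ("192.168.1.100", 51820)
stun_server = ("198.51.100.1", 3478)
peer = ("203.0.113.77", 51820)

print(hole_punch_possible(Nat("203.0.113.10", symmetric=False), src, stun_server, peer))
print(hole_punch_possible(Nat("203.0.113.10", symmetric=True), src, stun_server, peer))
```

In this model the non-symmetric NAT reuses one public port for both destinations, so the address learned via STUN is valid for the peer. The symmetric NAT allocates a new port for the peer, so the advertised srflx candidate points at the wrong port and a relay is needed.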
Same local network
Solution: Direct host candidate connection
- Peers use local IP addresses directly
- Zero-hop connection, lowest latency
- Bypasses NAT entirely
Carrier-grade NAT (CGNAT)
Solution: TURN relay required
Mobile networks and some ISPs use CGNAT (multiple layers of NAT).
- Direct traversal usually fails
- Relay connection provides fallback
- NetBird automatically detects and falls back to relay
5. WireGuard Tunnel Establishment
Once ICE establishes a network path, NetBird creates the WireGuard tunnel:
- Endpoint Configuration: Selected ICE candidate pair becomes WireGuard endpoint
- Allowed IPs: Configure which traffic should route through this peer
- WireGuard Handshake: Standard WireGuard key exchange happens over the ICE connection
- Data Flow: Encrypted packets flow between peers over the selected path (direct or relayed)
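For readers familiar with plain WireGuard, the endpoint and allowed-IPs steps correspond roughly to a `wg set` invocation. The sketch below only builds the argument list for the standard `wg` tool. It is an analogy, not how NetBird configures the interface (NetBird does this programmatically), and the interface name, key placeholder, and addresses are assumptions for illustration.

```python
def wg_set_peer_args(interface, peer_public_key, endpoint, allowed_ips, keepalive_s=25):
    """Build the argument list for `wg set <interface> peer <key> ...` (does not run it)."""
    return [
        "wg", "set", interface,
        "peer", peer_public_key,
        "endpoint", endpoint,                  # remote side of the selected ICE pair
        "allowed-ips", ",".join(allowed_ips),  # traffic routed through this peer
        "persistent-keepalive", str(keepalive_s),
    ]


args = wg_set_peer_args(
    interface="wt0",                        # hypothetical interface name
    peer_public_key="<peer-public-key>",    # placeholder, not a real key
    endpoint="192.168.1.100:51820",
    allowed_ips=["100.64.0.2/32"],          # illustrative overlay address
)
print(" ".join(args))
```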
6. Connection Monitoring and Failover
NetBird continuously monitors connection health and can switch between connection types:
- Health Checks: Regular WireGuard handshake monitoring
- Reconnection: Automatic reconnection on failure
- Path Selection: Switches from relay to direct when available
- Network Changes: Detects network changes and re-establishes connections
Parallel Connection Attempts
NetBird attempts both direct (ICE) and relay connections simultaneously:
- Direct P2P (ICE): Always preferred for performance
- Relay: Used as immediate fallback if ICE takes too long
- Upgrade: Can upgrade from relay to direct if P2P succeeds later
The timeout for initial connection attempts is randomized between 30 and 45 seconds, so that peers in a large network do not all retry at the same moment (the thundering herd problem).
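A randomized timeout of this kind can be sketched in a few lines. The uniform distribution and the function name are assumptions; the document only states the 30 to 45 second range.

```python
import random


def initial_connection_timeout(rng: random.Random, low: float = 30.0, high: float = 45.0) -> float:
    """Pick a timeout in seconds, spread uniformly so that peers do not expire in lockstep."""
    return rng.uniform(low, high)


rng = random.Random()
print(f"initial connection timeout: {initial_connection_timeout(rng):.1f}s")
```

Because every peer draws its own value, timeouts across a large network are spread over a 15 second window instead of firing together.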
Network Updates and Reconnections
When network conditions change, NetBird handles reconnection automatically:
Network Change Detection
- WiFi to cellular transition
- VPN connection/disconnection
- IP address changes
- Network interface up/down events
Management Service Synchronization
The Management Service pushes real-time updates through a streaming gRPC connection:
- New peers added to network
- Peers removed from network
- Access policy changes
- Route updates
- DNS configuration changes
- STUN/TURN server changes
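One way a client could react to such an update is to diff the new peer list against the one it already holds. The sketch below is a hypothetical illustration: the `diff_peers` name and the representation of a peer as a mapping from public key to allowed IPs are assumptions, not NetBird's actual data model.

```python
def diff_peers(current, updated):
    """Compare two peer maps (public key -> list of allowed IPs).

    Returns three sorted lists of keys: peers to add, peers to remove,
    and peers whose configuration changed.
    """
    added = sorted(updated.keys() - current.keys())
    removed = sorted(current.keys() - updated.keys())
    changed = sorted(
        key for key in current.keys() & updated.keys() if current[key] != updated[key]
    )
    return added, removed, changed


current = {"peerA": ["100.64.0.1/32"], "peerB": ["100.64.0.2/32"]}
updated = {"peerB": ["100.64.0.2/32", "10.0.0.0/24"], "peerC": ["100.64.0.3/32"]}
print(diff_peers(current, updated))
```

Only the peers that appear in one of the three lists need their connections set up, torn down, or reconfigured; unchanged peers are left alone.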
Performance Optimizations
eBPF-Based Proxy
On Linux, NetBird can use eBPF for efficient packet handling:
- Reduces context switches between kernel and userspace
- Transparent connection tracking
- Minimal CPU overhead for high-throughput scenarios
UDP Mux
Multiple peer connections share a single UDP port:
- Reduces number of required firewall ports
- Simplifies NAT traversal
- Enables better connection tracking
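The idea of sharing one port can be illustrated with a toy demultiplexer that routes each incoming datagram to a peer connection based on the sender's address. The `UdpMux` class here is a hypothetical sketch, not the pion/ice implementation, which may use additional information from the packets to identify the connection.

```python
class UdpMux:
    """Toy demultiplexer: one shared UDP port, many peer connections."""

    def __init__(self):
        self._peer_by_remote = {}  # (ip, port) of the sender -> peer id

    def register(self, remote_addr, peer_id):
        """Associate a remote address with a peer connection."""
        self._peer_by_remote[remote_addr] = peer_id

    def dispatch(self, remote_addr, payload):
        """Return (peer_id, payload) for a known sender, or None to drop the packet."""
        peer_id = self._peer_by_remote.get(remote_addr)
        return None if peer_id is None else (peer_id, payload)


mux = UdpMux()
mux.register(("203.0.113.77", 51820), "peerA")
mux.register(("198.51.100.20", 40001), "peerB")
print(mux.dispatch(("198.51.100.20", 40001), b"hello"))
```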
Connection Type Optimization
NetBird continuously evaluates connection quality:
- Prefers direct connections over relayed
- Monitors latency and packet loss
- Automatically switches to better path when available
Troubleshooting Connection Issues
Peers stuck on relay connection
Causes:
- Strict firewall blocking UDP
- Symmetric NAT on both sides
- STUN server unreachable
Solutions:
- Check that the firewall allows outbound UDP
- Verify STUN servers are accessible
- Use netbird status to check connection type
Connection timeout
Causes:
- Firewall blocking Signal/Management service
- No available TURN servers
- Network policy blocking VPN traffic
Solutions:
- Verify connectivity to Management Service
- Check TURN server configuration
- Review firewall logs for blocked connections
Frequent reconnections
Causes:
- Unstable network
- NAT session timeout too short
- Keepalive interval too long
Solutions:
- Adjust WireGuard PersistentKeepalive setting
- Check for network interface flapping
- Review network quality metrics
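The relationship between the keepalive interval and the NAT session timeout can be checked with a small helper. This is a sketch under assumptions: `mapping_survives` is a hypothetical name, and the NAT timeout has to be measured or looked up for the device in question. In WireGuard, a PersistentKeepalive of 0 means keepalives are disabled.

```python
def mapping_survives(keepalive_s: int, nat_timeout_s: int) -> bool:
    """True if keepalives are sent often enough to refresh the NAT mapping on an idle tunnel."""
    if keepalive_s <= 0:
        # PersistentKeepalive disabled: nothing refreshes the mapping while idle.
        return False
    return keepalive_s < nat_timeout_s


for keepalive in (0, 25, 60):
    print(keepalive, mapping_survives(keepalive, nat_timeout_s=30))
```

With a 30 second NAT timeout, a 25 second keepalive keeps the mapping alive, while a 60 second interval or a disabled keepalive lets it expire and forces a reconnection.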
Next Steps
- Architecture Overview: High-level architecture and design principles
- Components: Detailed component responsibilities and interactions