Overview
The networking implementation spans several crates:
- P2P Layer (rs/p2p/): Intra-subnet node communication
- HTTP Endpoints (rs/http_endpoints/): External client API
- XNet (distributed): Cross-subnet messaging
- Transport (various): QUIC, TCP, memory transports
Network Architecture
P2P Networking
From rs/p2p/README.adoc:1, the P2P layer is responsible for message delivery within subnets.
P2P Overview
Key characteristics:
- Each subnet operates a separate P2P network
- Nodes within a subnet send messages to each other
- Multiple components use P2P (consensus, state sync, etc.)
- Built on modern QUIC protocol
Location:
rs/p2p/
P2P Components
Artifact Manager
The Artifact Manager coordinates P2P distribution:
- Manage artifact pools (consensus, DKG, ECDSA)
- Coordinate downloads from peers
- Handle uploads to peers
- Apply prioritization and filtering
- Provide backpressure
JoinGuard handles.
Consensus Manager
Consensus-specific P2P logic:
- Advertises consensus artifacts (blocks, notarizations, etc.)
- Requests missing artifacts
- Validates received artifacts
- Provides priority-based download
Location:
rs/p2p/consensus_manager/
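The priority-based download listed above can be pictured as a max-heap of pending requests. This is a minimal sketch, not the consensus manager's actual code: the `Priority` levels, the `Request` fields, and the tie-break on block height are assumptions made for illustration.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical priority classes; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    Normal,
    High,
}

/// A pending request for an artifact that a peer has advertised.
#[derive(Debug, PartialEq, Eq)]
struct Request {
    priority: Priority,
    height: u64,
    artifact_id: String,
}

impl Ord for Request {
    fn cmp(&self, other: &Self) -> Ordering {
        // Higher priority wins; within one priority, the lower height wins.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.height.cmp(&self.height))
            .then_with(|| other.artifact_id.cmp(&self.artifact_id))
    }
}

impl PartialOrd for Request {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Drains the queue and returns artifact ids in the order they would be fetched.
fn download_order(requests: Vec<Request>) -> Vec<String> {
    let mut heap: BinaryHeap<Request> = requests.into_iter().collect();
    let mut order = Vec::new();
    while let Some(next) = heap.pop() {
        order.push(next.artifact_id);
    }
    order
}
```

Because `BinaryHeap` is a max-heap, the ordering is defined so that the request that should be fetched next compares as the greatest element.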
State Sync Manager
Handles state synchronization between nodes.
Location:
rs/p2p/state_sync_manager/
Peer Manager
Manages peer connections and discovery:
- Maintains connections to subnet peers
- Handles peer addition/removal
- Monitors connection health
- Applies connection limits
- Manages peer reputation
Location:
rs/p2p/peer_manager/
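Peer addition and removal amounts to reconciling the set of currently connected peers with the set of peers that should be in the subnet. A minimal sketch, assuming peers are identified by plain strings and that the desired set has already been read from the registry; `PeerDiff` and `reconcile` are hypothetical names:

```rust
use std::collections::BTreeSet;

/// Difference between current connections and desired subnet membership.
#[derive(Debug, PartialEq, Eq)]
struct PeerDiff {
    to_connect: Vec<String>,
    to_disconnect: Vec<String>,
}

/// Computes which peers to dial and which connections to drop.
fn reconcile(connected: &BTreeSet<String>, desired: &BTreeSet<String>) -> PeerDiff {
    PeerDiff {
        // In the desired set but not yet connected.
        to_connect: desired.difference(connected).cloned().collect(),
        // Connected but no longer part of the desired set.
        to_disconnect: connected.difference(desired).cloned().collect(),
    }
}
```

With connections to `n1` and `n2` and a desired set of `n2` and `n3`, the result asks to connect to `n3` and disconnect from `n1`.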
QUIC Transport
QUIC (Quick UDP Internet Connections) provides the transport layer.
QUIC Benefits:
- Low Latency: 0-RTT connection establishment
- Multiplexing: Multiple streams per connection
- Security: Built-in TLS 1.3 encryption
- Reliability: Packet loss recovery
Location:
rs/p2p/quic_transport/
Artifact Downloader
Downloads artifacts from peers:
- Parallel downloads from multiple peers
- Chunk-based transfer for large artifacts
- Priority-based scheduling
- Bandwidth management
- Retry logic with backoff
Location:
rs/p2p/artifact_downloader/
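Retry logic with backoff is commonly implemented as a capped exponential delay. The sketch below shows that general technique only; the doubling factor, the cap, and the names `backoff_delay` and `retry_with_backoff` are assumptions, not values taken from the downloader.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt numbers.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `op` until it succeeds or `max_attempts` calls have failed,
/// sleeping for the backoff delay between attempts. `op` runs at least once.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

A production downloader would typically add random jitter to the delay and switch to a different peer between attempts; both are left out here.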
XNet Communication
XNet (Cross-Network) enables communication between subnets.
XNet Architecture
XNet Streams
Each subnet pair maintains bidirectional streams:
- Stream: Ordered sequence of messages
- Stream Index: Position in stream
- Stream Header: Metadata (indices, signals)
- Stream Slice: Subset of stream messages
- Certification: Cryptographic proof of stream validity
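The relationship between a stream, its indices, and a slice can be illustrated with a simplified model. This is a sketch under stated assumptions: messages are plain strings, the header is reduced to a single `begin` index, and certification is omitted entirely. The `Stream` type and its methods are hypothetical and do not mirror the types in rs/state_manager/.

```rust
/// Simplified stand-in for an XNet stream: an ordered sequence of messages,
/// where `begin` is the stream index of the first message still held.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Stream {
    begin: u64,
    messages: Vec<String>,
}

impl Stream {
    /// Stream index one past the last message.
    fn end(&self) -> u64 {
        self.begin + self.messages.len() as u64
    }

    /// Returns up to `max_len` messages starting at stream index `from`,
    /// or `None` if `from` lies outside the stream.
    fn slice(&self, from: u64, max_len: usize) -> Option<Stream> {
        if from < self.begin || from > self.end() {
            return None;
        }
        let start = (from - self.begin) as usize;
        let stop = start.saturating_add(max_len).min(self.messages.len());
        Some(Stream {
            begin: from,
            messages: self.messages[start..stop].to_vec(),
        })
    }

    /// Drops all messages below `new_begin`, for example once the
    /// receiving side has signalled that it processed them.
    fn discard_before(&mut self, new_begin: u64) {
        let new_begin = new_begin.clamp(self.begin, self.end());
        self.messages.drain(..(new_begin - self.begin) as usize);
        self.begin = new_begin;
    }
}
```

Keeping `begin` alongside the messages lets both sides refer to a message by its absolute position even after older messages have been discarded.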
rs/state_manager/src/lib.rs:68:
Stream Encoding
Streams are efficiently encoded for transfer:
- Compact binary format (Protobuf)
- Incremental updates
- Hash-based verification
- Garbage collection of old messages
Location:
rs/state_manager/src/stream_encoding.rs
Certified Stream Store
Manages certified XNet streams:
- encode_certified_stream_slice: Create certified slice
- decode_certified_stream_slice: Parse received slice
- decode_stream_slice: Validate and extract messages
XNet Payload Builder
Constructs XNet payloads for consensus blocks:
- Select messages from XNet streams
- Create batch payloads
- Apply size and count limits
- Prioritize critical messages
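The selection steps above can be sketched as a single pass over candidate messages. This is an illustration only: the `Message` type, the `critical` flag, and the rule of stopping at the first message that does not fit are assumptions. The sketch also reorders candidates freely, whereas a real payload builder has to preserve the order of messages within each stream.

```rust
/// Hypothetical candidate message with a flag marking it as critical.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Message {
    critical: bool,
    bytes: Vec<u8>,
}

/// Picks messages for one payload: critical messages first, original order
/// otherwise, stopping as soon as the count or byte limit would be exceeded.
fn build_payload(mut candidates: Vec<Message>, max_count: usize, max_bytes: usize) -> Vec<Message> {
    // Stable sort: `false` sorts before `true`, so critical messages come
    // first and the relative order inside each group is preserved.
    candidates.sort_by_key(|m| !m.critical);

    let mut payload = Vec::new();
    let mut used = 0usize;
    for message in candidates {
        if payload.len() == max_count {
            break;
        }
        if used + message.bytes.len() > max_bytes {
            break;
        }
        used += message.bytes.len();
        payload.push(message);
    }
    payload
}
```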
XNet HTTP Endpoint
HTTP endpoint for cross-subnet communication:
- Serve certified stream slices
- Accept stream slice deliveries
- Provide stream metadata
- Handle authentication
Location:
rs/http_endpoints/xnet/
HTTP Endpoints
HTTP endpoints expose the IC API to external clients.
Endpoint Types
From rs/http_endpoints/README.adoc:98:
- Public API
- Metrics
- Status
- XNet
The Public API implements the IC Interface Specification:
- /api/v2/canister/<canister_id>/call: Submit update calls
- /api/v2/canister/<canister_id>/query: Execute queries
- /api/v2/canister/<canister_id>/read_state: Read certified state
- /api/v3/canister/<canister_id>/call: V3 API with enhancements
Location:
rs/http_endpoints/public/
Connection Management
From rs/http_endpoints/README.adoc:10, connection management uses:
Nftables Firewall
ReplicaOS uses nftables to:
- Restrict inbound traffic to registry nodes
- Limit simultaneous connections per IP
- Rate limit connection establishment
- Protect against protocol attacks
- Prevent resource exhaustion
Idle Connection Detection
From rs/http_endpoints/README.adoc:25:
- connection_read_timeout_seconds: Drop idle connections
- TCP keepalive with ReplicaOS defaults
- Prevents dead connections
- Guards against connection holding attacks
Queue Management
From rs/http_endpoints/README.adoc:36, endpoints use a thread-per-request pattern.
Features:
- Bounded-size request queue
- Threadpool for blocking operations
- Tokio oneshot channels for results
- Request cancellation support
- Non-blocking async runtime
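The features above fit together roughly as follows. This is a minimal sketch using only the standard library: `std::sync::mpsc::sync_channel` plays the bounded request queue, plain threads play the threadpool, and a single-use `mpsc::channel` stands in for a Tokio oneshot channel. `Job`, `start_pool`, and `submit` are hypothetical names, and the doubling of the input is a placeholder for a blocking operation.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender, TrySendError};
use std::sync::{Arc, Mutex};
use std::thread;

/// A unit of blocking work plus the channel used to return its result once.
struct Job {
    input: u64,
    reply: mpsc::Sender<u64>,
}

/// Starts `workers` threads that drain a queue holding at most `capacity` jobs.
fn start_pool(workers: usize, capacity: usize) -> SyncSender<Job> {
    let (queue_tx, queue_rx) = mpsc::sync_channel::<Job>(capacity);
    let queue_rx = Arc::new(Mutex::new(queue_rx));
    for _ in 0..workers {
        let queue_rx = Arc::clone(&queue_rx);
        thread::spawn(move || loop {
            // The lock is held only while taking the next job off the queue.
            let job = match queue_rx.lock().unwrap().recv() {
                Ok(job) => job,
                Err(_) => break, // every sender is gone: shut the worker down
            };
            // Placeholder for a blocking operation.
            let result = job.input * 2;
            // If the caller dropped its receiver (request cancelled),
            // the send fails and the result is discarded.
            let _ = job.reply.send(result);
        });
    }
    queue_tx
}

/// Enqueues a request without blocking and returns the receiver for its result.
fn submit(queue: &SyncSender<Job>, input: u64) -> Result<Receiver<u64>, TrySendError<Job>> {
    let (reply, result) = mpsc::channel();
    queue.try_send(Job { input, reply })?;
    Ok(result)
}
```

The caller never blocks on the queue itself: `submit` either hands the job over immediately or returns an error, and the result is awaited separately on the returned receiver.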
Load Shedding
From rs/http_endpoints/README.adoc:46, when overloaded:
Benefits:
- Fail early and cheaply
- Prevent cascading failures
- Maintain throughput under load
- Compatible with load balancers
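Failing early and cheaply usually means rejecting a request at the point where it would otherwise have to wait. A minimal sketch of that idea, assuming a bounded queue of string requests; the `Admission` enum and the `admit` function are hypothetical, and the response actually sent to the client is not modelled.

```rust
use std::sync::mpsc::{self, TrySendError};

/// Outcome of trying to hand a request to the processing queue.
#[derive(Debug, PartialEq, Eq)]
enum Admission {
    Accepted,
    Shed,
}

/// Tries to enqueue without waiting. A full (or closed) queue means the
/// request is rejected right away instead of piling up behind others.
fn admit(queue: &mpsc::SyncSender<String>, request: String) -> Admission {
    match queue.try_send(request) {
        Ok(()) => Admission::Accepted,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => Admission::Shed,
    }
}
```

With a queue of capacity 2, the third request in a row is shed; once a consumer takes one request off the queue, the next one is accepted again.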
Request Timeout
From rs/http_endpoints/README.adoc:58:
Timeout prevents:
- Connection drops during long operations
- Blocking clients indefinitely
- Resource leaks
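A request timeout can be sketched as waiting on a result channel for a bounded time. The example below is an assumption-laden illustration using the standard library rather than the endpoint's async runtime; `with_timeout` and `RequestError` are hypothetical names. Note that the worker thread is not cancelled when the timeout fires; it keeps running and its result is dropped.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Why a request produced no result.
#[derive(Debug, PartialEq, Eq)]
enum RequestError {
    /// No result arrived within the allowed time.
    TimedOut,
    /// The worker ended (for example, panicked) without sending a result.
    Failed,
}

/// Runs `op` on a separate thread and waits at most `timeout` for its result.
fn with_timeout<T: Send + 'static>(
    timeout: Duration,
    op: impl FnOnce() -> T + Send + 'static,
) -> Result<T, RequestError> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the caller already gave up, the send fails and the result is dropped.
        let _ = tx.send(op());
    });
    rx.recv_timeout(timeout).map_err(|err| match err {
        RecvTimeoutError::Timeout => RequestError::TimedOut,
        RecvTimeoutError::Disconnected => RequestError::Failed,
    })
}
```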
Request Validation
From rs/http_endpoints/README.adoc:92:
Request body exceeds configured limit
Request did not complete within timeout
Fairness
From rs/http_endpoints/README.adoc:81, fairness is achieved through:
- Bounded request queues per endpoint
- Fair thread/task scheduler
- Equal treatment at capacity
- No preferential processing
Boundary Nodes
Boundary nodes provide the entry point for users.
Boundary Node Functions
- TLS Termination: Handle HTTPS connections from users
- Load Balancing: Distribute requests across replicas
- Caching: Cache responses for performance
- DDoS Protection: Rate limiting and filtering
Location:
rs/boundary_node/
Network Security
Transport Security
- TLS 1.3: All external connections encrypted
- QUIC: Built-in encryption for P2P
- Mutual Authentication: Nodes authenticate each other
- Certificate Management: Automatic cert rotation
Access Control
- Registry-based: Only registered nodes can connect
- Subnet Isolation: Subnets are separate networks
- Firewall Rules: Nftables restrict access
- Rate Limiting: Prevent abuse
Attack Mitigation
- DDoS Protection: Rate limits and connection limits
- Resource Exhaustion: Bounded queues and timeouts
- Slowloris: Connection timeouts
- Amplification: Request size limits
Performance Optimizations
Parallel Downloads
State sync and artifacts download in parallel:
- Multiple peers simultaneously
- Multiple chunks per artifact
- Adaptive concurrency
- Bandwidth sharing
Efficient State Sync
From rs/state_manager/src/state_sync.rs:
- Merkle tree-based incremental sync
- Only download changed chunks
- Parallel chunk validation
- Resume interrupted syncs
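Downloading only changed chunks comes down to comparing per-chunk digests of the local state against those of the target state. The sketch below uses a flat list of chunk hashes rather than a Merkle tree, and `DefaultHasher` as a stand-in digest; it is not cryptographic and is used here only to keep the example free of dependencies. The function names and the fixed chunk size are assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in chunk digest. `DefaultHasher` is NOT a cryptographic hash;
/// a real implementation would use one.
fn chunk_hash(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Digests of consecutive `chunk_size`-byte chunks of `data`.
/// `chunk_size` must be non-zero.
fn manifest(data: &[u8], chunk_size: usize) -> Vec<u64> {
    data.chunks(chunk_size).map(chunk_hash).collect()
}

/// Indices of chunks in `target` whose digest differs from, or is missing in,
/// the local manifest. Only these chunks need to be downloaded.
fn chunks_to_fetch(local: &[u64], target: &[u64]) -> Vec<usize> {
    target
        .iter()
        .enumerate()
        .filter(|(i, hash)| local.get(*i) != Some(*hash))
        .map(|(i, _)| i)
        .collect()
}
```

Resuming an interrupted sync follows from the same comparison: chunks that were already fetched match the target manifest and are skipped on the next pass.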
Connection Pooling
Reuse connections across requests:
- HTTP/2 multiplexing
- QUIC streams
- Connection warmup
- DNS caching
Batching
Group messages for efficiency:
- Consensus batches
- XNet stream slices
- Artifact advertisements
- State sync chunks
Monitoring and Metrics
Network Metrics
Key metrics exposed:
Health Checks
Endpoints provide health status:
- Connection count
- Queue depth
- Error rates
- Latency percentiles
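A latency percentile can be derived from raw samples with the nearest-rank method. This is a generic sketch of that calculation, not the way the replica's metrics are computed; the `percentile` function and the use of integer latency samples are assumptions.

```rust
/// Nearest-rank percentile of `samples` for `p` in 0.0..=100.0.
/// Returns `None` for an empty sample set or an out-of-range `p`.
fn percentile(samples: &[u64], p: f64) -> Option<u64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // Nearest rank: the smallest value with at least p% of samples at or below it.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

Sorting every time is fine for a handful of samples; long-running services usually keep histograms with fixed buckets instead of raw samples.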
Configuration
P2P Configuration
Key settings:
- Peer list (from registry)
- Port numbers
- Connection limits
- Bandwidth limits
- Timeouts
HTTP Configuration
Key settings:
- Listen addresses and ports
- TLS certificates
- Request size limits
- Timeout values
- Queue sizes
Transport Configuration
QUIC settings:
- Congestion control algorithm
- Flow control windows
- Keep-alive intervals
- Maximum stream count
Best Practices
Implement load shedding early in the request pipeline for graceful degradation (rs/http_endpoints/README.adoc:46)
Further Reading
- Replica: Understand replica structure
- Consensus: Learn about artifact distribution
- Execution: Understand message routing
- Overview: Return to architecture overview