Overview
ML Defender implements a production-grade cryptographic pipeline for secure data transmission between components. The system combines ChaCha20-Poly1305 AEAD encryption with LZ4 compression and distributed key management via etcd.
Stress testing recorded zero cryptographic errors across 36,000 events, which supports the pipeline's production readiness.
Architecture
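The crypto pipeline spans two components. The producer serializes, compresses, and then encrypts each event; the consumer reverses those steps in the opposite order: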
┌─────────────────────────────────────────────────────────┐
│ ml-detector (Producer) │
│ ↓ │
│ 1. Protobuf serialization (NetworkSecurityEvent) │
│ ↓ │
│ 2. LZ4 compression (with 4-byte size header) │
│ ↓ │
│ 3. ChaCha20-Poly1305 encryption (32-byte key via etcd) │
│ ↓ │
│ 4. ZMQ PUSH (tcp://localhost:5572) │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ firewall-acl-agent (Consumer) │
│ ↓ │
│ 1. ZMQ PULL (receive encrypted+compressed data) │
│ ↓ │
│ 2. ChaCha20-Poly1305 decryption (same 32-byte key) │
│ ↓ │
│ 3. LZ4 decompression (read 4-byte size header) │
│ ↓ │
│ 4. Protobuf parsing (PacketEvent) │
│ ↓ │
│ 5. IPSet/IPTables enforcement (kernel-level blocking) │
└─────────────────────────────────────────────────────────┘
ChaCha20-Poly1305 AEAD Encryption
Algorithm Selection
ML Defender uses ChaCha20-Poly1305 (IETF RFC 8439) for authenticated encryption:
Why ChaCha20-Poly1305?
- ✅ AEAD (Authenticated Encryption with Associated Data) - Integrity + confidentiality in one operation
- ✅ High performance - Faster than AES on CPUs without hardware acceleration
- ✅ Constant-time - Resistant to timing attacks
- ✅ Well-studied - Designed by Daniel J. Bernstein, widely deployed (TLS 1.3, WireGuard)
- ✅ Negligible nonce-reuse risk - Provided every message uses a fresh random nonce from a CSPRNG and the nonce is wide enough (24 bytes in this implementation)
Implementation Details
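The implementation calls libsodium's crypto_secretbox_easy. Note that this API implements XSalsa20-Poly1305 (24-byte nonce, no associated-data input), not the RFC 8439 ChaCha20-Poly1305 construction named above; libsodium exposes the latter as crypto_aead_chacha20poly1305_ietf_encrypt. Both constructions provide authenticated encryption with a 32-byte key and a 16-byte Poly1305 tag: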
// crypto-transport/src/crypto.cpp
std::vector<uint8_t> encrypt(const std::vector<uint8_t>& data,
const std::vector<uint8_t>& key) {
// Validate key size (32 bytes)
if (key.size() != crypto_secretbox_KEYBYTES) {
throw std::runtime_error("Invalid key size");
}
// Generate random nonce (24 bytes for XSalsa20)
std::vector<uint8_t> nonce(crypto_secretbox_NONCEBYTES);
randombytes_buf(nonce.data(), nonce.size());
// Allocate buffer for ciphertext (plaintext + 16-byte MAC)
std::vector<uint8_t> ciphertext(data.size() + crypto_secretbox_MACBYTES);
// Encrypt and authenticate (crypto_secretbox_easy is XSalsa20-Poly1305)
int ret = crypto_secretbox_easy(
ciphertext.data(),
data.data(),
data.size(),
nonce.data(),
key.data()
);
if (ret != 0) {
throw std::runtime_error("ChaCha20 encryption failed");
}
// Return: nonce (24) + ciphertext (N + 16)
std::vector<uint8_t> result;
result.reserve(nonce.size() + ciphertext.size());
result.insert(result.end(), nonce.begin(), nonce.end());
result.insert(result.end(), ciphertext.begin(), ciphertext.end());
return result;
}
Output format:
[24-byte nonce][16-byte Poly1305 MAC][N-byte ciphertext]
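As a rough illustration of this layout, the sketch below splits a received buffer into its nonce and the authenticated box that follows it, without calling libsodium. The names `Envelope` and `split_envelope` are hypothetical and not part of crypto-transport; the 24- and 16-byte constants are assumed to mirror crypto_secretbox_NONCEBYTES and crypto_secretbox_MACBYTES.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed to mirror crypto_secretbox_NONCEBYTES and crypto_secretbox_MACBYTES.
constexpr std::size_t kNonceBytes = 24;
constexpr std::size_t kMacBytes = 16;

// Hypothetical view over an encrypted message; pointers alias the input buffer.
struct Envelope {
    const std::uint8_t* nonce = nullptr;  // first 24 bytes
    const std::uint8_t* box = nullptr;    // authenticated box (ciphertext and MAC together)
    std::size_t box_len = 0;              // bytes handed to crypto_secretbox_open_easy
    std::size_t plaintext_len = 0;        // box_len minus the 16-byte MAC
};

// Fills `out` and returns true, or returns false when the buffer is too
// short to contain both a nonce and a MAC.
bool split_envelope(const std::vector<std::uint8_t>& buf, Envelope& out) {
    if (buf.size() < kNonceBytes + kMacBytes) {
        return false;
    }
    out.nonce = buf.data();
    out.box = buf.data() + kNonceBytes;
    out.box_len = buf.size() - kNonceBytes;
    assert(out.box_len >= kMacBytes);  // guaranteed by the length check above
    out.plaintext_len = out.box_len - kMacBytes;
    return true;
}
```

A 40-byte buffer is therefore the smallest valid message and corresponds to an empty plaintext.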
Decryption Process
std::vector<uint8_t> decrypt(const std::vector<uint8_t>& encrypted_data,
const std::vector<uint8_t>& key) {
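// Validate key size (32 bytes), mirroring encrypt()
if (key.size() != crypto_secretbox_KEYBYTES) {
throw std::runtime_error("Invalid key size");
}
// Minimum size: nonce (24) + MAC (16) = 40 bytes
if (encrypted_data.size() < crypto_secretbox_NONCEBYTES + crypto_secretbox_MACBYTES) {
throw std::runtime_error("Encrypted data too short (corrupted?)");
}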
// Extract nonce (first 24 bytes)
const uint8_t* nonce = encrypted_data.data();
// Extract ciphertext (remaining bytes)
const uint8_t* ciphertext = encrypted_data.data() + crypto_secretbox_NONCEBYTES;
size_t ciphertext_len = encrypted_data.size() - crypto_secretbox_NONCEBYTES;
// Allocate buffer for plaintext
std::vector<uint8_t> plaintext(ciphertext_len - crypto_secretbox_MACBYTES);
// Decrypt and verify MAC
int ret = crypto_secretbox_open_easy(
plaintext.data(),
ciphertext,
ciphertext_len,
nonce,
key.data()
);
if (ret != 0) {
throw std::runtime_error("Decryption failed (wrong key or corrupted)");
}
return plaintext;
}
Security guarantees:
- ✅ Authentication - MAC verification prevents tampering
- ✅ Confidentiality - ChaCha20 stream cipher protects data
- ✅ Non-malleability - Any modification causes decryption failure
- ✅ Random nonces - No nonce reuse (cryptographically secure RNG)
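To put the random-nonce guarantee in numbers: with a b-bit nonce drawn uniformly at random, the birthday bound puts the probability of any collision among n messages under one key at roughly n^2 / 2^(b+1). The sketch below evaluates that estimate in log2 form; the function name is illustrative and not part of the codebase.

```cpp
#include <cassert>
#include <cmath>

// Approximate log2 of the birthday-bound collision probability,
// p ~= n^2 / 2^(bits + 1), for n random nonces of the given width.
// Only meaningful as an estimate while the result is well below 0 (p << 1).
double log2_nonce_collision_bound(double messages, int nonce_bits) {
    assert(messages >= 1.0 && nonce_bits > 0);
    return 2.0 * std::log2(messages) - (nonce_bits + 1);
}
```

For 2^32 messages under one key, a 192-bit (24-byte) nonce gives about 2^-129, whereas a 96-bit nonce gives about 2^-33. This is why random nonces are comfortable with the 24-byte nonce used here but need more care with shorter ones.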
LZ4 Compression
Compression Strategy
ML Defender uses LZ4 for high-speed compression with minimal overhead:
Why LZ4?
- ✅ Extremely fast - On the order of 500 MB/s compression and 1500 MB/s decompression per core (hardware-dependent)
- ✅ Low CPU overhead - Critical for real-time IDS performance
- ✅ Predictable performance - No worst-case slowdowns
- ✅ Small memory footprint - <64 KB working memory
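Problem: The raw LZ4 block format does not record the original size, so the decompressor must be told how large an output buffer to allocate. The initial implementation guessed 10x the compressed size, which caused failures whenever the guess did not match the real payload.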
Solution: Prepend a 4-byte size header (big-endian) to compressed data:
// crypto-transport/include/crypto_transport/crypto_manager.hpp
std::string compress_with_size(const std::string& data) {
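// Reject inputs the decompressor's 100MB sanity check would refuse;
// this also guards the narrowing cast to uint32_t below
if (data.size() > 100u * 1024 * 1024) {
throw std::runtime_error("Input too large to compress (>100MB)");
}
// Store original size
uint32_t original_size = static_cast<uint32_t>(data.size());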
// Compress data
std::vector<uint8_t> input(data.begin(), data.end());
auto compressed = crypto_transport::compress(input);
// Prepend size header (4 bytes, big-endian)
std::string result(4 + compressed.size(), '\0');
// Write size in big-endian format
result[0] = static_cast<char>((original_size >> 24) & 0xFF);
result[1] = static_cast<char>((original_size >> 16) & 0xFF);
result[2] = static_cast<char>((original_size >> 8) & 0xFF);
result[3] = static_cast<char>(original_size & 0xFF);
// Copy compressed data after header
std::copy(compressed.begin(), compressed.end(), result.begin() + 4);
return result;
}
Decompression:
std::string decompress_with_size(const std::string& data) {
// Validate minimum size (4-byte header + data)
if (data.size() < 5) {
throw std::runtime_error("Invalid compressed data: too small");
}
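// Read original size from big-endian header.
// Widen each byte to uint32_t before shifting: uint8_t promotes to int,
// and shifting a byte >= 0x80 left by 24 would overflow a signed int.
uint32_t original_size =
(static_cast<uint32_t>(static_cast<uint8_t>(data[0])) << 24) |
(static_cast<uint32_t>(static_cast<uint8_t>(data[1])) << 16) |
(static_cast<uint32_t>(static_cast<uint8_t>(data[2])) << 8) |
static_cast<uint32_t>(static_cast<uint8_t>(data[3]));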
// Sanity check (< 100MB)
if (original_size > 100 * 1024 * 1024) {
throw std::runtime_error("Invalid original size in header: " +
std::to_string(original_size) + " bytes (>100MB)");
}
// Extract compressed data (skip 4-byte header)
std::vector<uint8_t> compressed_data(data.begin() + 4, data.end());
// Decompress with EXACT original size
auto decompressed = crypto_transport::decompress(compressed_data, original_size);
return std::string(decompressed.begin(), decompressed.end());
}
Format:
[4-byte size (big-endian)][LZ4 compressed data]
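The framing can be exercised on its own, without LZ4. The following sketch round-trips the 4-byte big-endian header and applies the same 100 MB sanity limit, treating the compressed payload as opaque bytes. The helper names `write_be32`, `read_be32`, and `frame_is_plausible` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::uint32_t kMaxOriginalSize = 100u * 1024u * 1024u;  // 100 MB limit from the text

// Write a 32-bit value into the first four bytes, most significant byte first.
void write_be32(std::string& out, std::uint32_t value) {
    assert(out.size() >= 4);
    out[0] = static_cast<char>((value >> 24) & 0xFF);
    out[1] = static_cast<char>((value >> 16) & 0xFF);
    out[2] = static_cast<char>((value >> 8) & 0xFF);
    out[3] = static_cast<char>(value & 0xFF);
}

// Read the value back. Each byte is widened to uint32_t before shifting
// so that bytes >= 0x80 cannot overflow a signed int.
std::uint32_t read_be32(const std::string& in) {
    assert(in.size() >= 4);
    return (static_cast<std::uint32_t>(static_cast<std::uint8_t>(in[0])) << 24) |
           (static_cast<std::uint32_t>(static_cast<std::uint8_t>(in[1])) << 16) |
           (static_cast<std::uint32_t>(static_cast<std::uint8_t>(in[2])) << 8) |
           static_cast<std::uint32_t>(static_cast<std::uint8_t>(in[3]));
}

// Cheap pre-check before handing a frame to the decompressor.
bool frame_is_plausible(const std::string& frame) {
    if (frame.size() < 5) {
        return false;  // need the header plus at least one payload byte
    }
    return read_be32(frame) <= kMaxOriginalSize;
}
```

Running a check like this before decompression keeps a corrupted or legacy (header-less) frame from triggering a large allocation.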
Crypto Seed Exchange via etcd
Key Generation
etcd-server generates a cryptographically secure 32-byte key on startup:
// etcd-server/src/crypto_manager.cpp
std::string CryptoManager::generate_random_seed() {
std::string seed;
seed.resize(KEY_SIZE); // 32 bytes
// Use libsodium for cryptographically secure random
randombytes_buf(reinterpret_cast<unsigned char*>(&seed[0]), KEY_SIZE);
return seed;
}
Key characteristics:
- 32 bytes (256 bits) - Industry standard for symmetric encryption
- Generated with randombytes_buf() from libsodium (CSPRNG)
- Stored in etcd at /crypto/ml-detector/tokens and /crypto/firewall/tokens
- Rotated every 24 hours (configurable)
Key Distribution
ml-detector retrieves encryption key:
// sniffer/src/userspace/main.cpp
std::string encryption_seed;
for (int attempt = 1; attempt <= 5; attempt++) {
encryption_seed = etcd_client->get_encryption_seed();
if (!encryption_seed.empty()) {
break;
}
sleep(2); // Wait before retry
}
if (encryption_seed.empty()) {
std::cerr << "FATAL: Failed to get encryption seed after 5 attempts" << std::endl;
exit(1);
}
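The retry loop above can be factored into a small reusable helper. This is a sketch only: `fetch_with_retry` is a hypothetical name, and the callback stands in for `etcd_client->get_encryption_seed()`. Unlike the loop above, it skips the sleep after the final failed attempt and reports failure to the caller rather than exiting.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Calls fetch() up to max_attempts times, pausing between attempts.
// Returns the first non-empty result, or an empty string if every attempt fails.
std::string fetch_with_retry(const std::function<std::string()>& fetch,
                             int max_attempts,
                             std::chrono::milliseconds delay) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        std::string value = fetch();
        if (!value.empty()) {
            return value;
        }
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);  // wait before the next try
        }
    }
    return std::string();
}
```

Passing the delay as a parameter lets tests run with a zero delay while production code keeps the two-second wait.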
firewall-acl-agent retrieves decryption key:
// firewall-acl-agent/config/firewall.json
{
"etcd": {
"crypto_token_path": "/crypto/firewall/tokens",
"required_for_encryption": true,
"fallback_mode": "standalone_compressed_only"
}
}
Key Rotation
etcd-server supports automatic key rotation every 24 hours:
bool CryptoManager::should_rotate_key() const {
auto now = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::hours>(
now - key_generation_time_
).count();
// Rotate every 24 hours
return elapsed >= 24;
}
void CryptoManager::rotate_key() {
seed_ = generate_random_seed();
crypto_ = std::make_unique<crypto::CryptoManager>(seed_);
key_generation_time_ = std::chrono::system_clock::now();
std::cout << "[CRYPTO] 🔄 Encryption key rotated" << std::endl;
}
Rotation process:
- Generate new 32-byte key
- Update etcd at /crypto/*/tokens
- Notify subscribed components (via etcd watch)
- Components fetch new key and restart crypto pipeline
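A variation on the rotation check that takes the current time as a parameter is easier to unit-test, and measuring the interval with std::chrono::steady_clock keeps it unaffected by wall-clock adjustments. `RotationTimer` below is a hypothetical helper sketched for illustration, not the CryptoManager class shown above.

```cpp
#include <cassert>
#include <chrono>

class RotationTimer {
public:
    using Clock = std::chrono::steady_clock;

    RotationTimer(Clock::time_point start, std::chrono::hours interval)
        : last_rotation_(start), interval_(interval) {
        assert(interval.count() > 0);
    }

    // True once at least one full interval has elapsed since the last rotation.
    bool due(Clock::time_point now) const {
        return now - last_rotation_ >= interval_;
    }

    // Record that a rotation happened at the given time.
    void mark_rotated(Clock::time_point now) { last_rotation_ = now; }

private:
    Clock::time_point last_rotation_;
    std::chrono::hours interval_;
};
```

In production the caller would pass `RotationTimer::Clock::now()`; tests can pass synthetic time points instead of waiting 24 hours.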
Pipeline Validation
Stress Test Results (Day 52)
36,000 events processed with ZERO errors:
# Metrics from firewall-acl-agent stress testing
crypto_errors: 0 # ✅ Perfect encryption/decryption
decompression_errors: 0 # ✅ Perfect LZ4 pipeline
protobuf_parse_errors: 0 # ✅ Perfect message parsing
Test breakdown:
| Test | Events | Rate | Crypto Errors | Result |
|---|---|---|---|---|
| 1 | 1,000 | 42.6/sec | 0 | ✅ PASS |
| 2 | 5,000 | 94.9/sec | 0 | ✅ PASS |
| 3 | 10,000 | 176.1/sec | 0 | ✅ PASS |
| 4 | 20,000 | 364.9/sec | 0 | ✅ PASS |
Integration Testing
End-to-end validation:
# Terminal 1: Start etcd-server
cd etcd-server/build
sudo ./etcd_server
# Terminal 2: Start firewall-acl-agent
cd firewall-acl-agent/build
sudo ./firewall-acl-agent -c ../config/firewall.json
# Terminal 3: Inject synthetic events
cd tools/build
./synthetic_ml_output_injector 1000 50
# Verify successful decryption in logs
tail -f /vagrant/logs/lab/firewall-agent.log
# Output:
# [CRYPTO] 🔓 Decryption successful: 1245 bytes
# [LZ4] Decompressed: 1245 -> 3456 bytes
# [IPSet] Added 192.168.1.100 to blacklist
Configuration
ml-detector (Encryption Enabled)
// ml-detector/config/ml_detector_config.json
{
"transport": {
"compression": {
"enabled": true,
"algorithm": "lz4"
},
"encryption": {
"enabled": true,
"etcd_token_required": true,
"algorithm": "chacha20-poly1305",
"key_size": 256
}
},
"etcd": {
"enabled": true,
"endpoints": ["localhost:2379"],
"crypto_token_path": "/crypto/ml-detector/tokens"
}
}
firewall-acl-agent (Decryption Enabled)
// firewall-acl-agent/config/firewall.json
{
"transport": {
"compression": {
"enabled": true,
"decompression_only": true,
"algorithm": "lz4"
},
"encryption": {
"enabled": true,
"decryption_only": true,
"etcd_token_required": true,
"algorithm": "chacha20-poly1305",
"key_size": 256
}
},
"etcd": {
"enabled": true,
"endpoints": ["localhost:2379"],
"crypto_token_path": "/crypto/firewall/tokens"
}
}
etcd-server (Key Management)
// etcd-server/config/etcd-server.json
{
"security": {
"encryption_enabled": true,
"compression_enabled": true
},
"heartbeat": {
"enabled": true,
"interval_seconds": 30
}
}
Key Management Best Practices
Secure Key Storage
DO NOT hardcode encryption keys in configuration files. Always retrieve keys from etcd or secure key management systems (KMS).
Current implementation:
- ✅ Keys stored in etcd (in-memory, not persisted to disk)
- ✅ Retrieved via authenticated etcd API
- ✅ Rotated automatically every 24 hours
- ✅ Hex-encoded for safe transmission (64-character string)
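For reference, hex-encoding a 32-byte key into the 64-character form mentioned above takes only a few lines. The helper names below are hypothetical, since the project's actual encoding code is not shown in this document; libsodium also provides sodium_bin2hex and sodium_hex2bin for the same purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Encode raw bytes as lowercase hex, two characters per byte.
std::string hex_encode(const std::vector<std::uint8_t>& bytes) {
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (std::uint8_t b : bytes) {
        out.push_back(kDigits[b >> 4]);
        out.push_back(kDigits[b & 0x0F]);
    }
    assert(out.size() == bytes.size() * 2);
    return out;
}

// Map one hex digit to its value; throws on anything else.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("non-hex character in key");
}

// Decode a hex string back into raw bytes; accepts upper or lower case.
std::vector<std::uint8_t> hex_decode(const std::string& hex) {
    if (hex.size() % 2 != 0) {
        throw std::invalid_argument("odd-length hex string");
    }
    std::vector<std::uint8_t> out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        out.push_back(static_cast<std::uint8_t>((hex_value(hex[i]) << 4) | hex_value(hex[i + 1])));
    }
    return out;
}
```

A consumer would typically decode the string and then confirm the result is exactly 32 bytes before using it as a key.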
Future enhancements (roadmap):
- Integrate with HashiCorp Vault or AWS KMS
- Implement key versioning for graceful rotation
- Add key derivation using HKDF for per-component keys
- Support TLS for etcd client connections
Key Rotation Policy
Recommended rotation schedule:
- Production: 24 hours (current default)
- High-security: 12 hours or less
- Development: 7 days (reduced operational overhead)
Rotation triggers:
- Scheduled interval reached
- Security incident detected
- Administrator manual rotation
- Component compromise suspected
Key Compromise Response
If a key is compromised:
- Immediately rotate key via etcd
- Restart all ML Defender components
- Review logs for unauthorized access
- Perform forensic analysis (RAG ingester queries)
- Update security policies and access controls
Encryption Overhead
Benchmarks (Day 52 stress testing):
| Operation | Throughput | Latency | CPU Impact |
|---|---|---|---|
| Encryption | 364 msg/sec | <5 ms | +2-3% |
| Decryption | 364 msg/sec | <5 ms | +2-3% |
| Compression | 500 MB/sec | <1 ms | +1-2% |
| Decompression | 1500 MB/sec | <0.5 ms | +0.5-1% |
Typical message sizes:
- Protobuf (uncompressed): 500-2000 bytes
- After LZ4 compression: 200-800 bytes (60% reduction)
- After ChaCha20 encryption: 240-840 bytes (+40 bytes overhead)
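The fixed 40-byte overhead follows from the formats described earlier: a 24-byte nonce plus a 16-byte MAC around each compressed frame. The sketch below works through that arithmetic, assuming (as the figures above appear to) that the compressed size already includes the 4-byte size header. The function names are illustrative.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kNonceBytes = 24;  // assumed equal to crypto_secretbox_NONCEBYTES
constexpr std::size_t kMacBytes = 16;    // assumed equal to crypto_secretbox_MACBYTES

// Bytes on the wire for one message, given the size of the compressed frame.
constexpr std::size_t wire_size(std::size_t compressed_frame_bytes) {
    return kNonceBytes + kMacBytes + compressed_frame_bytes;
}

static_assert(wire_size(0) == 40, "an empty frame still carries nonce and MAC");

// Fraction of the wire message taken up by the nonce and MAC.
double overhead_ratio(std::size_t compressed_frame_bytes) {
    return static_cast<double>(kNonceBytes + kMacBytes) /
           static_cast<double>(wire_size(compressed_frame_bytes));
}
```

For the 200-byte and 800-byte cases this gives 240 and 840 bytes, so the fixed overhead is about 17% of the smallest messages and under 5% of the largest.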
System Impact
CPU usage (36K events @ 364/sec):
- Without crypto: ~45% CPU
- With crypto: ~54% CPU
- Overhead: +9 percentage points, about a 20% relative increase (acceptable for production)
Memory footprint:
- Crypto library: ~2 MB (libsodium)
- Key storage: 64 bytes (32-byte key + metadata)
- Buffer pools: ~1 MB (reusable compression buffers)
- Total overhead: ~3 MB
Troubleshooting
Decryption Failures
Symptom: crypto_errors > 0 in metrics
Possible causes:
- Key mismatch - Producer and consumer using different keys
- Corrupted data - Network transmission errors (rare with TCP)
- Wrong algorithm - Configuration mismatch (ChaCha20 vs AES)
Diagnosis:
# Check etcd keys match
etcdctl get /crypto/ml-detector/tokens
etcdctl get /crypto/firewall/tokens
# Enable debug logging
vim firewall-acl-agent/config/firewall.json
# Set: "logging": { "level": "debug" }
# Restart and check logs
tail -f /vagrant/logs/lab/firewall-agent.log | grep CRYPTO
Decompression Failures
Symptom: decompression_errors > 0 in metrics
Possible causes:
- Missing size header - Old format without 4-byte size prefix
- Corrupted data - Truncated or modified compressed data
- Wrong algorithm - LZ4 vs Zstd mismatch
Diagnosis:
# Verify size header presence (first 4 bytes)
hexdump -C /tmp/compressed_sample.bin | head -1
# Should see: 00 00 0X XX (big-endian size)
# Check compression algorithm config
grep -A5 '"compression"' */config/*.json
Performance Issues
Symptom: High CPU usage or increased latency
Optimization steps:
- Disable compression for low-bandwidth scenarios (LAN)
- Increase batch sizes to amortize crypto overhead
- Use memory pools to reduce allocation overhead
- Profile with perf to identify bottlenecks
# Profile crypto operations
sudo perf record -g ./firewall-acl-agent -c config.json
sudo perf report
# Look for hotspots in crypto_transport::* functions
Security Audit Checklist