Privacy Principles
Syft-Flwr is designed with privacy as a core principle:
Data Minimization : Only model updates leave participant devices
Local Computation : Training happens entirely on local data
Optional Encryption : End-to-end encryption for SyftBox transport
Transparency : File-based audit trails for all communications
Participant Control : Data owners control access and participation
Data Flow Architecture
What Leaves the Device
Transmitted :
Model parameters (weights, biases)
Training metrics (loss, accuracy)
Dataset size (number of samples)
Metadata (round number, client ID)
Never Transmitted :
Raw training data
Individual data points
Feature values
Labels or annotations
Dataset structure details
Example Data Flow
# On Client Device
class MyClient ( NumPyClient ):
def fit ( self , parameters , config ):
# 1. Receive global model parameters
model.set_weights(parameters)
# 2. Train on LOCAL data (never transmitted)
model.fit(
x_train, # STAYS ON DEVICE
y_train, # STAYS ON DEVICE
epochs = 5
)
# 3. Return ONLY model updates
return (
model.get_weights(), # Transmitted: model parameters
len (x_train), # Transmitted: dataset size
{ "loss" : 0.5 } # Transmitted: metrics
)
Location: Example pattern from Flower documentation
Encryption Model
SyftBox Transport: X3DH Protocol
SyftBox transport supports optional X3DH (Extended Triple Diffie-Hellman) encryption:
Enabling Encryption
Bootstrap with encryption (default):
from syft_flwr import bootstrap
bootstrap(
flwr_project_dir = "./project" ,
aggregator = "[email protected] " ,
datasites = [ "[email protected] " ],
transport = "syftbox"
)
# Encryption enabled by default
# Keys auto-generated on first run
Disable encryption (development only):
export SYFT_FLWR_ENCRYPTION_ENABLED = false
python main.py
Key Management
Keys are stored in datasites:
~/.syftbox/datasites/{email}/
├── public/
│ └── x3dh/
│ ├── identity_key.pub # Public identity key
│ ├── signed_prekey.pub # Public signed prekey
│ └── onetime_prekeys/ # Public one-time prekeys
│ ├── prekey_001.pub
│ └── prekey_002.pub
│
└── private/
└── x3dh/
├── identity_key.priv # Private identity key
├── signed_prekey.priv # Private signed prekey
└── onetime_prekeys/ # Private one-time prekeys
├── prekey_001.priv
└── prekey_002.priv
Key Bootstrap Process :
# src/syft_flwr/utils.py
from syft_crypto.x3dh_bootstrap import ensure_bootstrap
def setup_client ( app_name : str , project_dir : Optional[Path] = None ):
client = create_client( project_dir = project_dir)
encryption_enabled = os.environ.get( "SYFT_FLWR_ENCRYPTION_ENABLED" , "true" ).lower() != "false"
if encryption_enabled and not isinstance (client, SyftP2PClient):
# Generate keys if they don't exist
client = ensure_bootstrap(client)
logger.info( "🔐 End-to-end encryption is ENABLED" )
return client, encryption_enabled, f "flwr/ { app_name } "
Location: src/syft_flwr/utils.py:67
Encryption in Practice
Server encrypts request :
# src/syft_flwr/fl_orchestrator/syft_grid.py
def _send_encrypted_message ( self , msg_bytes : bytes , dest_datasite : str , msg : Message):
# Base64 encode for encrypted transmission
encoded_body = base64.b64encode(msg_bytes).decode( "utf-8" )
# Send with encrypt=True
future_id = self ._rpc.send(
to_email = dest_datasite,
app_name = self .app_name,
endpoint = "messages" ,
body = encoded_body.encode( "utf-8" ),
encrypt = True # Triggers X3DH encryption
)
logger.debug( f "🔐 Pushed ENCRYPTED message to { dest_datasite } " )
return future_id
Location: src/syft_flwr/fl_orchestrator/syft_grid.py:396
Client decrypts request :
# src/syft_flwr/fl_orchestrator/flower_client.py
class RequestProcessor :
def decode_request_body ( self , request_body : Union[ bytes , str ]) -> bytes :
if not self .message_handler.encryption_enabled:
return request_body
# Decode base64
if isinstance (request_body, bytes ):
request_body_str = request_body.decode( "utf-8" )
else :
request_body_str = request_body
decoded = base64.b64decode(request_body_str)
logger.debug( "🔓 Decoded base64 message" )
return decoded
Location: src/syft_flwr/fl_orchestrator/flower_client.py:74
Auto-decrypt : The syft_event library automatically handles X3DH decryption when auto_decrypt=True is passed to on_request().
P2P Transport Security
P2P transport does not use X3DH encryption:
# src/syft_flwr/rpc/p2p_file_rpc.py
def send ( self , to_email : str , app_name : str , endpoint : str ,
body : bytes , encrypt : bool = False ) -> str :
if encrypt:
logger.warning(
"Encryption not supported in P2PFileRpc, sending unencrypted"
)
# Write plaintext to Google Drive
# Relies on Google's transport security (HTTPS)
self ._gdrive_io.write_to_outbox( ... )
Location: src/syft_flwr/rpc/p2p_file_rpc.py:38
P2P Security Model :
Transport encryption : HTTPS between client and Google servers
Access control : Google Drive file permissions
At-rest encryption : Google’s server-side encryption
No end-to-end encryption : Google can theoretically access plaintext
Privacy Guarantees
What Syft-Flwr Protects
Threat SyftBox (Encrypted) SyftBox (Unencrypted) P2P Network eavesdropping ✅ X3DH encryption ❌ Plaintext ✅ HTTPS File system snooping ✅ Encrypted files ❌ Plaintext files N/A (cloud) Man-in-the-middle ✅ X3DH authentication ❌ No authentication ✅ TLS Server compromise ✅ Can’t read client data ❌ Can read messages ⚠️ Google has keys Client impersonation ✅ X3DH identity keys ❌ No verification ⚠️ OAuth-based Data leakage ✅ Only model updates ✅ Only model updates ✅ Only model updates
What Syft-Flwr Does NOT Protect
Model inversion attacks : Adversary reconstructs training data from model updates
# Syft-Flwr transmits model weights
weights = model.get_weights()
# Sophisticated adversary might infer training data characteristics
# Mitigation: Use differential privacy (not built into Syft-Flwr)
Membership inference : Adversary determines if a specific sample was in training set
Gradient leakage : Gradients can leak information about training data
For scenarios requiring differential privacy , combine Syft-Flwr with libraries like Opacus or use Flower’s built-in DP support.
Audit and Transparency
Message Audit Trail
Every message is a file with metadata:
# List all messages sent to a client
ls -lh ~/.syftbox/datasites/[email protected] /app_data/my_fl_app/rpc/messages/
# Output:
# -rw-r--r-- 1 user user 2.3M Mar 02 10:15 a1b2c3d4.request
# -rw-r--r-- 1 user user 2.4M Mar 02 10:16 a1b2c3d4.response
# -rw-r--r-- 1 user user 2.3M Mar 02 11:20 e5f6g7h8.request
# -rw-r--r-- 1 user user 2.4M Mar 02 11:21 e5f6g7h8.response
Inspecting Messages
Unencrypted messages (protobuf):
from pathlib import Path
from syft_flwr.serde import bytes_to_flower_message
# Read message file
msg_path = Path( "~/.syftbox/datasites/[email protected] /app_data/my_fl_app/rpc/messages/a1b2c3d4.request" ).expanduser()
msg_bytes = msg_path.read_bytes()
# Deserialize
message = bytes_to_flower_message(msg_bytes)
print ( f "Message type: { message.metadata.message_type } " )
print ( f "From node: { message.metadata.src_node_id } " )
print ( f "To node: { message.metadata.dst_node_id } " )
print ( f "Round: { message.metadata.group_id } " )
Encrypted messages :
from syft_crypto import EncryptedPayload, decrypt_message
from syft_core import Client
# Read encrypted message
msg_bytes = msg_path.read_bytes()
# Parse as encrypted payload
encrypted = EncryptedPayload.model_validate_json(msg_bytes.decode())
print ( f "Encrypted by: { encrypted.sender } " )
print ( f "Encrypted for: { encrypted.recipient } " )
print ( f "Key exchange: X3DH" )
# Decrypt (requires recipient's private key)
client = Client.load()
plaintext = decrypt_message(encrypted, client = client)
message = bytes_to_flower_message(plaintext)
Logging
Syft-Flwr logs all privacy-relevant operations:
# Encryption status is always logged
logger.info( "🔐 End-to-end encryption is ENABLED for FL messages" )
logger.warning( "⚠️ End-to-end encryption is DISABLED for FL messages" )
# Message transmission is logged
logger.debug( "🔐 Pushed ENCRYPTED message to [email protected] , size 2.3 MB" )
logger.debug( "📤 Pushed PLAINTEXT message to [email protected] , size 2.3 MB" )
# Decryption is logged
logger.debug( "🔓 Successfully decrypted message" )
logger.debug( "📥 Received PLAINTEXT message" )
Privacy Best Practices
1. Use Encryption in Production
# Development: encryption optional
export SYFT_FLWR_ENCRYPTION_ENABLED = false
# Production: encryption required
unset SYFT_FLWR_ENCRYPTION_ENABLED # Defaults to true
2. Rotate Keys Periodically
# Backup old keys
mv ~/.syftbox/datasites/[email protected] /private/x3dh ~/.syftbox/datasites/[email protected] /private/x3dh.backup
# Generate new keys on next run
python main.py # Auto-generates new keys
3. Minimize Data Exposure
class PrivacyAwareClient ( NumPyClient ):
def fit ( self , parameters , config ):
model.set_weights(parameters)
model.fit(x_train, y_train, epochs = 5 )
# Only return aggregated metrics
return (
model.get_weights(),
len (x_train), # OK: dataset size
{ "loss" : avg_loss} # OK: aggregated metric
# DON'T return: per-sample losses, gradients, embeddings
)
4. Validate Participants
5. Monitor for Anomalies
# Server-side validation
class SecureStrategy ( FedAvg ):
def aggregate_fit ( self , server_round , results , failures ):
# Check for suspiciously large updates
for client_proxy, fit_res in results:
weights = parameters_to_ndarrays(fit_res.parameters)
total_size = sum (w.nbytes for w in weights)
if total_size > 100_000_000 : # 100 MB
logger.warning( f "Client { client_proxy.cid } sent large update: { total_size / 1e6 :.1f} MB" )
return super ().aggregate_fit(server_round, results, failures)
6. Use Differential Privacy
Combine with Opacus for formal privacy guarantees:
from opacus import PrivacyEngine
class DPClient ( NumPyClient ):
def fit ( self , parameters , config ):
model.set_weights(parameters)
# Attach privacy engine
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
module = model,
optimizer = optimizer,
data_loader = train_loader,
noise_multiplier = 1.1 ,
max_grad_norm = 1.0 ,
)
# Train with DP guarantees
model.fit(train_loader, epochs = 5 )
epsilon = privacy_engine.get_epsilon( delta = 1e-5 )
logger.info( f "Privacy budget: ε= { epsilon :.2f} " )
return model.get_weights(), len (train_loader.dataset), { "epsilon" : epsilon}
Compliance Considerations
GDPR Compliance
Syft-Flwr supports GDPR requirements:
Data minimization : Only model updates transmitted (Art. 5)
Purpose limitation : FL-specific communication (Art. 5)
Transparency : File-based audit logs (Art. 15)
Right to erasure : Delete local data anytime (Art. 17)
Data portability : Standard Flower format (Art. 20)
HIPAA Compliance
For healthcare data:
✅ Use SyftBox with encryption (satisfies encryption requirements)
✅ Implement access controls (validate participant emails)
✅ Maintain audit logs (file timestamps and metadata)
⚠️ P2P transport may not meet requirements (data passes through Google)
Security Checklist
Next Steps
Architecture Understand the technical implementation
File-Based Communication Learn how messages flow securely