Why Google Colab?
- Zero Setup: No installation or configuration needed
- Free GPU Access: Train models faster with free GPU runtime
- Easy Collaboration: Share notebooks with team members
- Cloud Storage: Use Google Drive for model parameter exchange
Architecture Overview
In this setup:
- Each data owner runs a Colab notebook as an FL client
- The data scientist runs a Colab notebook as the FL server
- Google Drive syncs model parameters between participants
- No direct network connections required
Upload the notebook via File → Upload Notebook.
We use uv pip for faster installation in Colab environments; a regular pip install syft-flwr also works.
!uv pip install -q syft-flwr
import syft_client as sc
import syft_flwr
print(f"syft_client version: {sc.__version__}")
print(f"syft_flwr version: {syft_flwr.__version__}")
# Login as data scientist
ds_email = input("Enter the Data Scientist's email: ")
ds_client = sc.login_ds(email=ds_email)
# Add first data owner
do1_email = input("Enter the First Data Owner's email: ")
ds_client.add_peer(do1_email)
# Add second data owner
do2_email = input("Enter the Second Data Owner's email: ")
ds_client.add_peer(do2_email)
# Verify peers
ds_client.peers
You can add as many data owners as needed. Each will run their own Colab notebook as an FL client.
# Get DO1's datasets
do1_datasets = ds_client.datasets.get_all(datasite=do1_email)
print(f"DO1 has {len(do1_datasets)} dataset(s)")
# Inspect the first dataset
if do1_datasets:
    do1_datasets[0].describe()
    print(f"Mock data URL: {do1_datasets[0].mock_url}")
# Get DO2's datasets
do2_datasets = ds_client.datasets.get_all(datasite=do2_email)
print(f"DO2 has {len(do2_datasets)} dataset(s)")
if do2_datasets:
    do2_datasets[0].describe()
    print(f"Mock data URL: {do2_datasets[0].mock_url}")
You can access mock (synthetic) data for development and testing, but not the private data—that stays on the data owner’s machine.
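Mock data only needs to mirror the schema of the private data. As a rough stdlib sketch of what a synthetic stand-in might look like (column names loosely based on the Pima diabetes dataset; not the actual files the data owners registered):

```python
import csv
import random

random.seed(0)  # reproducible synthetic rows
# Hypothetical columns; a real mock file should copy the private schema exactly.
columns = ["glucose", "bmi", "age", "outcome"]
with open("mock_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    for _ in range(5):
        writer.writerow([
            random.randint(70, 200),               # plausible glucose reading
            round(random.uniform(18.0, 45.0), 1),  # plausible BMI
            random.randint(21, 80),                # age
            random.randint(0, 1),                  # synthetic label
        ])
print(open("mock_data.csv").read().splitlines()[0])  # → glucose,bmi,age,outcome
```

Developing against a file like this lets you debug the full pipeline before any private data is touched.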
from pathlib import Path
# Download from GitHub
!mkdir -p /content/fl-diabetes-prediction
!curl -sL https://github.com/khoaguin/fl-diabetes-prediction/archive/refs/heads/main.tar.gz | tar -xz --strip-components=1 -C /content/fl-diabetes-prediction
SYFT_FLWR_PROJECT_PATH = Path("/content/fl-diabetes-prediction")
print(f"Project downloaded to: {SYFT_FLWR_PROJECT_PATH}")
import syft_flwr
# Remove existing main.py if present
!rm -rf {SYFT_FLWR_PROJECT_PATH / "main.py"}
# Bootstrap the project
do_emails = [peer.email for peer in ds_client.peers]
syft_flwr.bootstrap(
    SYFT_FLWR_PROJECT_PATH,
    aggregator=ds_email,
    datasites=do_emails,
    transport="p2p",  # Use P2P transport over Google Drive
)
print("✅ Bootstrapped project successfully")
The transport="p2p" parameter tells syft_flwr to use Google Drive for communication instead of a local SyftBox. Bootstrapping records these settings in the project's pyproject.toml:
app_name = "[email protected]_fl-diabetes-prediction_1234567890"
datasites = ["[email protected]", "[email protected]"]
aggregator = "[email protected]"
transport = "p2p"
# Clean up before submitting
!rm -rf {SYFT_FLWR_PROJECT_PATH / "fl_diabetes_prediction" / "__pycache__"}
job_name = "fl-diabetes-training"
# Submit to first data owner
ds_client.submit_python_job(
    user=do1_email,
    code_path=str(SYFT_FLWR_PROJECT_PATH),
    job_name=job_name,
)
print(f"✅ Submitted job to {do1_email}")
# Submit to second data owner
ds_client.submit_python_job(
    user=do2_email,
    code_path=str(SYFT_FLWR_PROJECT_PATH),
    job_name=job_name,
)
print(f"✅ Submitted job to {do2_email}")
# Check job status
ds_client.jobs
!uv pip install \
"flwr-datasets>=0.5.0" \
"imblearn>=0.0" \
"loguru>=0.7.3" \
"pandas>=2.3.0" \
"scikit-learn==1.6.1" \
"torch>=2.8.0" \
"ray==2.31.0"
# Verify files exist
assert SYFT_FLWR_PROJECT_PATH.exists(), "Project path does not exist"
assert (SYFT_FLWR_PROJECT_PATH / "main.py").exists(), "main.py not found"
# Set environment variables
ds_email = ds_client.email
syftbox_folder = f"/content/SyftBox_{ds_email}"
# Run the FL server
!SYFTBOX_EMAIL="{ds_email}" SYFTBOX_FOLDER="{syftbox_folder}" \
uv run {str(SYFT_FLWR_PROJECT_PATH / "main.py")}
The training happens asynchronously through Google Drive. Clients and server don’t need to run simultaneously—they communicate by reading/writing files to Drive.
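The idea can be illustrated with plain files: one side drops a message into a synced folder, and the other side polls until it appears. This is a toy stdlib sketch of that pattern, not syft_flwr's actual on-disk layout:

```python
import json
import time
from pathlib import Path

# Illustrative folder name; not the real syft_flwr Drive layout.
inbox = Path("drive_sync/server/messages")
inbox.mkdir(parents=True, exist_ok=True)

# "Server" writes a round's parameters as a file...
(inbox / "round_1.json").write_text(json.dumps({"round": 1, "weights": [0.1, 0.2]}))

# ...and a "client" polls the synced folder until the file shows up.
def poll_for_message(folder: Path, name: str, timeout_s: float = 5.0) -> dict:
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        path = folder / name
        if path.exists():
            return json.loads(path.read_text())
        time.sleep(0.1)
    raise TimeoutError(f"no message {name!r} within {timeout_s}s")

msg = poll_for_message(inbox, "round_1.json")
print(msg["round"])  # → 1
```

Because participants only share a folder, neither side needs the other to be online at the moment a message is written; Drive sync delivers it whenever the reader next polls.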
# View current jobs
ds_client.jobs
# Monitor output
print("Training in progress...")
print("Check the cell output above for live logs")
Example server output:
🚀 SERVER FUNCTION STARTED
⚙️ CONFIGURING STRATEGY
Strategy: FedAvgWithModelSaving
Min available clients: 2
Number of rounds: 3
📊 AGGREGATING METRICS
Number of clients: 2
✅ AGGREGATION COMPLETE - Average Accuracy: 0.7543
🔐 Checkpoint saved to: weights/parameters_round_1.safetensors
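The aggregation step shown in the logs is federated averaging: each client's update is weighted by how many samples it trained on. A minimal sketch of the arithmetic (not Flower's implementation):

```python
# Each entry is (model weights as a flat list, number of local samples).
def fedavg(updates):
    total_samples = sum(n for _, n in updates)
    dim = len(updates[0][0])
    # Weighted average of each coordinate across clients.
    return [sum(w[i] * n for w, n in updates) / total_samples for i in range(dim)]

# A client with 3x the data pulls the average 3x harder toward its update.
agg = fedavg([([1.0, 2.0], 100), ([3.0, 4.0], 300)])
print(agg)  # → [2.5, 3.5]
```

The same sample-weighted averaging is applied to metrics like accuracy, which is where the "Average Accuracy" line in the logs comes from.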
# List saved model weights
weights_dir = Path(syftbox_folder) / "rds" / "weights"
if weights_dir.exists():
    weights_files = list(weights_dir.glob("*.safetensors"))
    print(f"Found {len(weights_files)} model checkpoints:")
    for f in sorted(weights_files):
        print(f"  - {f.name}")
else:
    print("No weights directory found yet")
from safetensors.numpy import load_file
import torch
# Load the final model checkpoint
final_weights = load_file(str(weights_dir / "parameters_round_3.safetensors"))
# Load into your model (assumes the checkpoint keys match Net's state_dict)
from fl_diabetes_prediction.task import Net
model = Net()
model.load_state_dict({k: torch.tensor(v) for k, v in final_weights.items()})
Data Owner Setup
Data owners also run Colab notebooks. Here's their workflow:
!uv pip install -q syft-flwr
import syft_client as sc
# Login as data owner
do_email = input("Enter your email: ")
do_client = sc.login_do(email=do_email)
# Upload your private dataset
from google.colab import files
uploaded = files.upload()  # Upload train.csv and test.csv
# Register with syft_client (assumes the files are organized under these folders)
do_client.datasets.create(
    name="pima-indians-diabetes-database",
    private_path="./private_data",
    mock_path="./mock_data",
)
# View pending jobs
do_client.jobs
# Review a specific job
job = do_client.jobs[0]
print(f"Job from: {job.requester}")
print(f"Code: {job.code_preview}")
# Approve the job
do_client.jobs.approve(job)
Communication Flow
Here’s how parameters flow through Google Drive:
- Server → Drive: The server writes the initial model to flwr/{app_name}/server/messages/
- Drive → Clients: Clients read the model from their synced Drive folder
- Clients → Drive: Clients write local updates to flwr/{app_name}/client_{id}/messages/
- Drive → Server: The server reads the updates and aggregates them
- Repeat: The process continues for the configured number of rounds
Best Practices
Use Mock Data First
Test your FL project with mock data before submitting to data owners with private data.
Monitor Drive Quota
Large models can consume Drive storage quickly. Clean up old runs regularly.
Set Timeouts
Use reasonable message timeouts since Drive sync isn’t instant.
Save Checkpoints
Always use FedAvgWithModelSaving to checkpoint progress in case of interruptions.
Troubleshooting
Drive sync is slow
Google Drive sync can take 30-60 seconds, so increase the message timeout accordingly.
'syft_client' not found error
Make sure syft-flwr is installed in the current Colab runtime:
!uv pip install -q syft-flwr
Permission denied on Drive
Ensure all participants have granted Drive access to the syft_client app.
Clients not connecting
Verify:
- All participants used the same app_name from bootstrap
- Transport is set to "p2p" in all notebooks
- Data owners have approved their jobs
Advantages of Colab-Based FL
- No Infrastructure: No need to set up servers or networking
- Accessible: Anyone with a Google account can participate
- Reproducible: Notebooks document the entire FL process
- Scalable: Add more data owners by sharing additional notebooks
Limitations
- Sync Latency: Google Drive sync adds 30-60s latency between rounds
- Storage Limits: Free Drive accounts have 15GB storage limits
- Session Timeouts: Colab sessions time out after 12 hours of inactivity
- No Encryption: P2P transport doesn’t include end-to-end encryption
What’s Next?
- Learn about setting up local SyftBox for lower latency
- Implement custom aggregation strategies
- Try the complete Colab example