Overview
This guide covers production deployment of S2 Lite using Kubernetes and Helm, including TLS, monitoring, high availability considerations, and security best practices.
Prerequisites
Kubernetes cluster (1.19+)
Helm 3.0+
kubectl configured
S3-compatible object storage bucket
(Optional) Prometheus Operator for metrics
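If you want a quick preflight, a shell loop can report whether the required CLIs are on PATH (checking them against the minimum versions above is left manual); a sketch:

```shell
# Report whether the required client tools are installed.
REPORT=""
for tool in kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    REPORT="$REPORT $tool:found"
  else
    REPORT="$REPORT $tool:missing"
  fi
done
echo "$REPORT"
```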
Quick Start
Install from Helm Repository
Add the S2 Helm repository
```shell
helm repo add s2 https://s2-streamstore.github.io/s2
helm repo update
```
Install with default settings (in-memory)
```shell
# For testing only - data not persisted
helm install my-s2-lite s2/s2-lite-helm
```
Install with S3 storage (production)
```shell
helm install my-s2-lite s2/s2-lite-helm \
  --set objectStorage.enabled=true \
  --set objectStorage.bucket=my-s2-bucket \
  --set objectStorage.path=s2lite
```
Install from OCI Registry (GitHub Container Registry)
```shell
# Install directly from GHCR
helm install my-s2-lite oci://ghcr.io/s2-streamstore/charts/s2-lite-helm

# Or with custom values
helm install my-s2-lite oci://ghcr.io/s2-streamstore/charts/s2-lite-helm \
  --set objectStorage.enabled=true \
  --set objectStorage.bucket=my-s2-bucket
```
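If you prefer to keep configuration in Git rather than on the command line, the --set flags above map onto an equivalent minimal values file; a sketch:

```yaml
# quickstart-values.yaml - equivalent to the --set flags above
objectStorage:
  enabled: true
  bucket: my-s2-bucket
  path: s2lite
```

Deploy with `helm install my-s2-lite s2/s2-lite-helm -f quickstart-values.yaml`.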
Production Configuration
Complete values.yaml Example
Create a values.yaml file for your production deployment:
```yaml
# Production values.yaml for S2 Lite

# Number of replicas (Note: S2 Lite is currently single-node)
replicaCount: 1

image:
  repository: ghcr.io/s2-streamstore/s2
  pullPolicy: IfNotPresent
  tag: "0.29.17"  # Pin to a specific version in production

# Object Storage Configuration
objectStorage:
  enabled: true
  bucket: production-s2-bucket
  path: s2lite
  # Leave empty for AWS S3, or set for other providers:
  # endpoint: https://fly.storage.tigris.dev

# Service Configuration
service:
  type: LoadBalancer
  port: 443  # HTTPS
  annotations:
    # AWS Network Load Balancer
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    # External DNS (optional)
    external-dns.alpha.kubernetes.io/hostname: "s2.example.com"

# TLS Configuration
tls:
  enabled: true
  # Option 1: Self-signed (for testing)
  selfSigned: false
  # Option 2: Provided certificate (production)
  cert: /etc/tls/tls.crt
  key: /etc/tls/tls.key

# Mount TLS certificates from a Kubernetes secret
volumes:
  - name: tls-certs
    secret:
      secretName: s2-lite-tls
volumeMounts:
  - name: tls-certs
    mountPath: /etc/tls
    readOnly: true

# Service Account (for IRSA/Workload Identity)
serviceAccount:
  create: true
  annotations:
    # AWS IRSA
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/s2-lite-role
    # GCP Workload Identity
    # iam.gke.io/gcp-service-account: [email protected]

# Resource Limits
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi

# Health Checks
livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 60  # Allow up to 10 minutes for startup

# Prometheus Monitoring
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    interval: 30s
    scrapeTimeout: 10s
    labels:
      prometheus: kube-prometheus

# Pod Disruption Budget
podDisruptionBudget:
  enabled: true
  maxUnavailable: 1

# Environment Variables
env:
  - name: SL8_FLUSH_INTERVAL
    value: "50ms"
  - name: SL8_MANIFEST_POLL_INTERVAL
    value: "5s"
  # Enable pipelining (experimental)
  # - name: S2LITE_PIPELINE
  #   value: "true"

# Node Selection (optional)
nodeSelector:
  workload: streaming

# Tolerations (optional)
tolerations:
  - key: "workload"
    operator: "Equal"
    value: "streaming"
    effect: "NoSchedule"

# Affinity (optional)
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: s2-lite
          topologyKey: kubernetes.io/hostname
```
Deploy with:
```shell
helm install s2-lite s2/s2-lite-helm -f values.yaml -n s2-system --create-namespace
```
TLS Configuration
Option 1: Self-Signed Certificate (Testing)
```yaml
tls:
  enabled: true
  selfSigned: true
```
```shell
helm install my-s2-lite s2/s2-lite-helm \
  --set tls.enabled=true \
  --set tls.selfSigned=true

# Configure CLI to trust self-signed cert
s2 config set ssl_no_verify true
```
Self-signed certificates should only be used for testing. Use proper certificates in production.
Option 2: Provided Certificate (Production)
Create TLS secret
```shell
kubectl create secret tls s2-lite-tls \
  --cert=tls.crt \
  --key=tls.key \
  -n s2-system
```
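If you manage manifests with GitOps, the same Secret can be rendered declaratively and committed for review instead of created imperatively; a sketch (the openssl-generated throwaway pair stands in for your real certificate and key):

```shell
# Generate a throwaway key pair (replace with your real cert and key files),
# then render the equivalent kubernetes.io/tls Secret manifest.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -keyout tls.key -out tls.crt -subj "/CN=s2.example.com"

cat > s2-lite-tls.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: s2-lite-tls
  namespace: s2-system
type: kubernetes.io/tls
data:
  tls.crt: $(base64 < tls.crt | tr -d '\n')
  tls.key: $(base64 < tls.key | tr -d '\n')
EOF
echo "wrote s2-lite-tls.yaml"
```

Apply it with `kubectl apply -f s2-lite-tls.yaml`.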
Configure Helm values
```yaml
tls:
  enabled: true
  cert: /etc/tls/tls.crt
  key: /etc/tls/tls.key
volumes:
  - name: tls-certs
    secret:
      secretName: s2-lite-tls
volumeMounts:
  - name: tls-certs
    mountPath: /etc/tls
    readOnly: true
```
Deploy
```shell
helm install my-s2-lite s2/s2-lite-helm -f values.yaml -n s2-system
```
Option 3: cert-manager Integration
Install cert-manager
```shell
# Requires: helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true
```
Create ClusterIssuer
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: [email protected]
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
```
Create Certificate
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: s2-lite-tls
  namespace: s2-system
spec:
  secretName: s2-lite-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - s2.example.com
```
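cert-manager writes the issued pair into the s2-lite-tls secret, so the chart wiring is the same as in Option 2; a minimal sketch:

```yaml
tls:
  enabled: true
  cert: /etc/tls/tls.crt
  key: /etc/tls/tls.key
volumes:
  - name: tls-certs
    secret:
      secretName: s2-lite-tls  # populated by cert-manager
volumeMounts:
  - name: tls-certs
    mountPath: /etc/tls
    readOnly: true
```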
Cloud Provider Examples
AWS EKS with IRSA
Create IAM policy
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-s2-bucket",
        "arn:aws:s3:::my-s2-bucket/*"
      ]
    }
  ]
}
```
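Before attaching the policy to a role it is worth saving it to a file and syntax-checking it; a sketch (the s2-policy.json file name is an assumption, S2LiteS3Policy matches the ARN used in the eksctl step below, and the final aws call is shown as a comment for reference):

```shell
# Save the policy to a file and syntax-check it before creating it in IAM.
cat > s2-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::my-s2-bucket", "arn:aws:s3:::my-s2-bucket/*"]
    }
  ]
}
EOF
python3 -m json.tool s2-policy.json > /dev/null && echo "policy is valid JSON"
# aws iam create-policy --policy-name S2LiteS3Policy --policy-document file://s2-policy.json
```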
Create IAM role with OIDC
```shell
eksctl create iamserviceaccount \
  --name s2-lite \
  --namespace s2-system \
  --cluster my-cluster \
  --region us-east-1 \
  --attach-policy-arn arn:aws:iam::123456789012:policy/S2LiteS3Policy \
  --approve
```
Deploy with IRSA annotations
```yaml
objectStorage:
  enabled: true
  bucket: my-s2-bucket
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eksctl-my-cluster-addon-iamserviceaccount-Role
service:
  type: LoadBalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    external-dns.alpha.kubernetes.io/hostname: s2.example.com
```
GCP GKE with Workload Identity
Create GCP service account
```shell
gcloud iam service-accounts create s2-lite \
  --display-name="S2 Lite Service Account"
```
Grant GCS permissions
```shell
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:s2-lite@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
```
Bind Kubernetes SA to GCP SA
```shell
gcloud iam service-accounts add-iam-policy-binding \
  s2-lite@PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[s2-system/s2-lite]"
```
Deploy with Workload Identity
```yaml
serviceAccount:
  create: true
  annotations:
    iam.gke.io/gcp-service-account: s2-lite@PROJECT_ID.iam.gserviceaccount.com
objectStorage:
  enabled: true
  bucket: gs://my-s2-bucket
```
Azure AKS with Managed Identity
Create managed identity
```shell
az identity create \
  --name s2-lite-identity \
  --resource-group my-rg \
  --location eastus
```
Grant storage permissions
```shell
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee $(az identity show --name s2-lite-identity --resource-group my-rg --query principalId -o tsv) \
  --scope /subscriptions/SUBSCRIPTION_ID/resourceGroups/my-rg/providers/Microsoft.Storage/storageAccounts/myaccount
```
Deploy with pod identity
```yaml
serviceAccount:
  create: true
  annotations:
    azure.workload.identity/client-id: CLIENT_ID
objectStorage:
  enabled: true
  bucket: my-container
  endpoint: https://myaccount.blob.core.windows.net
```
Monitoring & Observability
Prometheus Integration
With Prometheus Operator installed:
```yaml
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    interval: 30s
    scrapeTimeout: 10s
    labels:
      prometheus: kube-prometheus
```
Key Metrics to Monitor
s2_lite_append_duration_seconds - Append latency histogram
s2_lite_read_duration_seconds - Read latency histogram
s2_lite_active_streams - Number of active streams
s2_lite_active_sessions - Number of active client sessions
slatedb_* - SlateDB internal metrics
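As a starting point, an alert on append tail latency can be built from the histogram above; a sketch, assuming the standard Prometheus `_bucket`/`le` histogram convention and a Prometheus Operator PrometheusRule (the 500ms threshold is a placeholder to tune against your SLO):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: s2-lite-alerts
  namespace: s2-system
spec:
  groups:
    - name: s2-lite
      rules:
        - alert: S2LiteAppendLatencyHigh
          expr: |
            histogram_quantile(0.99,
              sum by (le) (rate(s2_lite_append_duration_seconds_bucket[5m]))) > 0.5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "p99 append latency above 500ms for 10 minutes"
```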
Grafana Dashboard
Query the /metrics endpoint to build dashboards:
```shell
kubectl port-forward svc/my-s2-lite 8080:80 -n s2-system
curl http://localhost:8080/metrics
```
High Availability Considerations
Important: S2 Lite is currently a single-node deployment. The Recreate deployment strategy ensures only one instance writes to the object store at a time, preventing data corruption.
Current Architecture
Single Active Instance: Only one S2 Lite pod can be active at a time
Recreate Strategy: The old pod terminates before the new pod starts
Fencing: On startup, S2 Lite waits one manifest poll interval to ensure the previous instance is fenced
Achieving High Availability
Fast Recovery: Minimize downtime during pod restarts
```yaml
resources:
  requests:
    cpu: 1000m  # Faster startup
startupProbe:
  initialDelaySeconds: 5
  periodSeconds: 5  # Faster detection
```
Multi-Region Deployments: Run separate S2 Lite instances in different regions with different buckets
Client-Side Retry: Configure SDKs with retry logic and failover
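Because startup waits one manifest poll interval for fencing, lowering SL8_MANIFEST_POLL_INTERVAL (shown as 5s in the values example earlier) can shave failover time at the cost of more frequent manifest reads from the object store; a sketch with an assumed 1s interval:

```yaml
env:
  - name: SL8_MANIFEST_POLL_INTERVAL
    value: "1s"  # faster fencing handoff, more object-store polling
```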
Planned Multi-Node Support
Future versions may support horizontal scaling.
Resource Initialization
Declarative Basin/Stream Creation
Create basins and streams automatically on startup:
Create init spec file
```json
{
  "basins": [
    {
      "name": "production",
      "config": {
        "create_stream_on_append": true,
        "create_stream_on_read": false,
        "default_stream_config": {
          "storage_class": "standard",
          "retention_policy": "7days",
          "timestamping": {
            "mode": "client-prefer",
            "uncapped": false
          },
          "delete_on_empty": {
            "min_age": "1day"
          }
        }
      },
      "streams": [
        {
          "name": "events",
          "config": {
            "retention_policy": "infinite"
          }
        },
        {
          "name": "logs",
          "config": {
            "retention_policy": "3days"
          }
        }
      ]
    }
  ]
}
```
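A malformed spec is easiest to catch before it ever reaches the cluster, so it is worth syntax-checking init.json before creating the ConfigMap; a sketch using python3's json.tool (the here-doc stands in for your full spec; the file name matches the kubectl step below):

```shell
# Syntax-check init.json before loading it into a ConfigMap.
cat > init.json <<'EOF'
{"basins": [{"name": "production", "streams": [{"name": "events"}]}]}
EOF
python3 -m json.tool init.json > /dev/null && echo "init.json is valid JSON"
```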
Create ConfigMap
```shell
kubectl create configmap s2-lite-init \
  --from-file=init.json \
  -n s2-system
```
Mount in Helm values
```yaml
env:
  - name: S2LITE_INIT_FILE
    value: /etc/s2/init.json
volumeMounts:
  - name: init-config
    mountPath: /etc/s2
volumes:
  - name: init-config
    configMap:
      name: s2-lite-init
```
Security Best Practices
Pod Security
The default Helm chart includes security hardening:
```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 65532  # nonroot user
  runAsGroup: 65532
  fsGroup: 65532
  seccompProfile:
    type: RuntimeDefault
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
```
Network Policies
Restrict network access:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: s2-lite-policy
  namespace: s2-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: s2-lite
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: application
      ports:
        - protocol: TCP
          port: 443
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 443  # S3 API
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
      ports:
        - protocol: TCP
          port: 53  # DNS
        - protocol: UDP
          port: 53  # DNS (most lookups use UDP)
```
CORS Configuration
Disable permissive CORS in production:
```yaml
env:
  - name: S2LITE_NO_CORS
    value: "true"
```
Or pass the --no-cors flag.
Upgrading
Upgrade release
```shell
helm upgrade my-s2-lite s2/s2-lite-helm \
  -f values.yaml \
  -n s2-system
```
Verify deployment
```shell
kubectl rollout status deployment/my-s2-lite -n s2-system
kubectl get pods -n s2-system
```
Pinning Versions
Pin to specific chart and app versions in production:
```shell
helm install my-s2-lite s2/s2-lite-helm \
  --version 0.1.8 \
  --set image.tag=0.29.17 \
  -f values.yaml
```
Troubleshooting
Check Logs
```shell
kubectl logs -f deployment/my-s2-lite -n s2-system
```
Common Issues
Pod not scheduling
Check events: kubectl describe pod -l app.kubernetes.io/name=s2-lite -n s2-system
Common causes:
Insufficient resources
Node selector mismatch
Missing service account
Object storage access errors
Check object storage permissions in the previous container's logs: kubectl logs deployment/my-s2-lite -n s2-system --previous
Verify:
Bucket exists
IAM role has correct permissions
Endpoint URL is correct
Startup probe failures
Increase startup time:
```yaml
startupProbe:
  initialDelaySeconds: 30
  failureThreshold: 60
```
Next Steps
Backup & Restore: learn backup strategies for disaster recovery.
S3 Setup: configure different object storage providers.