Security

Syft Space implements multiple layers of security to protect your data and control access to your endpoints.

Authentication

JWT token-based authentication

Local authentication uses JSON Web Tokens:

# Login flow
POST /api/v1/auth/login
{
  "email": "[email protected]",
  "password": "secure-password"
}

# Response includes JWT token
{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "bearer"
}

Token verification:

Tokens are signed with a secret key
Claims include the user email and tenant name
Tokens expire after a configured time period
Tokens are validated on every request
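The verification steps above can be sketched with the standard library alone. This is a minimal illustration, assuming HS256 signing (consistent with the sample token header) and hypothetical claim names `email`, `tenant`, and `exp`; it is not Syft Space's actual implementation, and a maintained JWT library is the better choice in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed token from a claims dictionary."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(_b64url_decode(signature_b64), expected):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current_time = time.time() if now is None else now
    if claims.get("exp", 0) <= current_time:
        raise ValueError("token expired")
    return claims
```

A token without an `exp` claim is treated as expired here, which is a deliberately strict choice for the sketch.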

SyftHub satellite tokens

For marketplace integration:

Authorization: Bearer <satellite-token>

Satellite tokens:

Issued by SyftHub marketplace
Contain user identity and permissions
Validated via public key cryptography
Short-lived with automatic refresh
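Before a satellite token can be validated, the server has to pull it out of the Authorization header. The helper below is a hedged sketch of that first step only; the function name is hypothetical, and the public-key signature check against SyftHub is not shown.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    if not authorization:
        return None
    scheme, _, token = authorization.strip().partition(" ")
    token = token.strip()
    # The scheme name is case-insensitive; the token itself is not.
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

Returning `None` for anything malformed lets the caller respond with a single 401 path instead of distinguishing error cases.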

Authorization

Tenant isolation

All data is scoped to tenants:

class BaseEntity(SQLModel):
    tenant_name: str  # Mandatory for all entities

Middleware enforcement:

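from starlette.middleware.base import BaseHTTPMiddleware

class TenantMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # extract_tenant is the application's own helper (not shown here)
        tenant = extract_tenant(request)
        request.state.tenant = tenant
        return await call_next(request)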

Users can only access data from their own tenant.
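In practice, tenant isolation means every query filters on the tenant resolved for the request. The sketch below illustrates the idea with an in-memory SQLite database; the `endpoints` table, its columns, and the sample rows are assumptions made for the example, not the real schema.

```python
import sqlite3


def list_endpoints(conn: sqlite3.Connection, tenant_name: str) -> list:
    """Return endpoint slugs that belong to the given tenant only."""
    rows = conn.execute(
        # The tenant filter is part of every query, and the value is bound
        # as a parameter rather than interpolated into the SQL string.
        "SELECT slug FROM endpoints WHERE tenant_name = ? ORDER BY slug",
        (tenant_name,),
    )
    return [slug for (slug,) in rows]


# Hypothetical demo data: two tenants sharing one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE endpoints (slug TEXT, tenant_name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO endpoints VALUES (?, ?)",
    [("docs-search", "acme"), ("hr-faq", "acme"), ("pricing", "globex")],
)
```

Calling `list_endpoints(conn, "acme")` returns only the two slugs owned by that tenant, and an unknown tenant yields an empty list.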

Policy-based access control

Endpoints can have access policies:

Allow mode (whitelist):

{
  "policy_type": "access",
  "configuration": {
    "mode": "allow",
    "patterns": [
      "*@company.com",
      "[email protected]"
    ]
  }
}

Deny mode (blacklist):

{
  "policy_type": "access",
  "configuration": {
    "mode": "deny",
    "patterns": [
      "*@competitor.com"
    ]
  }
}
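One way to evaluate these policies is glob-style matching on the caller's email. The sketch below assumes that `*` behaves like a shell wildcard and that matching is case-insensitive; both are assumptions, and `is_allowed` is a hypothetical helper rather than the actual policy engine.

```python
from fnmatch import fnmatchcase


def is_allowed(email: str, mode: str, patterns: list) -> bool:
    """Apply an allow-list or deny-list of glob patterns to an email address."""
    candidate = email.strip().lower()
    matched = any(fnmatchcase(candidate, pattern.lower()) for pattern in patterns)
    if mode == "allow":
        # Allow mode: only callers matching a pattern get through.
        return matched
    if mode == "deny":
        # Deny mode: everyone gets through except matching callers.
        return not matched
    raise ValueError(f"unknown policy mode: {mode!r}")
```

With the allow policy above, `is_allowed("alice@company.com", "allow", ["*@company.com"])` is true, while an address on a lookalike subdomain such as `x@evil.company.com` does not match the pattern.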

Role-based access control

Planned for a future release. Proposed roles:

Owner - Full control
Admin - Manage endpoints and policies
User - Query endpoints only
Viewer - Read-only access

Data protection

Data at rest

SQLite database:

Stored in ~/.syft-space/app.db
File-level encryption supported via OS
Regular backups recommended
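A consistent backup of a live SQLite file can be taken with the standard library's online backup API instead of copying the file directly. This is a minimal sketch; `backup_sqlite` is a hypothetical helper and the destination path is up to you.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(source: Path, destination: Path) -> None:
    """Copy a SQLite database to destination using the online backup API."""
    src = sqlite3.connect(source)
    dst = sqlite3.connect(destination)
    try:
        # Connection.backup copies pages safely even while the source is in use.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

For example, `backup_sqlite(Path.home() / ".syft-space" / "app.db", Path("app-backup.db"))` would snapshot the database mentioned above into the current directory.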

Vector databases:

Data persisted in Docker volumes
Isolated per tenant
Cleanup on dataset deletion

Data in transit

HTTPS/TLS:

server {
    listen 443 ssl;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    
    location / {
        proxy_pass http://localhost:8080;
    }
}

Recommendations:

Use TLS 1.2 or higher
Use strong cipher suites
Enable HSTS headers
Use valid certificates (for example, from Let’s Encrypt)
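The same recommendations apply to clients that call the API. The sketch below shows a Python client context that refuses anything older than TLS 1.2, plus one commonly used HSTS header value; the one-year `max-age` is an assumption, not a documented Syft Space setting.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a certificate-verifying TLS context that requires TLS 1.2 or newer."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# Response header a server or reverse proxy can send to enable HSTS.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000; includeSubDomains")
```

`ssl.create_default_context()` already enables certificate and hostname verification, so the only change here is pinning the minimum protocol version explicitly.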

Secrets management

API keys and tokens:

# Stored encrypted in database
class Model(BaseEntity):
    api_key: str  # Encrypted before storage

Environment variables:

# Never commit .env files
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Recommendations:

Use secret management tools (Vault, AWS Secrets Manager)
Rotate API keys regularly
Use separate keys per environment
Never log sensitive values
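Two of these recommendations are easy to encode in helpers: fail fast when a required secret is missing, and redact values before they reach a log line. Both function names below are hypothetical, and the sketch assumes secrets arrive through environment variables.

```python
import os


def require_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly if it is absent."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value


def redact(secret: str, visible: int = 4) -> str:
    """Return a log-safe form that keeps only a short prefix of the secret."""
    if len(secret) <= visible:
        return "***"
    return secret[:visible] + "***"
```

Logging `redact(require_secret("OPENAI_API_KEY"))` identifies which key is in use without exposing it; the fixed-length mask also avoids revealing the key's length.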

Input validation

Pydantic schemas

All inputs validated:

class CreateEndpointRequest(BaseModel):
    name: str = Field(min_length=1, max_length=255)
    slug: str = Field(pattern=r'^[a-z0-9-]{3,64}$')
    dataset_id: UUID
    model_id: UUID

Helps prevent:

SQL injection (via SQLModel ORM)
XSS attacks (via input sanitization)
Buffer overflows (via length limits)
Invalid data types
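To see what the slug constraint accepts and rejects, the same pattern can be exercised with the standard `re` module. This is only an illustration of the rule; in the application the check is performed by the Pydantic field shown above, and `is_valid_slug` is a hypothetical helper.

```python
import re

# Same character class and length bounds as the slug field above.
SLUG_RE = re.compile(r"[a-z0-9-]{3,64}")


def is_valid_slug(slug: str) -> bool:
    """Return True if slug is 3 to 64 lowercase letters, digits, or hyphens."""
    # fullmatch requires the whole string to match, including any trailing newline.
    return SLUG_RE.fullmatch(slug) is not None
```

Strings containing uppercase letters, spaces, or punctuation such as `;` are rejected before they ever reach a query or a URL.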

Rate limiting

Protect against abuse:

{
  "policy_type": "rate_limit",
  "configuration": {
    "requests_per_period": 100,
    "period_seconds": 3600,
    "scope": "per_user"
  }
}
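A per-user policy like this can be enforced with a fixed-window counter. The class below is a single-process sketch under that assumption; its name and the injectable `clock` parameter are hypothetical, and the real policy engine may use a different algorithm or a shared store.

```python
import time


class FixedWindowLimiter:
    """Allow at most requests_per_period calls per user in each time window."""

    def __init__(self, requests_per_period: int, period_seconds: float,
                 clock=time.monotonic):
        self.limit = requests_per_period
        self.period = period_seconds
        self.clock = clock
        self._windows = {}  # user -> (window index, request count)

    def allow(self, user: str) -> bool:
        window = int(self.clock() // self.period)
        current, count = self._windows.get(user, (window, 0))
        if current != window:
            # A new window has started, so the user's counter resets.
            count = 0
        if count >= self.limit:
            return False
        self._windows[user] = (window, count + 1)
        return True
```

With the configuration above this would be `FixedWindowLimiter(100, 3600)`. Fixed windows are simple but permit short bursts across a window boundary; a sliding window smooths that out at the cost of more bookkeeping.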

Global rate limiting:

# In middleware
rate_limiter = RateLimiter(
    requests_per_minute=1000,
    burst_size=100
)
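The `requests_per_minute` and `burst_size` parameters suggest a token-bucket style limiter. The sketch below shows one way such a limiter could work; it is an assumption about the design, not the actual `RateLimiter` class, and it is not thread-safe as written.

```python
import time


class TokenBucket:
    """Refill at a steady rate and allow short bursts up to burst_size."""

    def __init__(self, requests_per_minute: float, burst_size: int,
                 clock=time.monotonic):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst_size)
        self.tokens = float(burst_size)  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add tokens for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The bucket capacity bounds how many requests can arrive back to back, while the refill rate bounds the long-run average.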

Network security

CORS configuration

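# Also importable as fastapi.middleware.cors.CORSMiddleware
from starlette.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "http://localhost:5173",  # Development
        "https://yourdomain.com"  # Production
    ],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)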

Never use allow_origins=["*"] in production.

Firewall rules

Recommended iptables configuration:

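# Allow loopback traffic and replies to established connections
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# Allow only necessary ports
iptables -A INPUT -p tcp --dport 8080 -j ACCEPT  # API
iptables -A INPUT -p tcp --dport 443 -j ACCEPT   # HTTPS
iptables -A INPUT -p tcp --dport 22 -j ACCEPT    # SSH
iptables -A INPUT -j DROP  # Block all other incoming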

Docker socket security

Mounting /var/run/docker.sock gives the container root-level access to the host. Only do this on trusted systems.

Alternatives:

Docker-in-Docker (DinD)
Rootless Docker
Pre-provisioned databases (no dynamic provisioning)

Audit logging

Track all security-relevant events:

logger.info(
    "endpoint_queried",
    endpoint=endpoint.slug,
    user=user_email,
    ip=request.client.host,
    tokens=usage.total_tokens,
    cost=usage.cost
)

Logged events:

Authentication attempts
Authorization failures
Endpoint queries
Policy violations
Configuration changes
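Audit events are easiest to search and retain when each one is a single structured record. The sketch below emits JSON lines through the standard `logging` module; the helper names and the `audit` logger name are hypothetical and may not match the logging library Syft Space actually uses.

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("audit")


def format_audit_event(event: str, **fields) -> str:
    """Serialize one audit event as a JSON line with a UTC timestamp."""
    record = {
        "event": event,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **fields,
    }
    # default=str keeps non-JSON values (UUIDs, datetimes) from raising.
    return json.dumps(record, sort_keys=True, default=str)


def log_audit_event(event: str, **fields) -> None:
    """Write the event to the audit logger at INFO level."""
    audit_logger.info(format_audit_event(event, **fields))
```

A call such as `log_audit_event("authorization_failure", user="a@example.com", endpoint="docs-search")` covers one of the event types listed above; secrets and raw tokens should never be passed as fields.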

Compliance

GDPR compliance

Right to access - Users can export their data
Right to erasure - Users can delete their account
Data minimization - Only collect necessary data
Purpose limitation - Data used only as specified

Data retention

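from datetime import datetime, timedelta

from sqlalchemy import delete

# Configure retention policies
RETENTION_DAYS = 90

# Periodic cleanup (db and AuditLog are the application's own session and model)
async def cleanup_old_logs():
    cutoff = datetime.now() - timedelta(days=RETENTION_DAYS)
    await db.execute(
        delete(AuditLog).where(AuditLog.created_at < cutoff)
    )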

Security best practices

Production deployment

Use HTTPS with valid certificates
Enable firewall and restrict ports
Use strong passwords and 2FA
Keep software updated
Regular security audits

API keys and secrets

Store in environment variables or secret manager
Rotate regularly (every 90 days)
Use separate keys per environment
Never commit to version control
Revoke compromised keys immediately

Access control

Implement least privilege principle
Use access policies on all endpoints
Enable rate limiting
Monitor for suspicious activity
Regular access reviews

Data protection

Encrypt data in transit (HTTPS/TLS)
Encrypt sensitive data at rest
Regular backups to secure location
Test restore procedures
Secure backup storage

Vulnerability reporting

If you discover a security vulnerability:

Do not create a public GitHub issue

Security issues should be reported privately.

Email security team

Send details to [email protected]

Provide details

Include:

Description of vulnerability
Steps to reproduce
Potential impact
Suggested fix (if any)

Wait for response

The team will acknowledge your report within 48 hours and provide a timeline for a fix.

Responsible disclosure is appreciated. We aim to fix critical vulnerabilities within 30 days.
