Core components
Syft Space is built around four primary concepts that work together to create intelligent, policy-controlled endpoints:
Datasets
Store and search your private data using vector databases like Weaviate or ChromaDB
Models
Connect to AI models (OpenAI, local vLLM) to generate intelligent responses
Endpoints
Combine datasets and models into queryable API endpoints
Policies
Apply rate limiting and access controls to protect your endpoints
How components relate
Each endpoint can connect to:
- One dataset (optional) - for retrieving relevant context
- One model (optional) - for generating responses
- Multiple policies - for access control and rate limiting
At least one of dataset or model must be configured for an endpoint to be valid.
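The rules above can be sketched as a small validation step. This is a minimal illustration, not Syft Space's actual schema: `EndpointConfig` and its field names are hypothetical, and mapping the connected components onto a response type is an assumption based on the response types described in the next section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EndpointConfig:
    """Hypothetical endpoint description; field names are illustrative only."""

    name: str
    dataset_id: Optional[str] = None  # at most one dataset
    model_id: Optional[str] = None    # at most one model
    policy_ids: List[str] = field(default_factory=list)  # any number of policies

    def validate(self) -> None:
        # An endpoint needs a dataset, a model, or both.
        if self.dataset_id is None and self.model_id is None:
            raise ValueError(f"endpoint {self.name!r} needs a dataset or a model")

    def response_type(self) -> str:
        """Return the response type that matches the connected components."""
        self.validate()
        if self.dataset_id is not None and self.model_id is not None:
            return "both"
        return "raw" if self.dataset_id is not None else "summary"
```

For example, `EndpointConfig("search", dataset_id="docs").response_type()` yields `"raw"`, while an endpoint with neither component fails validation with a `ValueError`.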
Response types
Endpoints support three response configurations based on which components are connected:
Raw (dataset only)
Returns search results from the dataset without model processing. Useful for document retrieval systems.
Configuration:
response_type: "raw"
Summary (model only)
Returns generated text from the model without dataset context. Useful for standalone chat interfaces.
Configuration:
response_type: "summary"
Both (dataset + model)
Combines dataset search with model generation - the full RAG pipeline. Search results are injected as context for the model.
Configuration:
response_type: "both"
Multi-tenancy
All components are tenant-isolated:
- Each tenant has their own datasets, models, endpoints, and policies
- Resources cannot be shared across tenants
- All API operations are scoped to the authenticated tenant
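One way to picture tenant scoping is a store where every lookup is keyed by the caller's tenant first. The sketch below is purely illustrative: `TenantScopedStore` is a hypothetical in-memory class, not Syft Space's storage layer, and a real deployment would take the tenant from the authenticated request rather than from an argument.

```python
from collections import defaultdict
from typing import Dict


class TenantScopedStore:
    """Hypothetical in-memory store keyed by tenant (illustration only)."""

    def __init__(self) -> None:
        # tenant_id -> resource_id -> resource
        self._items: Dict[str, Dict[str, dict]] = defaultdict(dict)

    def put(self, tenant_id: str, resource_id: str, resource: dict) -> None:
        self._items[tenant_id][resource_id] = resource

    def get(self, tenant_id: str, resource_id: str) -> dict:
        # Lookups never leave the caller's tenant, so another tenant's
        # resource looks the same as a missing one.
        try:
            return self._items[tenant_id][resource_id]
        except KeyError:
            raise LookupError(f"{resource_id!r} not found for this tenant") from None
```

Because the tenant is part of every key, there is no code path that returns one tenant's dataset, model, endpoint, or policy to another.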
Type system
Each component uses a type registry pattern:
- Dataset types: weaviate, chromadb_local - define how data is stored and searched
- Model types: openai, vllm_local - define how AI models are accessed
- Policy types: rate_limit, accounting_guard - define access control rules
Each type defines:
- Configuration schemas (what fields are required)
- Validation logic (ensuring configurations are valid)
- Runtime behavior (how to execute searches, chats, or policy checks)
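A type registry of this kind is commonly a mapping from type name to a class that bundles schema, validation, and runtime behavior. The following is a generic sketch of that pattern, not Syft Space's implementation: `DatasetType`, `register`, `create_dataset`, and the `collection` field are all hypothetical names, and only the type name `chromadb_local` comes from the list above.

```python
from typing import Dict, Tuple, Type


class DatasetType:
    """Hypothetical base class: schema, validation, and runtime behavior."""

    required_fields: Tuple[str, ...] = ()  # configuration schema

    def __init__(self, config: dict) -> None:
        self.validate(config)
        self.config = config

    @classmethod
    def validate(cls, config: dict) -> None:
        # Validation logic: reject configurations with missing fields.
        missing = [name for name in cls.required_fields if name not in config]
        if missing:
            raise ValueError(f"missing fields: {missing}")

    def search(self, query: str) -> list:
        # Runtime behavior: each concrete type supplies its own search.
        raise NotImplementedError


DATASET_TYPES: Dict[str, Type[DatasetType]] = {}


def register(name: str):
    """Class decorator that adds a dataset type to the registry."""
    def decorator(cls: Type[DatasetType]) -> Type[DatasetType]:
        DATASET_TYPES[name] = cls
        return cls
    return decorator


@register("chromadb_local")
class LocalChromaDataset(DatasetType):
    required_fields = ("collection",)  # assumed field name

    def search(self, query: str) -> list:
        return []  # placeholder; a real type would query ChromaDB here


def create_dataset(type_name: str, config: dict) -> DatasetType:
    """Look up the type by name and build a validated instance."""
    return DATASET_TYPES[type_name](config)
```

Model and policy types would follow the same shape with their own registries, which keeps adding a new backend down to registering one more class.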
Data flow
When a client queries an endpoint:
- Pre-hook policies execute (e.g., rate limit check)
- Dataset search runs (if configured) to find relevant documents
- Model chat executes (if configured) with search results as context
- Post-hook policies execute (e.g., accounting transaction)
- Response returned to client
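The steps above can be sketched as a single request handler. This is a simplified illustration under stated assumptions: `handle_query`, `make_rate_limit`, the hook signatures, and the response keys are hypothetical, and the `PolicyViolationError` class here is a local stand-in for the error the platform raises.

```python
from typing import Callable, List, Optional


class PolicyViolationError(Exception):
    """Local stand-in for the platform's error; the real class may differ."""


def handle_query(
    query: str,
    pre_hooks: List[Callable[[str], None]],
    post_hooks: List[Callable[[str, dict], None]],
    search: Optional[Callable[[str], List[str]]] = None,
    chat: Optional[Callable[[str, List[str]], str]] = None,
) -> dict:
    # 1. Pre-hook policies; any hook may raise PolicyViolationError.
    for hook in pre_hooks:
        hook(query)
    # 2. Dataset search, only if a dataset is configured.
    documents = search(query) if search else []
    response: dict = {}
    if search:
        response["documents"] = documents
    # 3. Model chat, only if a model is configured, with results as context.
    if chat:
        response["answer"] = chat(query, documents)
    # 4. Post-hook policies (for example, recording a transaction).
    for hook in post_hooks:
        hook(query, response)
    # 5. Return the response to the client.
    return response


def make_rate_limit(max_calls: int) -> Callable[[str], None]:
    """Build a toy pre-hook that allows at most max_calls requests."""
    calls = {"count": 0}

    def hook(query: str) -> None:
        calls["count"] += 1
        if calls["count"] > max_calls:
            raise PolicyViolationError("rate limit exceeded")

    return hook
```

Passing only `search` mirrors a raw endpoint, only `chat` a summary endpoint, and both the full RAG pipeline. A hook that raises stops the request before any later stage runs.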
Policy hooks can block requests at any stage by raising a PolicyViolationError.
Next steps
Datasets
Learn about dataset types, provisioners, and data ingestion
Models
Understand model types and chat interfaces