Understanding endpoints
An endpoint connects:
- Dataset (optional) - Provides relevant context through vector search
- Model (optional) - Generates AI responses based on the context
- Policies (optional) - Controls access, rate limits, and costs
Search-only endpoints (raw)
Return matching documents from your dataset without AI generation:
- Best for: Document retrieval, citation lookup, data exploration
- Requires: Dataset only
- Returns: Relevant document chunks with similarity scores
AI-only endpoints (summary)
Generate responses using the model without dataset context:
- Best for: General Q&A, chatbots that rely on the model’s training data
- Requires: Model only
- Returns: AI-generated text responses
RAG endpoints (both)
Combine dataset search with AI generation for contextualized responses:
- Best for: Q&A over your documents, knowledge bases, research assistants
- Requires: Dataset and model
- Returns: AI response with relevant source documents
Creating an endpoint
Slug requirements:
- Must be 3-64 characters
- Lowercase letters, numbers, and hyphens only
- No leading, trailing, or consecutive hyphens
- Must be unique across your Space
Dataset selection:
- Only datasets with “running” status are available
- The dataset must be healthy (test connection first)
Model selection:
- Only models with successful health checks are available
- The model must have valid credentials
You must select at least one component (dataset or model). Endpoints with only one of the two components have limited functionality.
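The slug format rules above (length, allowed characters, hyphen placement) can be checked locally before you submit the form. The following is a minimal sketch, not Syft Space's own validator: the function name `is_valid_slug` is hypothetical, it assumes "lowercase letters" means ASCII `a-z`, and it cannot check uniqueness, which needs a server call.

```python
import re

# Groups of lowercase letters/digits joined by single hyphens. This one
# pattern rules out leading, trailing, and consecutive hyphens.
_SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """Check length (3-64) and character rules; uniqueness is not checked."""
    return 3 <= len(slug) <= 64 and _SLUG_PATTERN.fullmatch(slug) is not None


print(is_valid_slug("support-qa"))   # True
print(is_valid_slug("support--qa"))  # False (consecutive hyphens)
print(is_valid_slug("-support"))     # False (leading hyphen)
```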
Response types:
- raw
  - Requires dataset
  - Returns: List of matching documents with metadata
- summary
  - Requires model
  - Returns: Generated text response
- both
  - Requires both dataset and model
  - Returns: Generated response with source documents
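The component requirements above amount to a small decision rule. As an illustrative sketch (the function name is hypothetical; the type names `raw`, `summary`, and `both` are taken from the section headings earlier on this page), the response types available for a given selection could be derived like this:

```python
def available_response_types(has_dataset: bool, has_model: bool) -> list:
    """Return the response types that the selected components can support."""
    if not (has_dataset or has_model):
        # At least one component (dataset or model) must be selected.
        raise ValueError("select at least one component (dataset or model)")
    types = []
    if has_dataset:
        types.append("raw")      # search-only: needs a dataset
    if has_model:
        types.append("summary")  # AI-only: needs a model
    if has_dataset and has_model:
        types.append("both")     # RAG: needs dataset and model
    return types


print(available_response_types(True, False))  # ['raw']
print(available_response_types(True, True))   # ['raw', 'summary', 'both']
```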
You can toggle the published status later. Use draft mode to test configurations before making the endpoint public.
Endpoint configuration examples
RAG endpoint for research papers
Document search endpoint
AI assistant endpoint
Testing endpoints locally
Before publishing, test your endpoint with these query parameters:
- Similarity threshold - Minimum match score (0.0-1.0)
- Limit - Number of documents to retrieve (1-20)
- Max tokens - Maximum response length
- Temperature - Response randomness (0.0-2.0)
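The ranges listed above can be checked before a test query is sent. This is a rough sketch with a hypothetical helper name. The source gives no bounds for max tokens, so the "at least 1" check below is an assumption.

```python
def validate_test_params(similarity_threshold: float, limit: int,
                         max_tokens: int, temperature: float) -> list:
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    if not 0.0 <= similarity_threshold <= 1.0:
        problems.append("similarity_threshold must be between 0.0 and 1.0")
    if not 1 <= limit <= 20:
        problems.append("limit must be between 1 and 20")
    if max_tokens < 1:  # assumed lower bound; the page states no range
        problems.append("max_tokens must be at least 1")
    if not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    return problems


print(validate_test_params(0.5, 5, 100, 0.7))  # []
```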
Raw response (search-only):
{
  "references": {
    "documents": [
      {
        "document_id": "doc1",
        "content": "Machine learning is...",
        "metadata": {
          "file_name": "intro-to-ml.pdf",
          "page_numbers": "1,2"
        },
        "similarity_score": 0.92
      }
    ],
    "cost": 0.001
  }
}
Summary response (AI-only):
{
  "summary": {
    "model": "gpt-4",
    "message": {
      "role": "assistant",
      "content": "Machine learning is a subset of AI...",
      "tokens": 45
    },
    "usage": {
      "prompt_tokens": 20,
      "completion_tokens": 45,
      "total_tokens": 65
    },
    "cost": 0.0025
  }
}
Combined response (both):
{
  "summary": {
    "model": "gpt-4",
    "message": {
      "role": "assistant",
      "content": "Based on the documents, machine learning is...",
      "tokens": 67
    },
    "usage": {
      "prompt_tokens": 150,
      "completion_tokens": 67,
      "total_tokens": 217
    },
    "cost": 0.0085
  },
  "references": {
    "documents": [
      {
        "document_id": "doc1",
        "content": "Machine learning is...",
        "similarity_score": 0.92
      }
    ],
    "cost": 0.001
  }
}
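All three response shapes above share the same two optional top-level keys, `summary` and `references`, so one pair of helpers can read any of them. The sketch below assumes only the fields shown in the samples; the helper names are hypothetical.

```python
from typing import List, Optional, Tuple


def extract_answer(response: dict) -> Optional[str]:
    """Return the generated text, or None for a search-only response."""
    summary = response.get("summary")
    if summary is None:
        return None
    return summary["message"]["content"]


def extract_sources(response: dict) -> List[Tuple[str, float]]:
    """Return (document_id, similarity_score) pairs; empty for AI-only responses."""
    documents = response.get("references", {}).get("documents", [])
    return [(doc["document_id"], doc["similarity_score"]) for doc in documents]


sample = {
    "summary": {"message": {"role": "assistant", "content": "Based on the documents..."}},
    "references": {"documents": [{"document_id": "doc1", "similarity_score": 0.92}]},
}
print(extract_answer(sample))   # Based on the documents...
print(extract_sources(sample))  # [('doc1', 0.92)]
```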
Querying endpoints via API
Once published, query your endpoint programmatically.
Authentication
All endpoint queries require authentication using a SyftHub satellite token.
Query request
Endpoint: POST /api/v1/endpoints/{slug}/query
Headers:
Body parameters:
- messages - Conversation history (required)
  - Can be a string or array of message objects
  - Each message has role (user/assistant/system) and content
- similarity_threshold - Minimum similarity for matches (0.0-1.0, default: 0.5)
- limit - Maximum documents to return (default: 5)
- include_metadata - Include document metadata (default: true)
- max_tokens - Maximum tokens to generate (default: 100)
- temperature - Response randomness (0.0-2.0, default: 0.7)
- stop_sequences - Strings that stop generation (default: ["\n"])
- stream - Stream response chunks (default: false)
- presence_penalty - Topic repetition penalty (-2.0 to 2.0, default: 0.0)
- frequency_penalty - Word repetition penalty (-2.0 to 2.0, default: 0.0)
- transaction_token - Optional token for accounting (JWT format)
cURL example
Python example
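A minimal sketch using only the Python standard library. The path and parameter names come from this page. The host in `BASE_URL`, the helper names, and the `Authorization: Bearer` header scheme are assumptions and may differ in your deployment.

```python
import json
import urllib.request

BASE_URL = "https://your-syft-space.example.com"  # hypothetical host


def build_query_request(slug, token, messages, **params):
    """Build a POST request for /api/v1/endpoints/{slug}/query."""
    payload = {"messages": messages, **params}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/endpoints/{slug}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query_endpoint(slug, token, messages, **params):
    """Send the query and return the decoded JSON response."""
    request = build_query_request(slug, token, messages, **params)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example call (needs a reachable Space and a valid satellite token):
# result = query_endpoint("support-qa", "<token>", "What is machine learning?",
#                         similarity_threshold=0.5, limit=5)
```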
JavaScript example
Error handling
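Client code can branch on the HTTP status codes described in this section. The sketch below is one possible approach, not a documented client: the helper names and the retry policy (retry only on 429, doubling the wait each time) are assumptions. It expects a `send` callable that raises `urllib.error.HTTPError` on failure, as `urllib.request.urlopen` does.

```python
import time
import urllib.error

# Causes as described in this section, keyed by HTTP status code.
ERROR_CAUSES = {
    400: "Invalid request parameters",
    401: "Missing or invalid authentication token",
    403: "User doesn't have access (blocked by policy)",
    404: "Endpoint doesn't exist or slug is incorrect",
    429: "Too many requests (rate limit policy)",
}


def describe_error(status: int) -> str:
    """Map a status code to the cause listed in this section."""
    return ERROR_CAUSES.get(status, f"Unexpected HTTP status {status}")


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); retry only on 429, doubling the delay after each attempt."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Give up on non-retryable errors or when attempts are exhausted.
            if err.code != 429 or attempt == max_attempts - 1:
                raise RuntimeError(describe_error(err.code)) from err
            sleep(base_delay * 2 ** attempt)
```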
Endpoint queries can fail for several reasons:
404 Not Found
Cause: Endpoint doesn’t exist or slug is incorrect
401 Unauthorized
Cause: Missing or invalid authentication token
403 Permission Denied
Cause: User doesn’t have access (blocked by policy)
400 Bad Request
Cause: Invalid request parameters
429 Rate Limited
Cause: Too many requests (rate limit policy)
Updating endpoints
You can update certain endpoint properties after creation. You cannot change the slug, dataset, model, or response type; to change these, create a new endpoint.
Checking slug availability
Before creating an endpoint, verify the slug is available:
Endpoint: POST /api/v1/endpoints/check-slug
Body:
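As a sketch of how such a request might be built: only the path is taken from this page. The body shape `{"slug": ...}`, the `Authorization: Bearer` header, and the helper name are all assumptions, so check them against your Space's API reference before relying on them.

```python
import json
import urllib.request


def build_check_slug_request(base_url: str, token: str, slug: str) -> urllib.request.Request:
    """Build a POST request for /api/v1/endpoints/check-slug."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/endpoints/check-slug",
        data=json.dumps({"slug": slug}).encode("utf-8"),  # assumed body shape
        headers={
            "Authorization": f"Bearer {token}",  # assumed header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending it (needs a reachable Space; the response format is not shown here):
# with urllib.request.urlopen(build_check_slug_request(url, token, "support-qa")) as r:
#     print(json.load(r))
```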
Deleting endpoints
Deleting an endpoint removes it from Syft Space. It doesn’t delete the associated dataset or model; those components can be reused in other endpoints.
Best practices
Naming conventions
- Use descriptive names - “Customer Support Q&A” not “Endpoint 1”
- Keep slugs short - “support-qa” not “customer-support-questions-and-answers”
- Use consistent tags - Choose a standard set of tags for your organization
Performance optimization
- Tune similarity thresholds
  - Start at 0.5 and adjust based on result quality
  - Higher values (0.7+) = more precise but fewer results
  - Lower values (0.3-0.5) = more results but less relevant
- Limit document count
  - More documents = more context but higher cost
  - Typical range: 3-5 documents for most use cases
  - Increase for complex queries requiring broad context
- Set appropriate max tokens
  - Short answers: 50-100 tokens
  - Detailed explanations: 200-500 tokens
  - Long-form content: 500-1000 tokens
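To see the threshold trade-off described above on your own data, one option is to run a test query with a low threshold and count how many returned documents would survive stricter settings. The helper below is an illustrative sketch. It assumes the threshold is inclusive (a score equal to the threshold counts as a match), and the scores are made-up sample values.

```python
def matches_per_threshold(scores, thresholds=(0.3, 0.5, 0.7)):
    """Count how many similarity scores meet each candidate threshold."""
    return {t: sum(score >= t for score in scores) for t in thresholds}


# Made-up similarity scores from a single low-threshold test query.
scores = [0.92, 0.74, 0.61, 0.48, 0.35]
print(matches_per_threshold(scores))  # {0.3: 5, 0.5: 3, 0.7: 2}
```

If the count at 0.7 is usually zero for realistic questions, the higher threshold is probably too strict for that dataset.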
Security considerations
- Use policies - Always add access control and rate limiting
- Monitor usage - Track query patterns for abuse
- Review responses - Ensure the endpoint doesn’t leak sensitive data
- Test thoroughly - Try adversarial queries before publishing
Next steps
Set policies
Add access control and rate limiting to your endpoints
Publish to SyftHub
Make your endpoint discoverable on the marketplace