Publishers and content creators can use Syft Space to make their content searchable and queryable through AI while maintaining full control over access, attribution, and monetization.

Why Syft Space for publishers

Maintain control

Your content stays on your infrastructure. Users get answers, not raw files.

Enable discovery

Make your content AI-searchable and discoverable through SyftHub.

Monetize access

Set pricing for queries and track usage through built-in accounting.

Preserve attribution

Every query response can include source attribution and links back to original content.

Use cases

News archives

Make decades of journalism AI-searchable for researchers, historians, and the public.

Example workflow:
  • Upload historical archives (PDFs, articles, scans)
  • Create searchable endpoints by topic, time period, or publication
  • Set access policies (free for researchers, paid for commercial use)
  • Users query the archive through natural language
Benefits:
  • Monetize historical content without full republishing
  • Support research without distributing the copyrighted source files
  • Preserve journalistic heritage

Magazine back catalogs

Turn your magazine archives into a knowledge assistant for subscribers.

Example workflow:
  • Index all magazine issues and articles
  • Create subscriber-only query endpoints
  • Users ask questions and get answers citing specific articles
  • Track which topics drive the most queries
Benefits:
  • Add value to existing subscriptions
  • Reduce customer support load by letting the assistant answer common questions
  • Understand reader interests through query analytics

Educational content

Make textbooks and educational materials queryable for students and educators.

Example workflow:
  • Upload textbooks, study guides, and course materials
  • Create grade-level or subject-specific endpoints
  • Students query for explanations, examples, and practice problems
  • Track usage to improve content
Benefits:
  • Create AI tutors from existing content
  • License content to educational institutions
  • Provide better student support

Getting started

1. Install Syft Space

Choose your installation method based on your infrastructure:

Desktop app

Best for small publishers testing the platform

Docker

Recommended for production deployments

Cloud VM

For large-scale operations
See the installation guide for details.

2. Add your content

# Create a dataset
curl -X POST http://localhost:8080/api/v1/datasets/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "magazine-archive",
    "dtype": "local_file",
    "configuration": {
      "httpPort": 8081,
      "grpcPort": 50051,
      "collectionName": "MagazineArchive",
      "ingestionPath": "/path/to/magazine/pdfs"
    }
  }'
Place your PDFs, documents, and articles in the ingestion path. Syft Space automatically indexes them.
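
Before you create the dataset, it can help to check what is actually sitting in the ingestion path. The following is a minimal Python sketch of such a pre-flight check. The helper name `summarize_ingestion_path` is hypothetical and is not part of Syft Space. The sketch only counts files by extension and makes no claim about which formats Syft Space indexes.

```python
from collections import Counter
from pathlib import Path


def summarize_ingestion_path(path: str) -> Counter:
    """Count files under `path` (recursively) by lowercase extension, e.g. '.pdf'."""
    root = Path(path)
    if not root.is_dir():
        raise NotADirectoryError(f"ingestion path does not exist: {path}")
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Replace with the same value you pass as "ingestionPath".
    for extension, count in summarize_ingestion_path("/path/to/magazine/pdfs").most_common():
        print(f"{extension or '(no extension)'}: {count}")
```

An unexpectedly low count, or a large share of files with no extension, is usually worth investigating before indexing starts.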

3. Create queryable endpoints

Combine your content with an AI model:
# First, add a model
curl -X POST http://localhost:8080/api/v1/models/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "gpt-4",
    "dtype": "openai",
    "configuration": {
      "api_key": "sk-your-key",
      "model": "gpt-4-turbo"
    }
  }'

# Create an endpoint (replace <dataset-id> and <model-id> with the IDs of the dataset and model created above)
curl -X POST http://localhost:8080/api/v1/endpoints/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Magazine Archive Query",
    "slug": "magazine-archive",
    "dataset_id": "<dataset-id>",
    "model_id": "<model-id>",
    "response_type": "both"
  }'
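
The endpoint request depends on identifiers from the two earlier calls, so a script has to pass them along. The sketch below shows one way to chain the three requests in Python. It assumes that each create call returns a JSON object with an `id` field, which this page does not confirm. `post` is a hypothetical callable that you supply (for example a thin wrapper around `urllib.request`).

```python
from typing import Callable, Dict

# Hypothetical transport: takes an API path and a JSON-serializable payload
# and returns the decoded JSON response. Supply your own HTTP implementation.
Post = Callable[[str, Dict], Dict]


def create_endpoint_stack(post: Post, api_key: str, ingestion_path: str) -> Dict:
    """Create a dataset and a model, then an endpoint that references both."""
    dataset = post("/api/v1/datasets/", {
        "name": "magazine-archive",
        "dtype": "local_file",
        "configuration": {
            "httpPort": 8081,
            "grpcPort": 50051,
            "collectionName": "MagazineArchive",
            "ingestionPath": ingestion_path,
        },
    })
    model = post("/api/v1/models/", {
        "name": "gpt-4",
        "dtype": "openai",
        "configuration": {"api_key": api_key, "model": "gpt-4-turbo"},
    })
    # ASSUMPTION: both responses expose the new resource's identifier as "id".
    return post("/api/v1/endpoints/", {
        "name": "Magazine Archive Query",
        "slug": "magazine-archive",
        "dataset_id": dataset["id"],
        "model_id": model["id"],
        "response_type": "both",
    })
```

Passing the transport in as an argument keeps the payload logic separate from the HTTP client, so you can dry-run it with a stub before pointing it at a live instance.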

4. Set access policies

Control who can access your content and how:
curl -X POST http://localhost:8080/api/v1/policies/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Standard Rate Limit",
    "dtype": "rate_limit",
    "configuration": {
      "limit": "100/day",
      "scope": "user"
    },
    "endpoint_id": "<endpoint-id>"
  }'

5. Publish to SyftHub

Make your endpoint discoverable:
  1. Register on SyftHub through the UI or API
  2. Publish your endpoint:
    curl -X POST http://localhost:8080/api/v1/endpoints/magazine-archive/publish \
      -H "Content-Type: application/json" \
      -d '{
        "visibility": "public",
        "description": "Query our complete magazine archive from 1980-2025"
      }'
    
  3. Your endpoint is now discoverable at syfthub.openmined.org

Best practices

Content organization

Create separate datasets for different topics or time periods:
  • sports-1990-2000 - Sports coverage from the 90s
  • politics-archive - All political journalism
  • lifestyle-recent - Recent lifestyle content
This allows different pricing and access policies for different content.
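
As a rough illustration, the three datasets above could be described with one payload each, reusing the request shape from the "Add your content" step. In this Python sketch, `ARCHIVE_ROOT` and the one-folder-per-dataset layout are assumptions. The port values are copied from the earlier example, and this page does not say whether several datasets may share them.

```python
from typing import Dict

# ASSUMPTION: hypothetical root directory with one sub-folder per dataset.
ARCHIVE_ROOT = "/path/to/archives"


def collection_name(dataset_name: str) -> str:
    """Turn 'magazine-archive' into 'MagazineArchive', mirroring the earlier example."""
    return "".join(part.capitalize() for part in dataset_name.split("-"))


def dataset_payload(dataset_name: str) -> Dict:
    """Build the JSON body for POST /api/v1/datasets/ for one dataset."""
    return {
        "name": dataset_name,
        "dtype": "local_file",
        "configuration": {
            "httpPort": 8081,    # copied from the earlier example
            "grpcPort": 50051,   # copied from the earlier example
            "collectionName": collection_name(dataset_name),
            "ingestionPath": f"{ARCHIVE_ROOT}/{dataset_name}",
        },
    }


payloads = [
    dataset_payload(name)
    for name in ("sports-1990-2000", "politics-archive", "lifestyle-recent")
]
```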

Add metadata to help AI provide better context:
  • Publication date
  • Author names
  • Article categories/tags
  • Original URL or reference
This metadata can be included in query responses.

Create different endpoints for different audiences:
  • archive-free - Limited free access for discovery
  • archive-premium - Full access for subscribers
  • archive-research - Academic pricing with higher limits
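
One way to express these tiers is a separate rate-limit policy per endpoint, using the payload shape from the "Set access policies" step. The limits in this Python sketch are purely illustrative assumptions, and it presumes the `"<count>/day"` format shown earlier accepts other counts.

```python
from typing import Dict, List

# ASSUMPTION: illustrative limits only; choose values that match your pricing.
TIER_LIMITS: Dict[str, str] = {
    "archive-free": "10/day",
    "archive-premium": "500/day",
    "archive-research": "2000/day",
}


def rate_limit_policies(endpoint_ids: Dict[str, str]) -> List[Dict]:
    """Build one rate_limit policy payload per audience endpoint.

    `endpoint_ids` maps each endpoint slug to the ID of that endpoint.
    """
    policies = []
    for slug, limit in TIER_LIMITS.items():
        policies.append({
            "name": f"{slug} rate limit",
            "dtype": "rate_limit",
            "configuration": {"limit": limit, "scope": "user"},
            "endpoint_id": endpoint_ids[slug],
        })
    return policies
```

Each resulting dictionary can be sent as the body of a `POST /api/v1/policies/` request.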

Monetization strategies

Freemium model

Offer limited free queries to drive discovery, then charge for unlimited access.

Subscription tiers

Different query limits and features for different subscription levels.

Pay-per-query

Charge per query or per token used, tracked automatically.

Institutional licensing

Provide unlimited access to universities, libraries, or companies.

Quality and attribution

1. Test your endpoints

Query your endpoints with common questions to ensure quality responses:
curl -X POST http://localhost:8080/api/v1/endpoints/magazine-archive/query \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What articles covered climate change in 1995?"}
    ]
  }'
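
To run a whole list of questions rather than one, the same request can be scripted. This is a minimal Python sketch using only the standard library. It mirrors the curl call above, and the function names are hypothetical. It returns the raw response text because the response schema is not described here.

```python
import json
import urllib.request
from typing import Dict, List

BASE_URL = "http://localhost:8080/api/v1"  # same host and port as the curl examples


def build_query_request(slug: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to an endpoint's query route."""
    body = json.dumps({"messages": [{"role": "user", "content": question}]})
    return urllib.request.Request(
        f"{BASE_URL}/endpoints/{slug}/query",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_smoke_test(slug: str, questions: List[str]) -> Dict[str, str]:
    """Send each question and return the raw response text keyed by question."""
    results = {}
    for question in questions:
        request = build_query_request(slug, question)
        with urllib.request.urlopen(request, timeout=60) as response:
            results[question] = response.read().decode("utf-8")
    return results
```

Calling `run_smoke_test("magazine-archive", [...])` against a running instance gives you one response per question to review by hand.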

2. Include attribution in responses

Configure your endpoint to return source citations with every answer.

3. Monitor usage analytics

Track which topics get the most queries to understand reader interests and inform future content.
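
If you can export the questions users ask, a simple keyword tally gives a first picture of topic demand. The Python sketch below assumes you already have the questions as plain strings. How Syft Space exposes query logs is not covered here, and the keyword lists are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List

# ASSUMPTION: hypothetical topic keywords; tune them to your own catalog.
TOPIC_KEYWORDS: Dict[str, List[str]] = {
    "sports": ["match", "league", "championship"],
    "politics": ["election", "senate", "policy"],
    "climate": ["climate", "emissions", "warming"],
}


def count_topics(questions: Iterable[str]) -> Counter:
    """Count how many questions mention at least one keyword of each topic."""
    counts: Counter = Counter()
    for question in questions:
        text = question.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            if any(word in text for word in keywords):
                counts[topic] += 1
    return counts
```

Substring matching is crude (a question can count toward several topics), but it is often enough to spot which parts of the archive draw the most interest.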

Example: News publisher

A regional newspaper with 50 years of archives.

Setup:
  • 500,000 articles indexed across 5 decades
  • 3 endpoints: Free (10 queries/day), Basic ($10/month, 500 queries), Pro ($50/month, unlimited)
  • Topics: local news, sports, politics, business, lifestyle
Results:
  • 1,000 free users discovering content
  • 150 paid subscribers ($2,000/month revenue)
  • 20 institutional licenses ($10,000/month)
  • Reduced customer support calls by 40%
  • New revenue stream from historical content
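
The revenue figures above can be combined with a little arithmetic. This short Python sketch reads both dollar amounts as monthly totals (an interpretation, since the example does not say so explicitly) and derives the overall total and per-customer averages.

```python
# Figures taken from the example above; both dollar amounts are read as monthly totals.
paid_subscribers = 150
paid_revenue = 2_000             # USD per month
institutional_licenses = 20
institutional_revenue = 10_000   # USD per month

total_monthly = paid_revenue + institutional_revenue
avg_per_subscriber = paid_revenue / paid_subscribers
avg_per_license = institutional_revenue / institutional_licenses
institutional_share = institutional_revenue / total_monthly

print(f"Total revenue: ${total_monthly:,}/month")
print(f"Average per paid subscriber: ${avg_per_subscriber:.2f}/month")
print(f"Average per institutional license: ${avg_per_license:.2f}/month")
print(f"Institutional share of revenue: {institutional_share:.0%}")
```

Under that reading, institutional licensing accounts for most of the revenue in this example, even though paid subscribers are far more numerous.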

Learn more

Datasets

Learn about managing your content

Endpoints

Create and configure query endpoints

Policies

Set up access control and rate limiting

API reference

Complete API documentation

Need help getting started? Join our community or check the installation guide.
