Publishers and creators

Publishers and content creators can use Syft Space to make their content searchable and queryable through AI while maintaining full control over access, attribution, and monetization.

Why Syft Space for publishers

Maintain control

Your content stays on your infrastructure. Users get answers, not raw files.

Enable discovery

Make your content AI-searchable and discoverable through SyftHub.

Monetize access

Set pricing for queries and track usage through built-in accounting.

Preserve attribution

Every query response can include source attribution and links back to original content.

Use cases

News archives

Make decades of journalism AI-searchable for researchers, historians, and the public.

Example workflow:

Upload historical archives (PDFs, articles, scans)
Create searchable endpoints by topic, time period, or publication
Set access policies (free for researchers, paid for commercial use)
Users query the archive through natural language

Benefits:

Monetize historical content without full republishing
Enable research without handing out the copyrighted source files (users get answers, not raw files)
Preserve journalistic heritage

Magazine back catalogs

Turn your magazine archives into a knowledge assistant for subscribers.

Example workflow:

Index all magazine issues and articles
Create subscriber-only query endpoints
Users ask questions and get answers citing specific articles
Track which topics drive the most queries

Benefits:

Add value to existing subscriptions
Reduce customer support by letting AI answer common questions
Understand reader interests through query analytics

Educational content

Make textbooks and educational materials queryable for students and educators.

Example workflow:

Upload textbooks, study guides, and course materials
Create grade-level or subject-specific endpoints
Students query for explanations, examples, and practice problems
Track usage to improve content

Benefits:

Create AI tutors from existing content
License content to educational institutions
Provide better student support

Getting started

Install Syft Space

Choose your installation method based on your infrastructure:

Desktop app

Best for small publishers testing the platform

Docker

Recommended for production deployments

Cloud VM

For large-scale operations

See the installation guide for details.

Add your content

Upload files
Connect existing database

# Create a dataset
curl -X POST http://localhost:8080/api/v1/datasets/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "magazine-archive",
    "dtype": "local_file",
    "configuration": {
      "httpPort": 8081,
      "grpcPort": 50051,
      "collectionName": "MagazineArchive",
      "ingestionPath": "/path/to/magazine/pdfs"
    }
  }'

Place your PDFs, documents, and articles in the ingestion path. Syft Space automatically indexes them.

If you already have a vector database:

# Register an existing Weaviate collection as a dataset
curl -X POST http://localhost:8080/api/v1/datasets/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "existing-archive",
    "dtype": "weaviate_remote",
    "configuration": {
      "url": "https://your-weaviate-instance.com",
      "apiKey": "your-api-key",
      "collectionName": "YourCollection"
    }
  }'
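The two dataset requests above differ only in their dtype and configuration values. As a minimal Python sketch of the same calls, the helper below builds either request without sending it. The function name dataset_request and the BASE_URL constant are illustrative names, not part of Syft Space; the field names and values are copied from the curl examples.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1"  # same host and port as the curl examples


def dataset_request(name: str, dtype: str, configuration: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a dataset."""
    body = json.dumps({"name": name, "dtype": dtype, "configuration": configuration})
    return urllib.request.Request(
        f"{BASE_URL}/datasets/",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Local files: Syft Space indexes whatever is placed in the ingestion path.
local = dataset_request(
    "magazine-archive",
    "local_file",
    {
        "httpPort": 8081,
        "grpcPort": 50051,
        "collectionName": "MagazineArchive",
        "ingestionPath": "/path/to/magazine/pdfs",
    },
)

# Existing vector database: point at the remote collection instead.
remote = dataset_request(
    "existing-archive",
    "weaviate_remote",
    {
        "url": "https://your-weaviate-instance.com",
        "apiKey": "your-api-key",
        "collectionName": "YourCollection",
    },
)

print(local.get_method(), local.full_url)
```

Sending is left to the caller, for example with urllib.request.urlopen(local) once a Syft Space instance is running.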

Create queryable endpoints

Combine your content with an AI model:

# First, add a model
curl -X POST http://localhost:8080/api/v1/models/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "gpt-4",
    "dtype": "openai",
    "configuration": {
      "api_key": "sk-your-key",
      "model": "gpt-4-turbo"
    }
  }'

# Create an endpoint
curl -X POST http://localhost:8080/api/v1/endpoints/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Magazine Archive Query",
    "slug": "magazine-archive",
    "dataset_id": "<dataset-id>",
    "model_id": "<model-id>",
    "response_type": "both"
  }'
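The endpoint request needs the IDs of the dataset and model created earlier. The sketch below shows one way to fill the two placeholders, under the assumption that each creation response is a JSON object with an "id" key; the source does not show the response format, so check the API reference for the actual field name. The response bodies here are made up for illustration.

```python
import json


def endpoint_payload(dataset_resp: dict, model_resp: dict) -> dict:
    """Fill the <dataset-id> and <model-id> placeholders from earlier responses.

    ASSUMPTION: each creation response is a JSON object with an "id" key.
    """
    return {
        "name": "Magazine Archive Query",
        "slug": "magazine-archive",
        "dataset_id": dataset_resp["id"],
        "model_id": model_resp["id"],
        "response_type": "both",
    }


# Made-up response bodies, for illustration only.
dataset_resp = json.loads('{"id": "ds-123", "name": "magazine-archive"}')
model_resp = json.loads('{"id": "m-456", "name": "gpt-4"}')

print(json.dumps(endpoint_payload(dataset_resp, model_resp), indent=2))
```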

Set access policies

Control who can access your content and how:

Rate limiting
Access control
Usage tracking

# Rate limiting: 100 per day, per user
curl -X POST http://localhost:8080/api/v1/policies/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Standard Rate Limit",
    "dtype": "rate_limit",
    "configuration": {
      "limit": "100/day",
      "scope": "user"
    },
    "endpoint_id": "<endpoint-id>"
  }'

# Access control: only allowlisted users can query
curl -X POST http://localhost:8080/api/v1/policies/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Subscriber Only",
    "dtype": "access",
    "configuration": {
      "allowlist": ["subscriber@example.com"],
      "mode": "allowlist"
    },
    "endpoint_id": "<endpoint-id>"
  }'

# Usage tracking: record tokens and cost
curl -X POST http://localhost:8080/api/v1/policies/ \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Track Usage",
    "dtype": "accounting",
    "configuration": {
      "track_tokens": true,
      "track_cost": true
    },
    "endpoint_id": "<endpoint-id>"
  }'
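All three policies attach to the same endpoint, so it can help to generate their payloads together. The following is a sketch only: policies_for is a hypothetical helper, the email address is a placeholder, and the field names are taken from the curl examples above.

```python
import json


def policies_for(endpoint_id: str, allowlist: list[str], limit: str = "100/day") -> list[dict]:
    """Return the rate-limit, access, and accounting payloads for one endpoint."""
    return [
        {
            "name": "Standard Rate Limit",
            "dtype": "rate_limit",
            "configuration": {"limit": limit, "scope": "user"},
            "endpoint_id": endpoint_id,
        },
        {
            "name": "Subscriber Only",
            "dtype": "access",
            "configuration": {"allowlist": allowlist, "mode": "allowlist"},
            "endpoint_id": endpoint_id,
        },
        {
            "name": "Track Usage",
            "dtype": "accounting",
            "configuration": {"track_tokens": True, "track_cost": True},
            "endpoint_id": endpoint_id,
        },
    ]


# "<endpoint-id>" stays a placeholder until the endpoint has been created.
for policy in policies_for("<endpoint-id>", ["subscriber@example.com"]):
    print(json.dumps(policy))
```

Each payload would then be sent as its own POST to /api/v1/policies/, as in the curl commands.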

Publish to SyftHub

Make your endpoint discoverable by publishing it:

curl -X POST http://localhost:8080/api/v1/endpoints/magazine-archive/publish \
  -H "Content-Type: application/json" \
  -d '{
    "visibility": "public",
    "description": "Query our complete magazine archive from 1980-2025"
  }'

Your endpoint is now discoverable at syfthub.openmined.org.

Best practices

Content organization

Organize by topic or time period

Create separate datasets for different topics or time periods:

sports-1990-2000 - Sports coverage from the 90s
politics-archive - All political journalism
lifestyle-recent - Recent lifestyle content

This allows different pricing and access policies for different content.

Include metadata in documents

Add metadata to help AI provide better context:

Publication date
Author names
Article categories/tags
Original URL or reference

This metadata can be included in query responses.

Set up multiple endpoints

Create different endpoints for different audiences:

archive-free - Limited free access for discovery
archive-premium - Full access for subscribers
archive-research - Academic pricing with higher limits

Monetization strategies

Freemium model

Offer limited free queries to drive discovery, then charge for unlimited access.

Subscription tiers

Different query limits and features for different subscription levels.

Pay-per-query

Charge per query or per token used, tracked automatically.

Institutional licensing

Provide unlimited access to universities, libraries, or companies.
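To make the pay-per-query option concrete, here is a small worked calculation. The prices (2 cents per query, 1 cent per 1,000 tokens) and the usage numbers are invented for illustration and are not Syft Space defaults; monthly_charge is a hypothetical helper, not a built-in accounting function.

```python
from decimal import Decimal


def monthly_charge(
    queries: int,
    tokens: int,
    per_query: Decimal,
    per_1k_tokens: Decimal,
) -> Decimal:
    """Combine a per-query fee and a per-token fee into one amount in dollars."""
    query_fee = queries * per_query
    token_fee = Decimal(tokens) / 1000 * per_1k_tokens
    # Round to whole cents.
    return (query_fee + token_fee).quantize(Decimal("0.01"))


# Hypothetical usage for one customer in one month.
charge = monthly_charge(
    queries=500,
    tokens=250_000,
    per_query=Decimal("0.02"),
    per_1k_tokens=Decimal("0.01"),
)
print(charge)  # 12.50
```

Decimal is used instead of float so that amounts stay exact to the cent.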

Quality and attribution

Test your endpoints

Query your endpoints with common questions to ensure quality responses:

# Send a test question to the endpoint
curl -X POST http://localhost:8080/api/v1/endpoints/magazine-archive/query \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What articles covered climate change in 1995?"}
    ]
  }'
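When testing with more than one question, the same request can be generated from a list. This sketch builds the requests without sending them; the second question and the names query_request and QUERY_URL are illustrative, while the URL and message format come from the curl example above.

```python
import json
import urllib.request

QUERY_URL = "http://localhost:8080/api/v1/endpoints/magazine-archive/query"


def query_request(question: str) -> urllib.request.Request:
    """Build (but do not send) a query request for a single user question."""
    body = json.dumps({"messages": [{"role": "user", "content": question}]})
    return urllib.request.Request(
        QUERY_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


questions = [
    "What articles covered climate change in 1995?",
    "Which issues discussed local elections?",  # illustrative extra question
]
batch = [query_request(q) for q in questions]

for req in batch:
    print(json.loads(req.data)["messages"][0]["content"])
```

Each request can be passed to urllib.request.urlopen against a running instance, and the answers reviewed by hand for quality and citations.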

Include attribution in responses

Configure your endpoint to return source citations with every answer.

Monitor usage analytics

Track which topics get the most queries to understand reader interests and inform future content.

Example: News publisher

A regional newspaper with 50 years of archives.

Setup:

500,000 articles indexed across 5 decades
3 endpoints: Free (10 queries/day), Basic ($10/month, 500 queries), Pro ($50/month, unlimited)
Topics: local news, sports, politics, business, lifestyle

Results:

1,000 free users discovering content
150 paid subscribers ($2,000/month revenue)
20 institutional licenses ($10,000/month)
Reduced customer support calls by 40%
New revenue stream from historical content
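As a rough check on the subscription figure, the sketch below computes monthly revenue for a given number of Pro subscribers. It assumes every one of the 150 paid subscribers is on either Basic ($10/month) or Pro ($50/month); the example does not state the split, so the Pro counts tried here are guesses.

```python
BASIC_PRICE = 10   # dollars per month, from the setup above
PRO_PRICE = 50     # dollars per month, from the setup above
PAID_SUBSCRIBERS = 150


def subscription_revenue(pro_subscribers: int) -> int:
    """Monthly revenue if the remaining paid subscribers are all on Basic."""
    basic_subscribers = PAID_SUBSCRIBERS - pro_subscribers
    return basic_subscribers * BASIC_PRICE + pro_subscribers * PRO_PRICE


for pro in (12, 13):
    print(pro, subscription_revenue(pro))
# 12 1980
# 13 2020
```

Under that assumption, the quoted $2,000/month corresponds to roughly 12 or 13 Pro subscribers, with the rest on Basic.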

Learn more

Datasets

Learn about managing your content

Endpoints

Create and configure query endpoints

Policies

Set up access control and rate limiting

API reference

Complete API documentation

Need help getting started? Join our community or check the installation guide.
