Query endpoint

curl --request POST \
  --url http://localhost:8080/api/v1/endpoints/legal-qa/query \
  --header 'Authorization: Bearer SYFT_HUB_TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ],
    "similarity_threshold": 0.5,
    "limit": 5,
    "max_tokens": 100,
    "temperature": 0.7
  }'

{
  "summary": {
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "model": "gpt-4",
    "message": {
      "role": "assistant",
      "content": "The capital of France is Paris.",
      "tokens": 8
    },
    "finish_reason": "stop",
    "usage": {
      "prompt_tokens": 10,
      "completion_tokens": 8,
      "total_tokens": 18
    },
    "cost": 0.0025,
    "provider_info": {
      "api_version": "v1",
      "response_time_ms": 150
    }
  },
  "references": {
    "documents": [
      {
        "document_id": "doc1",
        "content": "Paris is the capital of France.",
        "metadata": {
          "source": "wikipedia"
        },
        "similarity_score": 0.95
      }
    ],
    "provider_info": {
      "search_engine": "weaviate",
      "response_time_ms": 50
    },
    "cost": 0.001
  }
}

POST /api/v1/endpoints/{slug}/query

Query an endpoint to get responses from your RAG system. This is the core endpoint that orchestrates dataset search, model chat, and policy enforcement.

This is a public endpoint that requires a SyftHub satellite token for authentication. The user’s identity is automatically extracted from the token.

Authentication

Send a valid SyftHub satellite token as a Bearer token in the Authorization header, as in the example request (Authorization: Bearer SYFT_HUB_TOKEN). The user’s email is automatically extracted and verified from this token.
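As a rough illustration of the authentication requirement, the following Python sketch builds the same request as the curl example using only the standard library. The helper name `build_query_request` is hypothetical, and the base URL and token are the placeholder values from the example, not real credentials.

```python
import json
import urllib.parse
import urllib.request


def build_query_request(base_url: str, slug: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the query route.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Quote the slug so unexpected characters cannot alter the path.
    safe_slug = urllib.parse.quote(slug, safe="")
    url = f"{base_url.rstrip('/')}/api/v1/endpoints/{safe_slug}/query"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # The satellite token is sent as a Bearer token, as in the curl example.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Sending the request requires a running server, for example:
# req = build_query_request("http://localhost:8080", "legal-qa", "SYFT_HUB_TOKEN",
#                           {"messages": "What is the capital of France?"})
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```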

Path parameters

slug (string, required)
The unique slug of the endpoint to query.

Request body

messages (string | array, required)
Either a simple string query or an array of chat messages. For chat format, each message should have role (user/assistant/system) and content fields.
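The two accepted shapes of `messages` can be sketched on the client side as follows. This is only an illustration: the helper `as_chat_messages` is hypothetical, and treating a bare string as a single user message is an assumption, since the text above does not say how the server interprets the string form.

```python
ALLOWED_ROLES = {"user", "assistant", "system"}


def as_chat_messages(messages):
    """Return messages in chat format, validating role and content.

    Assumption: a bare string is equivalent to one message with role "user".
    """
    if isinstance(messages, str):
        return [{"role": "user", "content": messages}]

    normalized = []
    for index, message in enumerate(messages):
        role = message.get("role")
        content = message.get("content")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"message {index}: unsupported role {role!r}")
        if not isinstance(content, str):
            raise ValueError(f"message {index}: content must be a string")
        normalized.append({"role": role, "content": content})
    return normalized
```

Validating before sending keeps malformed chat arrays from reaching the server, whatever its own validation does.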

similarity_threshold (number, default: 0.5)
Minimum similarity score (0.0-1.0) for dataset search results. Higher values return only more relevant documents.

limit (number, default: 5)
Maximum number of documents to return from dataset search.
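To make the interaction between `similarity_threshold` and `limit` concrete, here is a minimal Python sketch of the filtering they describe. It is not the server's implementation: the function `select_documents` is hypothetical, and both the inclusive comparison and the descending sort order are assumptions.

```python
def select_documents(scored_docs, similarity_threshold=0.5, limit=5):
    """Keep documents at or above the threshold, best first, capped at `limit`.

    Assumptions: the threshold comparison is inclusive and results are
    ordered by descending similarity_score.
    """
    kept = [d for d in scored_docs if d["similarity_score"] >= similarity_threshold]
    kept.sort(key=lambda d: d["similarity_score"], reverse=True)
    return kept[:limit]
```

Under these assumptions, raising the threshold can return fewer than `limit` documents, and possibly none.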

include_metadata (boolean, default: true)
Whether to include document metadata in the response.

max_tokens (number, default: 100)
Maximum number of tokens to generate in the model response.

temperature (number, default: 0.7)
Sampling temperature for model generation (0.0-2.0). Higher values make output more random.

stop_sequences (array, default: ["\n"])
Array of strings that will stop generation when encountered.
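The effect of stop sequences can be pictured with a small Python sketch that cuts text at the earliest stop string. It is an illustration only: `truncate_at_stop` is a hypothetical helper, and it assumes the stop sequence itself is excluded from the output, which the description above does not state.

```python
def truncate_at_stop(text, stop_sequences=("\n",)):
    """Return `text` up to the first occurrence of any stop sequence.

    Assumption: the matched stop sequence is not included in the result.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty string would match at position 0
        position = text.find(stop)
        if position != -1:
            cut = min(cut, position)
    return text[:cut]
```

With the documented default of a single newline, this model of the behavior ends a generated answer at its first line break, so multi-line answers may need a different `stop_sequences` value.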

stream (boolean, default: false)
Whether to stream the response (for real-time generation).

presence_penalty (number, default: 0.0)
Penalty for tokens based on whether they appear in the text (-2.0 to 2.0).

frequency_penalty (number, default: 0.0)
Penalty for tokens based on their frequency in the text (-2.0 to 2.0).
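A client can check the numeric ranges documented for similarity_threshold, temperature, and the two penalties before sending a request. The sketch below is one hypothetical way to do that; it assumes the bounds are inclusive, and it says nothing about how the server itself treats out-of-range values.

```python
# Documented ranges; inclusive bounds are an assumption.
RANGES = {
    "similarity_threshold": (0.0, 1.0),
    "temperature": (0.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def out_of_range(payload):
    """Return a list of human-readable problems; empty if all values fit."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in payload and not (low <= payload[name] <= high):
            problems.append(f"{name}={payload[name]} is outside [{low}, {high}]")
    return problems
```

Parameters that are absent from the payload are skipped, since the documented defaults apply to them.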

extras (object, default: {})
Additional configuration options for advanced use cases. Can include reference_options and summarize_options.

transaction_token (string, default: null)
Optional transaction token for accounting and billing purposes.

Response

summary (object)
Generated response from the model (only present if a model is configured).

summary.id (string, required)
Unique identifier for the response.

summary.model (string, required)
Name of the model used for generation.

summary.message (object, required)
The generated message.

summary.message.role (string, required)
Role of the message sender (typically “assistant”).

summary.message.content (string, required)
The generated text content.

summary.message.tokens (number, required)
Number of tokens in the message.

summary.finish_reason (string, required)
Reason for completion (e.g., “stop”, “length”).

summary.usage (object, required)
Token usage information.

summary.usage.prompt_tokens (number, required)
Number of tokens in the prompt.

summary.usage.completion_tokens (number, required)
Number of tokens in the completion.

summary.usage.total_tokens (number, required)
Total number of tokens used.

summary.cost (number, required)
Cost of the generation in USD.

summary.provider_info (object, required)
Provider-specific metadata.

summary.provider_info.api_version (string)
API version used by the provider.

summary.provider_info.response_time_ms (number)
Response time in milliseconds.

summary.logprobs (object)
Log probabilities for tokens (if requested).

references (object)
Reference documents from dataset search (only present if a dataset is configured).

references.documents (array, required)
Array of matching documents.

references.documents[].document_id (string, required)
Unique identifier for the document.

references.documents[].content (string, required)
The document content/text.

references.documents[].metadata (object, required)
Document metadata (e.g., source, author, date).

references.documents[].similarity_score (number, required)
Similarity score between query and document (0.0-1.0).

references.provider_info (object, required)
Search provider information.

references.provider_info.search_engine (string)
Name of the search engine used (e.g., “weaviate”).

references.provider_info.response_time_ms (number)
Search response time in milliseconds.

references.cost (number, required)
Cost of the search operation in USD.
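Because summary and references are each optional, client code should not assume either is present. The following Python sketch shows one way to read a decoded response defensively. The helper `summarize_response` is hypothetical, and adding the two cost fields together assumes they are independent charges, which the field descriptions do not confirm.

```python
def summarize_response(response: dict) -> dict:
    """Extract the answer, source document IDs, and combined cost.

    Assumption: summary.cost and references.cost are separate charges
    that can be added to get a total.
    """
    summary = response.get("summary")        # absent if no model is configured
    references = response.get("references")  # absent if no dataset is configured

    answer = summary["message"]["content"] if summary else None
    documents = references["documents"] if references else []
    total_cost = (summary["cost"] if summary else 0.0) + (
        references["cost"] if references else 0.0
    )
    return {
        "answer": answer,
        "sources": [doc["document_id"] for doc in documents],
        "total_cost_usd": total_cost,
    }
```

A response with only references yields an answer of None and the list of matched document IDs.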

This endpoint enforces all policies attached to it. Queries may be rejected based on rate limits, access controls, or other policy rules.
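Since a query can be rejected by policy, callers should be prepared for non-success responses. The sketch below handles them in Python with the standard library. The status codes shown (401, 403, 429) are the generic HTTP meanings and are assumptions here: this page does not specify which codes or error bodies the API returns, and both function names are hypothetical.

```python
import json
import urllib.error
import urllib.request


def describe_rejection(status: int) -> str:
    """Map an HTTP status to a short explanation.

    Assumption: the API uses standard HTTP status semantics.
    """
    if status == 401:
        return "authentication failed: check the satellite token"
    if status == 403:
        return "access denied: an access-control policy may have rejected the query"
    if status == 429:
        return "rate limited: retry later"
    return f"request failed with HTTP {status}"


def run_query(request: urllib.request.Request) -> dict:
    """Send a prepared request and return the decoded JSON body."""
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        # Surface the rejection reason instead of a bare HTTP error.
        raise RuntimeError(describe_rejection(err.code)) from err
```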
