Overview
The `/v1/chat/completions` endpoint provides an OpenAI-compatible API with CheckThat AI’s claim normalization and evaluation features. It is designed as a drop-in replacement for OpenAI’s API with additional capabilities.
Endpoint
`POST /v1/chat/completions`
Authentication
Use Bearer token authentication with your LLM provider’s API key.

Standard OpenAI Parameters
All standard OpenAI parameters are supported:

`messages` (array, required)
Array of message objects comprising the conversation. Each message must have:
- `role` (string): `system`, `user`, or `assistant`
- `content` (string): Message content

`model` (string, required)
The LLM model to use. Supports all CheckThat AI models:
- OpenAI: `gpt-4o`, `gpt-5-2025-08-07`, `o3-2025-04-16`, etc.
- Anthropic: `claude-sonnet-4-20250514`, `claude-opus-4-1-20250805`
- Google: `gemini-2.5-pro`, `gemini-2.5-flash`
- xAI: `grok-3`, `grok-4-0709`, `grok-3-mini`
- Together AI: `meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`

`stream` (boolean)
Whether to stream the response. When `true`, responses are sent as Server-Sent Events (SSE).

`temperature` (number)
Sampling temperature (0.0 to 2.0). Higher values make output more random.

`max_tokens` (integer)
Maximum number of tokens to generate.

`max_completion_tokens` (integer)
Alternative to `max_tokens` for specifying maximum completion length.

`top_p` (number)
Nucleus sampling parameter (0.0 to 1.0).

`frequency_penalty` (number)
Penalty for token frequency (-2.0 to 2.0).

`presence_penalty` (number)
Penalty for token presence (-2.0 to 2.0).

`stop` (string or array)
Stop sequences where the API will stop generating.

`n` (integer)
Number of completions to generate (1 to 128).

`logprobs` (boolean)
Whether to return log probabilities.

`top_logprobs` (integer)
Number of most likely tokens to return (0 to 20).

`reasoning_effort` (string)
For reasoning models (o3, o4-mini): `low`, `medium`, or `high`.

`response_format` (object)
Specifies the output format. Supports structured output for compatible models.

`tools` (array)
List of tools the model can call.

`tool_choice` (string or object)
Controls which tool is called.
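As an illustration of the parameters above, a request body tuned for deterministic, short output might look like this (a sketch; the values are examples only):

```python
# Sketch of a request body using only standard OpenAI parameters.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the claim in one sentence."},
    ],
    "temperature": 0.2,   # low randomness for factual tasks
    "max_tokens": 256,    # cap completion length
    "top_p": 1.0,
    "n": 1,               # a single completion
    "stop": ["\n\n"],     # stop at the first blank line
}
```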
CheckThat AI Custom Parameters
These parameters enable CheckThat AI’s advanced features:

`refine_claims` (boolean)
Enable automatic claim refinement through iterative evaluation. When enabled, CheckThat AI will:
- Generate an initial response
- Evaluate claim quality
- Iteratively improve the claim until the threshold is met

Refinement model
Model to use for claim refinement. Can be different from the main model. Example: use `gpt-4o` for refinement even if using `gpt-3.5-turbo` for generation.

`threshold` (number)
Quality threshold (0.0 to 1.0) for claim refinement. Claims scoring below this threshold will be refined.

`max_iters` (integer)
Maximum refinement iterations before stopping.

Evaluation metrics
DeepEval metrics to use for claim evaluation. Can be a metric name or a custom metric instance.

Refinement API key
API key for the refinement model (if different from the main API key).
Response Format
Non-Streaming Response
Standard OpenAI `ChatCompletion` object with CheckThat AI extensions:
Response Fields
- `id`: Unique identifier for the completion
- `object`: Object type (`chat.completion`)
- `created`: Unix timestamp of creation
- `model`: Model used for generation
- `choices`: Array of completion choices
- `usage`: Token usage information
- Post-normalization evaluation results (CheckThat AI extension)
- Claim refinement metadata (CheckThat AI extension)
- Additional CheckThat AI metadata
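A minimal sketch of reading the documented fields from a parsed response. The values below are hypothetical; the CheckThat AI extension fields arrive as additional top-level keys alongside the standard ones:

```python
# Sketch: reading documented fields from a parsed (hypothetical) response.
response = {
    "id": "chatcmpl-abc123",            # illustrative values only
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "The claim is supported."},
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21},
}

answer = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
```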
Streaming Response
Streaming responses are delivered in Server-Sent Events (SSE) format.

Example Requests
Basic Request
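A basic non-streaming request can be made with only the standard library. The base URL below is an assumption for illustration; substitute your actual CheckThat AI host:

```python
# Basic request sketch using only the Python standard library.
import json
import urllib.request

def chat(api_key: str, prompt: str, model: str = "gpt-4o") -> str:
    """Send one non-streaming chat completion request and return the text."""
    req = urllib.request.Request(
        "https://api.checkthat.ai/v1/chat/completions",  # assumed base URL
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```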
With Streaming
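A streaming request reads SSE lines and extracts each chunk’s delta. The base URL is again an assumption; the `data: [DONE]` sentinel follows the standard OpenAI streaming convention:

```python
# Streaming sketch: read Server-Sent Events line by line.
import json
import urllib.request

def parse_sse_line(line: str):
    """Return the delta text from one SSE data line, or None."""
    if not line.startswith("data: "):
        return None                    # comments / keep-alives / blank lines
    data = line[len("data: "):].strip()
    if data == "[DONE]":               # standard end-of-stream sentinel
        return None
    chunk = json.loads(data)
    return chunk["choices"][0]["delta"].get("content")

def stream_chat(api_key: str, prompt: str) -> None:
    req = urllib.request.Request(
        "https://api.checkthat.ai/v1/chat/completions",  # assumed base URL
        data=json.dumps({
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,            # request SSE output
        }).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            text = parse_sse_line(raw.decode())
            if text:
                print(text, end="", flush=True)
```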
With Claim Refinement
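A sketch of a refinement request body. `refine_claims`, `threshold`, and `max_iters` are the names used elsewhere in this document; the refinement-model parameter name is not stated here, so it is omitted:

```python
# Sketch of a claim-refinement request body.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Normalize this claim: ..."}],
    "refine_claims": True,  # enable iterative refinement
    "threshold": 0.7,       # refine claims scoring below 0.7
    "max_iters": 3,         # at most three refinement passes
}
```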
With System Prompt
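A system prompt is simply the first entry in the `messages` array:

```python
# Sketch: steer model behavior with a leading `system` message.
messages = [
    {"role": "system",
     "content": "You are a fact-checking assistant. Answer tersely."},
    {"role": "user",
     "content": "Is the Eiffel Tower taller than 300 meters?"},
]
payload = {"model": "gpt-4o", "messages": messages}
```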
Python with OpenAI SDK
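Because the endpoint is OpenAI-compatible, the official Python SDK can be pointed at it via `base_url` (the URL below is an assumption for illustration):

```python
# Sketch: use the official OpenAI SDK against CheckThat AI.
def ask(api_key: str, prompt: str) -> str:
    # Imported lazily; requires `pip install openai`.
    from openai import OpenAI

    client = OpenAI(api_key=api_key,
                    base_url="https://api.checkthat.ai/v1")  # assumed URL
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```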
Python with Streaming
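Streaming with the SDK is the standard `stream=True` pattern; each chunk exposes its text in `choices[0].delta.content`:

```python
# Sketch: streaming with the OpenAI SDK.
def stream_ask(api_key: str, prompt: str) -> str:
    from openai import OpenAI  # requires `pip install openai`

    client = OpenAI(api_key=api_key,
                    base_url="https://api.checkthat.ai/v1")  # assumed URL
    parts = []
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        stream=True,                      # yields chunks as they arrive
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```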
Python with Claim Refinement
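CheckThat AI’s custom parameters are not part of the SDK’s typed signature, so they can be passed through `extra_body` (a standard openai-python option); the base URL is again assumed:

```python
# Sketch: pass CheckThat AI custom parameters via the SDK's `extra_body`.
def refined_ask(api_key: str, prompt: str) -> str:
    from openai import OpenAI  # requires `pip install openai`

    client = OpenAI(api_key=api_key,
                    base_url="https://api.checkthat.ai/v1")  # assumed URL
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        extra_body={                      # documented CheckThat AI parameters
            "refine_claims": True,
            "threshold": 0.7,
            "max_iters": 3,
        },
    )
    return resp.choices[0].message.content
```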
JavaScript/TypeScript
JavaScript with Streaming
Error Responses
Missing Authorization
Status Code: `401 Unauthorized`

Invalid API Key
Status Code: `403 Forbidden`

Validation Error
Status Code: `422 Unprocessable Entity`

Bad Request
Status Code: `400 Bad Request`

Server Error
Status Code: `500 Internal Server Error`
Claim Refinement Process
When `refine_claims: true` is set, CheckThat AI follows this process:
- Initial Generation - Generate response using specified model
- Quality Evaluation - Evaluate claim using DeepEval metrics
- Refinement Loop - If score < threshold:
- Generate feedback on claim quality
- Refine claim based on feedback
- Re-evaluate refined claim
- Repeat until threshold met or max iterations reached
- Return Enhanced Response - Include refinement metadata
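The loop above can be sketched abstractly, with `evaluate` and `refine` standing in for DeepEval scoring and the feedback-driven rewrite (a sketch, not the actual implementation):

```python
# Abstract sketch of the refinement loop described above.
from typing import Callable

def refine_claim(claim: str,
                 evaluate: Callable[[str], float],
                 refine: Callable[[str], str],
                 threshold: float = 0.7,
                 max_iters: int = 3) -> tuple[str, float, int]:
    """Refine `claim` until its score meets `threshold` or iterations run out."""
    score = evaluate(claim)
    iters = 0
    while score < threshold and iters < max_iters:
        claim = refine(claim)     # rewrite based on evaluator feedback
        score = evaluate(claim)   # re-evaluate the refined claim
        iters += 1
    return claim, score, iters    # refinement metadata for the response
```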
Refinement Metrics
Supported evaluation metrics (via DeepEval):
- `AnswerRelevancyMetric` - Measures answer relevance to the query
- `FaithfulnessMetric` - Checks factual consistency
- `ContextualPrecisionMetric` - Evaluates precision in context
- `ContextualRecallMetric` - Measures information recall
- `ContextualRelevancyMetric` - Assesses contextual relevance
- Custom metrics via DeepEval
Refinement Example
Rate Limiting
Same rate limits as other endpoints:
- 10 requests per 60 seconds per IP address
- Rate limit headers included in all responses
- Streaming requests count as a single request
Best Practices
Use Appropriate Models
Choose models based on your needs:
- gpt-4o: High-quality general purpose
- o3/o4-mini: Advanced reasoning tasks
- claude-sonnet-4: Nuanced analysis
- llama-3.3-70b: Free tier option
Enable Streaming for UX
Always use `stream: true` for user-facing applications to provide immediate feedback and a better user experience.

Set Appropriate Thresholds
When using claim refinement:
- Start with `threshold: 0.7` for most cases
- Use `threshold: 0.8` or higher for high-accuracy requirements
- Set `max_iters` to 3-5 to balance quality and cost
Handle Errors Gracefully
Implement retry logic with exponential backoff for transient errors. Log errors for debugging.
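A minimal sketch of retry with exponential backoff; delays double per attempt, and `max_attempts` and `base_delay` should be tuned for your workload:

```python
# Sketch of retry with exponential backoff for transient errors.
import time

def with_retries(call, max_attempts: int = 4, base_delay: float = 0.5):
    """Invoke `call()`, retrying on exception with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise                                # out of attempts
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

In production, retry only on transient statuses (429, 5xx) and log each failure for debugging.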
Monitor Token Usage
Track the `usage` field in responses to monitor costs and optimize prompts. Refinement increases token usage.

Implementation Details
The endpoint uses a service-layer architecture for a clean separation of concerns:

Service Layer (`api/services/chat/completions.py`)

LLM Router (`api/_utils/LLMRouter.py`)
Automatically selects the correct client based on model:
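A hypothetical sketch of that selection, keyed on the model families listed earlier; the actual `LLMRouter` implementation may differ:

```python
# Hypothetical sketch of model-prefix routing; the real LLMRouter may differ.
def select_provider(model: str) -> str:
    """Map a model name to its provider, based on the documented families."""
    if model.startswith(("gpt-", "o3", "o4")):
        return "openai"
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith("gemini-"):
        return "google"
    if model.startswith("grok-"):
        return "xai"
    if "/" in model:                  # e.g. meta-llama/Llama-3.3-...
        return "together"
    raise ValueError(f"unknown model: {model}")
```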
Comparison with Standard OpenAI API
| Feature | OpenAI API | CheckThat AI |
|---|---|---|
| Chat completions | ✅ | ✅ |
| Streaming | ✅ | ✅ |
| Function calling | ✅ | ✅ |
| Multi-provider support | ❌ | ✅ |
| Claim refinement | ❌ | ✅ |
| Evaluation metrics | ❌ | ✅ |
| Refinement metadata | ❌ | ✅ |
| Drop-in compatible | N/A | ✅ |
Related Endpoints
Chat Endpoint
Simplified streaming chat interface
Models
List all available models
Authentication
Authentication methods and setup
Batch Processing
Process multiple claims efficiently