
Overview

Perplexity provides AI models with built-in web search capabilities, delivering up-to-date information and citations. Access Perplexity's Sonar models through Portkey for search-augmented AI responses.

Base URL: https://api.perplexity.ai

Supported Features

  • ✅ Chat Completions
  • ✅ Streaming
  • ✅ Web Search Integration
  • ✅ Citations
  • ✅ Real-time Information
  • ❌ Embeddings
  • ❌ Function Calling
  • ❌ Vision
  • ❌ Image Generation

Quick Start

from portkey_ai import Portkey

client = Portkey(
    provider="perplexity-ai",
    Authorization="***"  # Your Perplexity API key
)

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {"role": "user", "content": "What are the latest developments in AI in 2024?"}
    ]
)

print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "Latest news about SpaceX"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Available Models

Sonar Models (Online)

These models have access to the web and provide citations:
| Model | Context | Description | Best For |
|---|---|---|---|
| sonar-pro | 127K | Most capable with search | Complex research, analysis |
| sonar | 127K | Fast with search | Quick lookups, Q&A |

Legacy Sonar Models (Online)

Older Llama-based Sonar models that also have web access, superseded by the sonar family above:

| Model | Context | Description | Best For |
|---|---|---|---|
| llama-3.1-sonar-huge-128k-online | 127K | Largest online model | Complex queries |
| llama-3.1-sonar-large-128k-online | 127K | Balanced online | General purpose |
| llama-3.1-sonar-small-128k-online | 127K | Fast online | Quick responses |

Perplexity excels at:
  • Real-time information - Access current web data
  • Fact-checking - Verify information with citations
  • Research - Comprehensive web search
  • Current events - Up-to-date news and developments
  • Citation tracking - Source attribution
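
Perplexity's chat responses include a top-level `citations` array of source URLs alongside the completion (it sits outside the standard OpenAI response schema). A minimal sketch of pairing those URLs with the model's inline [n] markers, using a hypothetical `extract_citations` helper and a hand-built response dict for illustration:

```python
# Sketch: Perplexity returns a top-level `citations` list of source URLs.
# This helper numbers them to match the [n] markers the model emits inline.

def extract_citations(response_dict):
    """Return (index, url) pairs from a Perplexity-style response dict."""
    urls = response_dict.get("citations") or []
    return [(i, url) for i, url in enumerate(urls, start=1)]

# Hand-built response in the documented shape (not a live API call):
resp = {
    "choices": [{"message": {"content": "Fusion output set a new record [1]."}}],
    "citations": ["https://example.com/fusion-news"],
}
for i, url in extract_citations(resp):
    print(f"[{i}] {url}")  # [1] https://example.com/fusion-news
```

With the SDK, the same field may or may not surface as an attribute on the response object depending on the SDK version, so accessing it defensively (e.g. via `getattr(response, "citations", None)`) is safer than assuming it exists.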

Configuration Options

client = Portkey(
    provider="perplexity-ai",
    Authorization="***"  # Bearer token
)
| Header | Description | Required |
|---|---|---|
| Authorization | Perplexity API key | Yes |

Advanced Features

System Messages

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful research assistant. Always cite your sources."
        },
        {
            "role": "user",
            "content": "What are the benefits of renewable energy?"
        }
    ]
)

Temperature Control

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "Summarize recent AI breakthroughs"}],
    temperature=0.2,  # Lower for factual responses
    max_tokens=1000
)

Multi-turn Conversations

messages = [
    {"role": "user", "content": "What is quantum computing?"},
]

response = client.chat.completions.create(
    model="sonar-pro",
    messages=messages
)

# Add assistant response
messages.append({
    "role": "assistant",
    "content": response.choices[0].message.content
})

# Continue conversation
messages.append({
    "role": "user",
    "content": "What are its practical applications?"
})

response = client.chat.completions.create(
    model="sonar-pro",
    messages=messages
)

Use Cases

Research & Analysis

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[{
        "role": "user",
        "content": "Compare the environmental impact of electric vs hydrogen vehicles with recent data"
    }]
)

Current Events

response = client.chat.completions.create(
    model="sonar",
    messages=[{
        "role": "user",
        "content": "What happened in the tech industry this week?"
    }]
)

Fact Verification

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[{
        "role": "user",
        "content": "Verify: Is the global temperature rising? Provide recent data."
    }]
)

Fallback Configuration

Fall back to OpenAI's GPT-4o if the Perplexity request fails:
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "perplexity-ai",
            "api_key": "***",
            "override_params": {"model": "sonar-pro"}
        },
        {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4o"}
        }
    ]
}

client = Portkey().with_options(config=config)

Conditional Routing

Route search queries to Perplexity and everything else to GPT-4o:
config = {
    "strategy": {"mode": "conditional"},
    "conditions": [
        {
            "query": {"metadata.needs_search": True},
            "then": "perplexity_target"
        },
        {
            "query": {"metadata.needs_search": False},
            "then": "openai_target"
        }
    ],
    "targets": {
        "perplexity_target": {
            "provider": "perplexity-ai",
            "api_key": "***",
            "override_params": {"model": "sonar-pro"}
        },
        "openai_target": {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4o"}
        }
    }
}

client = Portkey().with_options(config=config)
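
The gateway evaluates each condition against the request's metadata and routes to the first match, so callers must attach a `needs_search` value with every request. A local sketch of that selection logic (a hypothetical `pick_target` helper, for illustration only; the real evaluation happens inside the Portkey gateway):

```python
# Sketch: mimic conditional routing locally. Each condition's query maps
# "metadata.<key>" to an expected value; the first fully matching condition
# wins and its "then" names the target.

def pick_target(config, metadata):
    for cond in config["conditions"]:
        if all(metadata.get(key.split(".", 1)[1]) == expected
               for key, expected in cond["query"].items()):
            return cond["then"]
    return None  # no condition matched

config = {
    "conditions": [
        {"query": {"metadata.needs_search": True}, "then": "perplexity_target"},
        {"query": {"metadata.needs_search": False}, "then": "openai_target"},
    ]
}

print(pick_target(config, {"needs_search": True}))   # perplexity_target
print(pick_target(config, {"needs_search": False}))  # openai_target
```

In practice you would pass the metadata on the request itself (for example via the SDK's per-request metadata option); check your Portkey SDK version for the exact mechanism.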

Error Handling

from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="sonar-pro",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError as e:
    print(f"Rate limit: {e}")
except AuthenticationError as e:
    print(f"Invalid API key: {e}")
except APIError as e:
    print(f"API error: {e}")

Best Practices

  1. Use for current information - Leverage web search capabilities
  2. Lower temperature - For factual accuracy (0.0-0.3)
  3. Clear queries - Specific questions get better results
  4. Request citations - Ask model to cite sources
  5. Use sonar-pro - For important research
  6. Use sonar - For quick lookups
  7. Verify information - Always validate critical facts
  8. Implement caching - Cache responses to reduce costs
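
For point 8, Portkey's gateway can cache responses through the config object. A minimal sketch assuming simple (exact-match) cache mode with a `max_age` in seconds; field names here follow Portkey's config schema, but check the current caching docs before relying on them:

```python
from portkey_ai import Portkey

# Sketch: cache identical Perplexity requests at the gateway for an hour,
# so repeated lookups don't trigger a fresh (billed) web search.
config = {
    "cache": {"mode": "simple", "max_age": 3600},  # exact-match cache, 1h TTL
    "provider": "perplexity-ai",
    "api_key": "***",
    "override_params": {"model": "sonar"},
}

client = Portkey().with_options(config=config)
```

Exact-match caching suits repeated factual lookups; for paraphrased queries, Portkey also offers a semantic cache mode.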

Limitations

  • No function calling support (yet)
  • No vision capabilities
  • No embeddings
  • Search results depend on web availability
  • May have slightly higher latency due to search

When to Use Perplexity

Use Perplexity when:
  • You need current, real-time information
  • Research requires web search
  • Citations and sources are important
  • Fact-checking is critical
  • Questions involve recent events
Use other providers when:
  • Information is static or historical
  • You need vision or function calling
  • Lower latency is critical
  • You’re doing creative writing

Related

  • Perplexity Pricing - View detailed pricing for Perplexity models
  • Conditional Routing - Route based on search needs
  • Fallbacks - Fallback configurations
  • Caching - Cache search results
  • Google Gemini - Alternative provider with built-in search
