Integrate Portkey with LangChain to access 250+ LLMs while leveraging LangChain’s powerful abstractions and Portkey’s production-grade routing capabilities.

Overview

Portkey brings production readiness to LangChain applications:
  • Connect to 250+ models through a unified API
  • View 42+ metrics & logs for all requests
  • Enable semantic cache to reduce latency & costs
  • Implement automatic retries & fallbacks
  • Add custom tags for better tracking and analysis

Installation

pip install portkey-ai langchain-openai

Quick Start

Since Portkey is fully compatible with the OpenAI signature, you can connect through LangChain’s ChatOpenAI interface.
Step 1: Get Your API Keys

Sign up at Portkey and get your API key. Add your LLM provider API key as a Virtual Key in Portkey.
Step 2: Configure ChatOpenAI

Set the base_url to Portkey’s gateway and add Portkey headers:
from langchain_openai import ChatOpenAI
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    provider="openai"
)

llm = ChatOpenAI(
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)
Step 3: Use LangChain Normally

response = llm.invoke("What is the meaning of life?")
print(response.content)

Switching Providers

One of Portkey’s key benefits is easy provider switching. For example, to move from OpenAI to Anthropic, update only the provider header, the API key, and the model name:
portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    provider="anthropic"
)

llm = ChatOpenAI(
    model="claude-3-opus-20240229",
    api_key="your-anthropic-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)
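
After switching, the rest of your LangChain code is unchanged; the same call now runs against the newly configured provider (a sketch, assuming the headers and `llm` defined above):

```python
# The calling code does not change when the provider does
response = llm.invoke("What is the meaning of life?")
print(response.content)
```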

Advanced Routing

Use Portkey’s gateway configs for load balancing, fallbacks, and retries.

Load Balancing

Distribute traffic between multiple models or providers:
config = {
    "strategy": {
        "mode": "loadbalance"
    },
    "targets": [
        {
            "virtual_key": "openai-virtual-key",
            "override_params": {"model": "gpt-3.5-turbo"},
            "weight": 0.5
        },
        {
            "virtual_key": "together-virtual-key",
            "override_params": {"model": "meta-llama/Llama-3-8b-chat-hf"},
            "weight": 0.5
        }
    ]
}

portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    config=config
)

llm = ChatOpenAI(
    api_key="X",  # Not used when config has virtual keys
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

Fallback Strategy

Automatically fall back to another provider on failures:
config = {
    "strategy": {
        "mode": "fallback"
    },
    "targets": [
        {"virtual_key": "openai-virtual-key"},
        {"virtual_key": "anthropic-virtual-key"}
    ]
}

portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    config=config
)
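
The headers are then attached to the client just as in the load-balancing example; if the first target fails, Portkey transparently retries the request against the next one (a sketch, the virtual key names above are placeholders):

```python
llm = ChatOpenAI(
    api_key="X",  # Not used when the config routes through virtual keys
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

# If the OpenAI target errors, the request is retried on the Anthropic target
response = llm.invoke("Explain fallback routing in one sentence.")
```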

Automatic Retries

Retry failed requests automatically on transient HTTP errors:
config = {
    "retry": {
        "attempts": 5,
        "on_status_codes": [429, 500, 502, 503]
    }
}
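
Retry settings can also be combined with other routing options in a single gateway config; a sketch (the attempt count, status codes, and virtual key names are illustrative):

```python
# One gateway config can combine retries with fallback targets:
# each request is retried up to 5 times before falling through
# to the next target in the list.
config = {
    "retry": {
        "attempts": 5,
        "on_status_codes": [429, 500, 502, 503]
    },
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "openai-virtual-key"},
        {"virtual_key": "anthropic-virtual-key"}
    ]
}
```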

LangChain Chains and Agents

Portkey works seamlessly with LangChain chains and agents:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

# Configure LLM with Portkey
portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    provider="openai"
)

llm = ChatOpenAI(
    model="gpt-4",
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

# Create a chain
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?"
)

chain = LLMChain(llm=llm, prompt=prompt)
result = chain.run("eco-friendly water bottles")
print(result)

Adding Metadata and Tracing

Enhance observability with metadata and custom traces:
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    provider="openai",
    metadata={
        "user_id": "user_123",
        "environment": "production",
        "session_id": "session_456"
    },
    trace_id="custom-trace-id"
)

llm = ChatOpenAI(
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

Caching

Enable semantic caching to reduce costs and latency:
config = {
    "cache": {
        "mode": "semantic",
        "max_age": 3600  # Cache for 1 hour
    }
}

portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    provider="openai",
    config=config
)
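
With the cache config in the headers, repeated or semantically similar prompts can be served from cache instead of the upstream provider (a sketch using the headers defined above; the prompts are illustrative):

```python
llm = ChatOpenAI(
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

llm.invoke("What is your refund policy?")   # Hits the provider
llm.invoke("How do I get my money back?")   # May be answered from the semantic cache
```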

Streaming

Portkey supports streaming responses:
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = ChatOpenAI(
    model="gpt-4",
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

response = llm.invoke("Tell me a story")
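
Alternatively, you can iterate over chunks directly with LangChain's `.stream()` method instead of using a callback handler (a sketch, assuming the `llm` configured above):

```python
# Stream tokens as they arrive, without a callback handler
for chunk in llm.stream("Tell me a story"):
    print(chunk.content, end="", flush=True)
```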

Monitoring and Analytics

All requests through Portkey are automatically logged. View detailed analytics in the Portkey dashboard:
  • Request/response logs
  • Token usage and costs
  • Latency metrics
  • Error rates
  • Custom metadata filters

Best Practices

  • Store your provider API keys as Virtual Keys in Portkey for better security and key rotation.
  • Always configure fallback providers for production applications to handle outages.
  • Use semantic caching for FAQ and support use cases to reduce costs by up to 50%.
  • Tag requests with user IDs, session IDs, and environment info for better debugging.

Example: Complete RAG Application

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

# Configure Portkey headers
portkey_headers = createHeaders(
    api_key="your-portkey-api-key",
    provider="openai",
    metadata={"application": "rag-demo"}
)

# Load and split documents
loader = TextLoader("data.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# Create embeddings with Portkey
embeddings = OpenAIEmbeddings(
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

# Create vector store
vectorstore = Chroma.from_documents(texts, embeddings)

# Create LLM with Portkey
llm = ChatOpenAI(
    model="gpt-4",
    api_key="your-openai-api-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=portkey_headers
)

# Create QA chain
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# Query
result = qa.run("What is the main topic of the document?")
print(result)

Resources

Questions? Join our Discord community or reach out to support.
