Overview
The OpenAI provider integrates OpenAI’s GPT models and embedding models with LlamaIndex.TS.
Installation
npm install @llamaindex/openai
OpenAI LLM
Basic Usage
import { OpenAI } from "@llamaindex/openai";
const llm = new OpenAI({
  model: "gpt-4o",
  temperature: 0.1,
  maxTokens: 512
});
const response = await llm.chat({
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is LlamaIndex?" }
  ]
});
console.log(response.message.content);
Constructor Options
- model: OpenAI model name. Supports GPT-4, GPT-3.5, O1, and more
- temperature: Sampling temperature (0-2). Higher values produce more random output
- topP: Nucleus sampling parameter
- maxTokens: Maximum number of tokens in the response
- apiKey: OpenAI API key (defaults to the OPENAI_API_KEY environment variable)
- baseURL: Custom API base URL (defaults to the OPENAI_BASE_URL environment variable)
- maxRetries: Maximum number of retries for failed requests
- timeout: Request timeout in milliseconds
- reasoningEffort ('low' | 'medium' | 'high' | 'minimal'): Reasoning effort for O1 models
- additionalChatOptions: Additional OpenAI API parameters
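As a rough illustration of how explicit options and environment-variable defaults might fit together, the sketch below resolves an options object in plain TypeScript. The `resolveOptions` helper and the `OpenAIOptionsSketch` type are hypothetical names used only for this example; they are not part of `@llamaindex/openai`, which performs its own resolution internally. The fallback URL is the one shown in the Environment Variables section.

```typescript
// Hypothetical shape covering a subset of the constructor options listed above.
interface OpenAIOptionsSketch {
  model: string;
  temperature?: number;
  maxTokens?: number;
  apiKey?: string;
  baseURL?: string;
}

// Explicit options win; otherwise fall back to environment variables.
function resolveOptions(
  options: OpenAIOptionsSketch,
  env: Record<string, string | undefined>,
): OpenAIOptionsSketch {
  const temperature = options.temperature;
  if (temperature !== undefined && (temperature < 0 || temperature > 2)) {
    throw new RangeError("temperature must be between 0 and 2");
  }
  return {
    ...options,
    apiKey: options.apiKey ?? env["OPENAI_API_KEY"],
    baseURL:
      options.baseURL ?? env["OPENAI_BASE_URL"] ?? "https://api.openai.com/v1",
  };
}

const resolved = resolveOptions(
  { model: "gpt-4o", temperature: 0.1, maxTokens: 512 },
  { OPENAI_API_KEY: "sk-example" }, // stand-in for process.env
);
console.log(resolved.baseURL); // https://api.openai.com/v1
```

Passing the environment as a parameter keeps the helper easy to test; in an application you would typically pass `process.env`.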
Supported Models
- GPT-4 Series: gpt-4o, gpt-4-turbo, gpt-4
- GPT-3.5: gpt-3.5-turbo
- O1 Series: o1-preview, o1-mini (reasoning models)
Streaming
const stream = await llm.chat({
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}
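If you also need the complete text once streaming finishes, one option is to accumulate the deltas while forwarding them. The sketch below assumes only that each chunk exposes a string `delta`, as in the loop above. `collectStream`, `ChunkSketch`, and `fakeStream` are hypothetical names for illustration; the fake stream stands in for the result of `llm.chat({ stream: true })`.

```typescript
// Minimal chunk shape: only the `delta` field used in the streaming loop.
interface ChunkSketch {
  delta: string;
}

// Consume a stream, forwarding each delta and returning the full text.
async function collectStream(
  stream: AsyncIterable<ChunkSketch>,
  onDelta: (delta: string) => void = () => {},
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    onDelta(chunk.delta);
    text += chunk.delta;
  }
  return text;
}

// Stand-in stream so the example runs without network access.
async function* fakeStream(): AsyncGenerator<ChunkSketch> {
  yield { delta: "Once " };
  yield { delta: "upon " };
  yield { delta: "a time" };
}

collectStream(fakeStream()).then((text) => console.log(text)); // Once upon a time
```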
Function Calling
import { tool } from "@llamaindex/core/tools";
import { z } from "zod";
const weatherTool = tool({
  name: "get_weather",
  description: "Get current weather for a location",
  parameters: z.object({
    location: z.string().describe("City name")
  }),
  execute: async ({ location }) => {
    return `Weather in ${location}: 72°F and sunny`;
  }
});
const response = await llm.chat({
  messages: [{ role: "user", content: "What's the weather in San Francisco?" }],
  tools: [weatherTool]
});
Structured Output
import { z } from "zod";
const schema = z.object({
  name: z.string(),
  age: z.number(),
  email: z.string().email()
});
const result = await llm.exec({
  messages: [{ role: "user", content: "Extract: John is 30, email john@example.com" }],
  responseFormat: schema
});
console.log(result.object); // { name: "John", age: 30, email: "john@example.com" }
Image Input
const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url: { url: "data:image/jpeg;base64,/9j/4AAQSkZJRg..." }
        }
      ]
    }
  ]
});
PDF Support
const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this PDF" },
        {
          type: "file",
          data: pdfBase64Data,
          mimeType: "application/pdf"
        }
      ]
    }
  ]
});
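The `pdfBase64Data` value in the snippet above is not defined there. One way to produce it, assuming you already have the file contents as bytes (for example from `fs.readFile` in Node.js), is sketched below. `toBase64` and `toDataUrl` are hypothetical helpers for this example, not library functions. In Node.js, `Buffer.from(bytes).toString("base64")` gives the same result as `toBase64`.

```typescript
// Encode raw bytes as a base64 string using the global btoa function.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build a data URL of the form used for inline images.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  return `data:${mimeType};base64,${toBase64(bytes)}`;
}

// "%PDF" is the four-byte signature at the start of a PDF file.
const pdfSignature = new Uint8Array([0x25, 0x50, 0x44, 0x46]);
const sampleBase64 = toBase64(pdfSignature);
console.log(sampleBase64); // JVBERg==
```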
Audio/Video (Realtime API)
Use OpenAI Live for realtime audio and video sessions:
const llm = new OpenAI({
  model: "gpt-4o-realtime-preview",
  voiceName: "alloy"
});
const liveSession = llm.live;
await liveSession.connect({
  audioConfig: {
    stream: mediaStream,
    onTrack: (track) => {
      console.log("Audio track received", track);
    }
  }
});
OpenAI Embedding
Basic Usage
import { OpenAIEmbedding } from "@llamaindex/openai";
const embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small"
});
const embedding = await embedModel.getTextEmbedding(
  "LlamaIndex is a data framework"
);
console.log(embedding.length); // 1536
Constructor Options
- model (string, default: "text-embedding-3-small"): Embedding model name
- dimensions: Output dimensions (text-embedding-3 models only)
- embedBatchSize: Batch size for embedding requests
Supported Models
- text-embedding-3-small: 1536 dimensions by default; can be reduced (for example, to 512)
- text-embedding-3-large: 3072 dimensions by default; can be reduced (for example, to 256)
- text-embedding-ada-002: 1536 dimensions (legacy)
Batch Embedding
const texts = [
  "First document",
  "Second document",
  "Third document"
];
const embeddings = await embedModel.getTextEmbeddingsBatch(texts, {
  logProgress: true
});
console.log(embeddings.length); // 3
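Conceptually, batching groups the input texts so that each request carries no more than the configured batch size. The sketch below shows that grouping step in isolation. `splitIntoBatches` is a hypothetical helper written for this example; the library does its own batching, so this is only meant to make the idea concrete.

```typescript
// Split a list into consecutive groups of at most `batchSize` items.
function splitIntoBatches<T>(items: readonly T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const sampleBatches = splitIntoBatches(
  ["First document", "Second document", "Third document"],
  2,
);
console.log(sampleBatches.length); // 2
```

With three texts and a batch size of 2, the first batch holds two texts and the second holds one, so two requests would be needed instead of three.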
Custom Dimensions
const embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small",
  dimensions: 512 // Reduce dimensions for lower storage costs
});
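To get a feel for the storage saving, the arithmetic can be sketched as follows. This assumes each vector component is stored as a 4-byte float32 and ignores index and metadata overhead, which vary by vector store; `vectorStorageBytes` is a hypothetical helper for this example.

```typescript
// Assumption: each component is stored as a 32-bit float (4 bytes).
const BYTES_PER_FLOAT32 = 4;

// Raw storage needed for `vectorCount` embeddings of the given size.
function vectorStorageBytes(vectorCount: number, dimensions: number): number {
  return vectorCount * dimensions * BYTES_PER_FLOAT32;
}

const fullSize = vectorStorageBytes(1_000_000, 1536); // 6,144,000,000 bytes
const reducedSize = vectorStorageBytes(1_000_000, 512); // 2,048,000,000 bytes
console.log(fullSize / reducedSize); // 3
```

Under these assumptions, one million 512-dimension vectors take about a third of the raw space of the 1536-dimension default.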
Configuration
Environment Variables
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1 # Optional
Global Settings
import { Settings } from "llamaindex";
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";
Settings.llm = new OpenAI({ model: "gpt-4o" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });
Custom Base URL
const llm = new OpenAI({
  baseURL: "https://custom-proxy.com/v1",
  apiKey: "your-key"
});
O1 Reasoning Models
const llm = new OpenAI({
  model: "o1-preview",
  reasoningEffort: "high" // or "medium", "low"
});
const response = await llm.chat({
  messages: [{ role: "user", content: "Solve this complex problem..." }]
});
Note: O1 models don’t support temperature or streaming.
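If the model name is chosen at runtime, one way to respect this restriction is to drop the unsupported settings before building the request. The sketch below is illustrative only: `ChatSettingsSketch`, `isO1Model`, and `sanitizeForModel` are hypothetical names, and the prefix check assumes O1-series names start with "o1", as `o1-preview` and `o1-mini` do.

```typescript
// Hypothetical subset of settings an application might pass around.
interface ChatSettingsSketch {
  model: string;
  temperature?: number;
  stream?: boolean;
}

// Assumption: O1-series model names begin with "o1".
function isO1Model(model: string): boolean {
  return model.startsWith("o1");
}

// Return a copy without temperature and stream when targeting an O1 model.
function sanitizeForModel(settings: ChatSettingsSketch): ChatSettingsSketch {
  if (!isO1Model(settings.model)) {
    return settings;
  }
  const copy: ChatSettingsSketch = { ...settings };
  delete copy.temperature;
  delete copy.stream;
  return copy;
}

const o1Settings = sanitizeForModel({
  model: "o1-preview",
  temperature: 0.7,
  stream: true,
});
console.log(o1Settings); // { model: 'o1-preview' }
```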
Error Handling
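try {
  // `messages` is an array of chat messages, as in Basic Usage
  const response = await llm.chat({ messages });
  console.log(response.message.content);
} catch (error) {
  // In strict TypeScript `error` is `unknown`, so narrow it before reading fields
  const status = (error as { status?: number } | null)?.status;
  if (status === 429) {
    console.error("Rate limit exceeded");
  } else if (status === 401) {
    console.error("Invalid API key");
  } else {
    console.error("Error:", error instanceof Error ? error.message : error);
  }
}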
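The status checks above can be factored into small helpers that are easy to test. This is a minimal sketch: `getStatus`, `classifyError`, and `backoffDelayMs` are hypothetical names, and the backoff numbers are arbitrary examples. Since the constructor already accepts a maximum number of retries, application-level retries like this are only worth adding if you need extra control.

```typescript
type ErrorKind = "rate_limit" | "auth" | "other";

// Safely read a numeric `status` field from an unknown thrown value.
function getStatus(error: unknown): number | undefined {
  if (typeof error === "object" && error !== null && "status" in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Map HTTP status codes to the cases handled in the example above.
function classifyError(error: unknown): ErrorKind {
  const status = getStatus(error);
  if (status === 429) return "rate_limit";
  if (status === 401) return "auth";
  return "other";
}

// Exponential backoff: baseMs, 2x, 4x, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

console.log(classifyError({ status: 429 })); // rate_limit
console.log(backoffDelayMs(3)); // 8000
```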
Best Practices
- Set appropriate temperature: Lower (0-0.3) for factual tasks, higher (0.7-1.0) for creative tasks
- Use function calling: Better than prompt engineering for structured outputs
- Stream long responses: Improves user experience
- Batch embeddings: More efficient than individual calls
- Monitor costs: Track token usage with response.raw.usage
- Use the right model: GPT-4 for complex tasks, GPT-3.5 for simple ones
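For the cost-monitoring point, a usage object can be turned into a rough cost estimate. The sketch below assumes the usage object follows the OpenAI Chat Completions shape (`prompt_tokens`, `completion_tokens`). The prices are placeholders, not real rates, so substitute current pricing for your model. `estimateCostUsd` and both interfaces are hypothetical names for this example.

```typescript
// Token counts as reported in an OpenAI chat completion usage object.
interface UsageSketch {
  prompt_tokens: number;
  completion_tokens: number;
}

// Prices in USD per one million tokens (placeholder values below).
interface PriceSketch {
  inputPerMillion: number;
  outputPerMillion: number;
}

function estimateCostUsd(usage: UsageSketch, price: PriceSketch): number {
  return (
    (usage.prompt_tokens * price.inputPerMillion +
      usage.completion_tokens * price.outputPerMillion) /
    1_000_000
  );
}

// Placeholder prices for illustration only.
const placeholderPrice: PriceSketch = { inputPerMillion: 2, outputPerMillion: 8 };
const estimatedCost = estimateCostUsd(
  { prompt_tokens: 1000, completion_tokens: 500 },
  placeholderPrice,
);
console.log(estimatedCost.toFixed(4)); // 0.0060
```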