
Overview

LlamaIndex.TS supports a wide range of LLM and embedding providers through dedicated packages. Each provider implements the common BaseLLM or BaseEmbedding interface, allowing you to switch providers with minimal code changes.

Provider Packages

All providers are published as separate npm packages following the pattern @llamaindex/<provider-name>:
npm install @llamaindex/openai
npm install @llamaindex/anthropic
npm install @llamaindex/google
Installing only the providers you need keeps your bundle size small.

Major LLM Providers

OpenAI

OpenAI provides the GPT-4o, GPT-4, and GPT-3.5 model families with strong performance and tool-calling support.
Installation:
npm install @llamaindex/openai
Environment:
export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.openai.com/v1"  # Optional
LLM Usage:
import { OpenAI } from "@llamaindex/openai";

const llm = new OpenAI({
  model: "gpt-4o",        // or gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo
  temperature: 0.7,
  maxTokens: 1024,
  apiKey: process.env.OPENAI_API_KEY, // Optional if env var set
});

const response = await llm.chat({
  messages: [{ role: "user", content: "Hello!" }],
});
Embedding Usage:
import { OpenAIEmbedding } from "@llamaindex/openai";

const embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small", // or text-embedding-3-large
  dimensions: 1536, // Optional: customize dimensions
});

const embedding = await embedModel.getTextEmbedding("Hello world");
Supported Features:
  • Function calling / tool use
  • Streaming responses
  • Vision (GPT-4 Vision models)
  • JSON mode / structured output
  • Multi-modal inputs (images, files)
Popular Models:
  • gpt-4o: Latest flagship model
  • gpt-4o-mini: Fast, cost-effective
  • gpt-4-turbo: Previous generation flagship
  • gpt-3.5-turbo: Fast and affordable
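The streaming feature listed above delivers the response as an async iterable of chunks. A minimal sketch of the consumption pattern, using a mock stream in place of a live llm.chat({ messages, stream: true }) call (which needs an API key); the delta field on each chunk follows the shape LlamaIndex.TS streams use:

```typescript
// Shape of the chunks a streaming chat call yields.
type ChatChunk = { delta: string };

// Mock stream standing in for llm.chat({ messages, stream: true }).
async function* mockStream(): AsyncGenerator<ChatChunk> {
  for (const delta of ["Hello", ", ", "world", "!"]) {
    yield { delta };
  }
}

// The consumption pattern is identical for a real stream: iterate the
// async iterable and append each text delta as it arrives.
async function collect(stream: AsyncIterable<ChatChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.delta;
  }
  return text;
}
```

With a real provider you would typically write each delta to stdout as it arrives rather than accumulating the full string.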

Additional Providers

Deepseek

npm install @llamaindex/deepseek
import { DeepSeek } from "@llamaindex/deepseek";

const llm = new DeepSeek({
  model: "deepseek-chat",
  apiKey: process.env.DEEPSEEK_API_KEY,
});

Fireworks AI

Fast inference for open-source models.
npm install @llamaindex/fireworks
import { FireworksLLM } from "@llamaindex/fireworks";

const llm = new FireworksLLM({
  model: "accounts/fireworks/models/llama-v3-70b-instruct",
  apiKey: process.env.FIREWORKS_API_KEY,
});

Together AI

Run open-source models at scale.
npm install @llamaindex/together
import { TogetherLLM } from "@llamaindex/together";

const llm = new TogetherLLM({
  model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
  apiKey: process.env.TOGETHER_API_KEY,
});

Perplexity

Online LLMs with real-time web search.
npm install @llamaindex/perplexity
import { PerplexityLLM } from "@llamaindex/perplexity";

const llm = new PerplexityLLM({
  model: "llama-3.1-sonar-large-128k-online",
  apiKey: process.env.PERPLEXITY_API_KEY,
});

Replicate

Run open-source models via cloud API.
npm install @llamaindex/replicate
import { ReplicateLLM } from "@llamaindex/replicate";

const llm = new ReplicateLLM({
  model: "meta/llama-2-70b-chat",
  apiKey: process.env.REPLICATE_API_KEY,
});

xAI

Grok models from xAI.
npm install @llamaindex/xai
import { XAI } from "@llamaindex/xai";

const llm = new XAI({
  model: "grok-beta",
  apiKey: process.env.XAI_API_KEY,
});

Vercel AI

Integration with Vercel AI SDK.
npm install @llamaindex/vercel
import { VercelLLM } from "@llamaindex/vercel";
import { openai } from "@ai-sdk/openai";

const llm = new VercelLLM({
  model: openai("gpt-4o"),
});

AWS Bedrock

LLMs through AWS Bedrock.
npm install @llamaindex/aws
import { BedrockLLM } from "@llamaindex/aws";

const llm = new BedrockLLM({
  model: "anthropic.claude-3-sonnet-20240229-v1:0",
  region: "us-east-1",
});

vLLM

Self-hosted high-performance inference.
npm install @llamaindex/vllm
import { VLLM } from "@llamaindex/vllm";

const llm = new VLLM({
  model: "mistralai/Mistral-7B-Instruct-v0.2",
  baseURL: "http://localhost:8000/v1",
});

Portkey AI

LLM gateway with observability and routing.
npm install @llamaindex/portkey-ai
import { PortkeyLLM } from "@llamaindex/portkey-ai";

const llm = new PortkeyLLM({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: "openai-virtual-key",
});

Embedding Providers

OpenAI's embedding models are the most common choice:
import { OpenAIEmbedding } from "@llamaindex/openai";

const embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small",
  dimensions: 1536,
});
Models:
  • text-embedding-3-small: 1536 dims (customizable)
  • text-embedding-3-large: 3072 dims (customizable)
  • text-embedding-ada-002: 1536 dims (legacy)
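The vectors these models return are compared with cosine similarity in most retrieval pipelines; a self-contained sketch of that computation (not library code):

```typescript
// Cosine similarity between two embedding vectors: dot product divided by
// the product of the vector magnitudes. Ranges from -1 to 1; higher means
// more semantically similar.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In practice you rarely call this directly; vector stores run the same comparison at scale over stored embeddings.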

Provider Comparison

| Provider  | LLM | Embeddings | Function Calling | Vision | Local | Cost |
|-----------|-----|------------|------------------|--------|-------|------|
| OpenAI    | ✅  | ✅         | ✅               | ✅     | ❌    | $$$  |
| Anthropic | ✅  | ❌         | ✅               | ✅     | ❌    | $$$  |
| Google    | ✅  | ✅         | ✅               | ✅     | ❌    | $$   |
| Ollama    | ✅  | ✅         | ⚠️               | ✅     | ✅    | Free |
| Groq      | ✅  | ❌         | ✅               | ❌     | ❌    | $    |
| Mistral   | ✅  | ✅         | ✅               | ❌     | ❌    | $$   |
| Voyage AI | N/A | ✅         | N/A              | N/A    | ❌    | $    |
| Cohere    | ✅  | ✅         | ✅               | ❌     | ❌    | $$   |

Switching Providers

Thanks to the unified interface, switching providers is simple:
import { Settings, VectorStoreIndex } from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
import { Anthropic } from "@llamaindex/anthropic";

// Use OpenAI
Settings.llm = new OpenAI({ model: "gpt-4o" });

// Switch to Anthropic
Settings.llm = new Anthropic({ model: "claude-3-7-sonnet" });

// All your code continues to work!
const index = await VectorStoreIndex.fromDocuments(docs);
const response = await index.query({ query: "What is RAG?" });
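This works because application code depends on the shared chat contract rather than a concrete class. A simplified sketch of the idea (ChatModel here is an illustration, not the real BaseLLM interface):

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Minimal chat contract: anything that can answer a message list.
interface ChatModel {
  chat(opts: { messages: ChatMessage[] }): Promise<{ message: ChatMessage }>;
}

// Works unchanged whether llm is an OpenAI, Anthropic, or stub instance.
async function summarize(llm: ChatModel, text: string): Promise<string> {
  const res = await llm.chat({
    messages: [{ role: "user", content: `Summarize: ${text}` }],
  });
  return res.message.content;
}

// Stub provider standing in for a real one.
const stub: ChatModel = {
  async chat({ messages }) {
    return {
      message: {
        role: "assistant",
        content: `summary of ${messages.length} message(s)`,
      },
    };
  },
};
```

Stubs like this are also handy in unit tests, where you want deterministic responses without network calls.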

Best Practices

Store API keys in environment variables instead of hardcoding them:
# .env file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
LlamaIndex.TS automatically detects these standard environment variables.
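That detection amounts to a simple fallback: an explicitly passed key wins, otherwise the standard environment variable is used. A simplified sketch (not the library's actual code; the helper name is illustrative):

```typescript
// Simplified key resolution: explicit argument first, then the named
// environment variable, otherwise fail loudly at construction time.
function resolveApiKey(explicit?: string, envVar = "OPENAI_API_KEY"): string {
  const key = explicit ?? process.env[envVar];
  if (!key) {
    throw new Error(`Set ${envVar} or pass apiKey explicitly`);
  }
  return key;
}
```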
# Good: Only install providers you use
npm install @llamaindex/openai @llamaindex/anthropic

# Avoid: Installing everything
npm install @llamaindex/*
This keeps your node_modules small and deploy times fast.
Match model choice to your environment:
  • Development: Use cheaper models like gpt-4o-mini or Ollama
  • Production: Upgrade to gpt-4o or claude-3-7-sonnet for quality
  • Embeddings: text-embedding-3-small offers great quality/cost ratio
For sensitive data, use Ollama or vLLM to run models on your own infrastructure:
import { Ollama } from "@llamaindex/ollama";

const llm = new Ollama({ model: "llama3.1" });
// Data never leaves your servers
Different providers excel at different tasks:
// Claude for long-form reasoning
const claudeEngine = index.asQueryEngine({ 
  llm: new Anthropic({ model: "claude-3-7-sonnet" }) 
});

// GPT-4 for structured output
const gptEngine = index.asQueryEngine({ 
  llm: new OpenAI({ model: "gpt-4o" }) 
});
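Routing like this can be centralized in a small lookup so engine construction stays in one place; a hedged sketch (the task names and model mapping are illustrative, not from the library):

```typescript
type Task = "reasoning" | "structured";

interface ModelChoice {
  provider: string;
  model: string;
}

// Illustrative mapping mirroring the examples above: long-form reasoning
// to Claude, structured output to GPT-4o.
const routing: Record<Task, ModelChoice> = {
  reasoning: { provider: "anthropic", model: "claude-3-7-sonnet" },
  structured: { provider: "openai", model: "gpt-4o" },
};

function pickModel(task: Task): ModelChoice {
  return routing[task];
}
```

The returned choice would then feed the corresponding provider constructor when building each query engine.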

Next Steps

LLMs

Learn about the LLM interface and capabilities

Embeddings

Understand embedding models for semantic search
