Overview
Chat engines enable conversational interactions with your data, maintaining chat history and context across multiple turns.
BaseChatEngine
Abstract base class for all chat engines.
import { BaseChatEngine } from "@llamaindex/core/chat-engine";
Properties
chatHistory
ChatMessage[] | Promise<ChatMessage[]>
The conversation history
Methods
chat
Send a message and get a response.
Non-streaming: chat(params: NonStreamingChatEngineParams): Promise<EngineResponse>
Streaming: chat(params: StreamingChatEngineParams): Promise<AsyncIterable<EngineResponse>>
Parameters
message
string | MessageContentDetail[]
required
The user message (text or multi-modal)
stream
Whether to stream the response
chatHistory
Optional custom chat history or memory
chatOptions
Provider-specific chat options
Response
response
The assistant’s response text
sourceNodes
Retrieved source nodes (context chat engine only)
metadata
Additional response metadata
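As a rough illustration of the two call shapes, the parameters can be written as plain objects. This is only a sketch: the interface below is a simplified stand-in for the library's actual parameter types, not the real definitions.

```typescript
// Simplified stand-in for NonStreamingChatEngineParams / StreamingChatEngineParams
interface ChatParams {
  message: string;
  chatHistory?: { role: string; content: string }[];
  stream?: boolean;
}

// Non-streaming call: stream is omitted, the engine resolves to one response
const nonStreaming: ChatParams = { message: "Hello" };

// Streaming call: stream: true switches the return type to an async iterable
const streaming: ChatParams = { message: "Hello", stream: true };

console.log(nonStreaming.stream ?? false, streaming.stream); // false true
```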
SimpleChatEngine
A basic chat engine with no retrieval; it simply holds a conversation with the LLM.
import { SimpleChatEngine } from "@llamaindex/core/chat-engine";
import { OpenAI } from "@llamaindex/openai";
Example
const llm = new OpenAI({ model: "gpt-4" });
const chatEngine = new SimpleChatEngine({ llm });

const response1 = await chatEngine.chat({
  message: "Hello! My name is Alice.",
});
console.log(response1.response); // "Hello Alice! How can I help you?"

const response2 = await chatEngine.chat({
  message: "What's my name?",
});
console.log(response2.response); // "Your name is Alice."
ContextChatEngine
Chat engine with retrieval: it fetches relevant context for each message and injects it into the prompt.
import { ContextChatEngine } from "@llamaindex/core/chat-engine";
Constructor Options
retriever
Retriever for fetching relevant context
chatModel
Language model (defaults to Settings.llm)
systemPrompt
System prompt for the chat
contextSystemPrompt
Template for injecting retrieved context
Example
import { VectorStoreIndex } from "llamaindex";
import { Document } from "@llamaindex/core/schema";

const documents = [
  new Document({ text: "The company was founded in 2020." }),
  new Document({ text: "Our main product is a data framework." }),
];

const index = await VectorStoreIndex.fromDocuments(documents);
const chatEngine = index.asChatEngine();

const response = await chatEngine.chat({
  message: "When was the company founded?",
});
console.log(response.response); // "The company was founded in 2020."
console.log(response.sourceNodes); // Retrieved context nodes
Streaming Chat
const chatEngine = index.asChatEngine();

const stream = await chatEngine.chat({
  message: "Tell me about the company",
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.response);
}
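The same for await loop can also accumulate the streamed text into a single string, which is useful when you need the full answer after streaming finishes. A minimal sketch (the fake generator below stands in for a real chat engine stream; only the { response } chunk shape is taken from the examples above):

```typescript
// Collect the text of a streamed response into one string.
// Works with any AsyncIterable of { response: string } chunks,
// such as the stream returned by chat({ ..., stream: true }).
async function collectStream(
  stream: AsyncIterable<{ response: string }>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.response;
  }
  return text;
}

// Fake stream standing in for a chat engine's streamed response
async function* fakeStream() {
  yield { response: "The company " };
  yield { response: "was founded in 2020." };
}

collectStream(fakeStream()).then((text) => console.log(text));
```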
Multi-modal Chat
Chat engines support images and other media:
const response = await chatEngine.chat({
  message: [
    { type: "text", text: "What's in this image?" },
    {
      type: "image_url",
      image_url: { url: "data:image/jpeg;base64,..." },
    },
  ],
});
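Building the data URL for an image by hand is a common step before a multi-modal call. A small helper sketch (the ContentDetail union here is a simplified approximation of MessageContentDetail, and imageQuestion is a hypothetical helper, not a library function):

```typescript
// Simplified approximation of the multi-modal content union
type ContentDetail =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Build a text-plus-image message from a question and raw JPEG bytes
function imageQuestion(question: string, jpegBytes: Uint8Array): ContentDetail[] {
  // Encode the raw image bytes as a base64 data URL
  const base64 = Buffer.from(jpegBytes).toString("base64");
  return [
    { type: "text", text: question },
    { type: "image_url", image_url: { url: `data:image/jpeg;base64,${base64}` } },
  ];
}

const msg = imageQuestion("What's in this image?", new Uint8Array([0xff, 0xd8]));
console.log(msg[0].type, msg[1].type); // text image_url
```

The resulting array can be passed directly as the message parameter, as in the example above.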
Custom Chat History
Using ChatMemoryBuffer
import { ChatMemoryBuffer } from "@llamaindex/core/memory";

const memory = new ChatMemoryBuffer({ tokenLimit: 3000 });

const response = await chatEngine.chat({
  message: "Hello",
  chatHistory: memory,
});
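ChatMemoryBuffer keeps the history under tokenLimit by dropping the oldest turns first. A rough sketch of that behavior, assuming a naive whitespace word count rather than the library's actual tokenizer:

```typescript
interface ChatMessage { role: string; content: string }

// Naive token estimate: whitespace-separated words (illustrative only;
// the real buffer uses a proper tokenizer)
const countTokens = (m: ChatMessage) => m.content.split(/\s+/).length;

// Keep the most recent messages whose combined size fits the limit,
// dropping the oldest turns first.
function trimHistory(history: ChatMessage[], tokenLimit: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let total = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = countTokens(history[i]);
    if (total + cost > tokenLimit) break;
    kept.unshift(history[i]);
    total += cost;
  }
  return kept;
}

const history = [
  { role: "user", content: "one two three four" }, // 4 tokens
  { role: "assistant", content: "five six" },      // 2 tokens
  { role: "user", content: "seven" },              // 1 token
];
console.log(trimHistory(history, 3).length); // 2 (oldest message dropped)
```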
Manual Chat History
import type { ChatMessage } from "llamaindex";

const customHistory: ChatMessage[] = [
  { role: "user", content: "Previous question" },
  { role: "assistant", content: "Previous answer" },
];

const response = await chatEngine.chat({
  message: "Follow-up question",
  chatHistory: customHistory,
});
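When managing history by hand, each completed exchange appends a user/assistant pair. A small helper sketch (recordTurn is hypothetical, and ChatMessage is simplified to role/content here):

```typescript
interface ChatMessage { role: "user" | "assistant"; content: string }

// Append one completed turn to a manually managed history,
// returning a new array so the original is left untouched.
function recordTurn(
  history: ChatMessage[],
  question: string,
  answer: string
): ChatMessage[] {
  return [
    ...history,
    { role: "user", content: question },
    { role: "assistant", content: answer },
  ];
}

let history: ChatMessage[] = [
  { role: "user", content: "Previous question" },
  { role: "assistant", content: "Previous answer" },
];
history = recordTurn(history, "Follow-up question", "Follow-up answer");
console.log(history.length); // 4
```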
System Prompts
Setting System Prompt
const chatEngine = index.asChatEngine({
  systemPrompt: "You are a helpful assistant that always speaks in rhymes.",
});
Custom Context Template
const chatEngine = index.asChatEngine({
  contextSystemPrompt: `
Use the following context to answer the question.
If you don't know, say so.

Context:
{context}

Question: {query}
`,
});
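Conceptually, the engine fills in such a template by substituting the retrieved text for {context} and the user message for {query}. A minimal sketch of that substitution (fillTemplate is illustrative, not the library's internal implementation):

```typescript
// Fill a context template by replacing its {context} and {query} placeholders
function fillTemplate(template: string, context: string, query: string): string {
  return template.replace("{context}", context).replace("{query}", query);
}

const template = "Context:\n{context}\nQuestion: {query}";
const prompt = fillTemplate(
  template,
  "The company was founded in 2020.",
  "When was the company founded?"
);
console.log(prompt.includes("founded in 2020")); // true
```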
Chat History Management
Accessing Chat History
const response = await chatEngine.chat({
  message: "Hello",
});

const history = await chatEngine.chatHistory;
console.log(history);
// [
//   { role: "user", content: "Hello" },
//   { role: "assistant", content: "Hi! How can I help you?" }
// ]
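Once you have the history array, it is plain data you can inspect. For example, pulling out the most recent assistant reply is a simple backwards scan (lastAssistantMessage is a hypothetical helper, not part of the library):

```typescript
interface ChatMessage { role: string; content: string }

// Return the content of the most recent assistant message, if any
function lastAssistantMessage(history: ChatMessage[]): string | undefined {
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i].role === "assistant") return history[i].content;
  }
  return undefined;
}

const history = [
  { role: "user", content: "Hello" },
  { role: "assistant", content: "Hi! How can I help you?" },
];
console.log(lastAssistantMessage(history)); // "Hi! How can I help you?"
```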
Resetting Chat History
import { SimpleChatEngine } from "@llamaindex/core/chat-engine";

const chatEngine = new SimpleChatEngine({ llm });

// Chat history is stored in the engine
await chatEngine.chat({ message: "Message 1" });
await chatEngine.chat({ message: "Message 2" });

// Reset by creating a new engine
const newChatEngine = new SimpleChatEngine({ llm });
Retrieval Configuration
const chatEngine = index.asChatEngine({
  retriever: index.asRetriever({
    similarityTopK: 5,
    mode: "default",
  }),
});
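similarityTopK controls how many of the highest-scoring nodes the retriever returns for each message. The selection amounts to a sort-and-slice over similarity scores, sketched here with plain objects standing in for scored nodes:

```typescript
interface ScoredNode { text: string; score: number }

// Keep the k nodes with the highest similarity scores
function topK(nodes: ScoredNode[], k: number): ScoredNode[] {
  return [...nodes].sort((a, b) => b.score - a.score).slice(0, k);
}

const nodes = [
  { text: "founded in 2020", score: 0.91 },
  { text: "data framework", score: 0.62 },
  { text: "unrelated note", score: 0.18 },
];
console.log(topK(nodes, 2).map((n) => n.text)); // ["founded in 2020", "data framework"]
```

Raising similarityTopK gives the chat engine more context per turn at the cost of a larger prompt.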
Custom Chat Engine
import {
  BaseChatEngine,
  type NonStreamingChatEngineParams,
} from "@llamaindex/core/chat-engine";
import type { ChatMessage } from "llamaindex";
import { EngineResponse } from "@llamaindex/core/schema";

class CustomChatEngine extends BaseChatEngine {
  private history: ChatMessage[] = [];

  async chat(params: NonStreamingChatEngineParams): Promise<EngineResponse> {
    const { message, chatHistory } = params;

    // Use the provided history or fall back to the internal history
    const messages = chatHistory ?? this.history;

    // Add the user message
    messages.push({ role: "user", content: message });

    // Generate a response (custom logic)
    const response = await this.generateResponse(messages);

    // Add the assistant message
    const assistantMessage: ChatMessage = { role: "assistant", content: response };
    messages.push(assistantMessage);

    // Update the internal history
    this.history = messages;

    return {
      response,
      sourceNodes: [],
      metadata: {},
    };
  }

  get chatHistory() {
    return this.history;
  }

  private async generateResponse(messages: ChatMessage[]): Promise<string> {
    // Custom response generation
    return "Response";
  }
}
Best Practices
Use context chat engine for RAG: retrieves relevant information for each turn
Manage token limits: use ChatMemoryBuffer to prevent context overflow
Provide clear system prompts: guide the assistant’s behavior
Stream long responses: better user experience for lengthy answers
Reset history periodically: prevent context from becoming too large or stale
Include source nodes: track which documents informed the response