
Overview

Query engines provide interfaces for querying indexed data and generating responses. They combine retrieval with response synthesis to answer questions over your data.
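Conceptually, the retrieve-then-synthesize flow a query engine performs can be sketched with plain stand-in functions. Everything below (`fakeRetrieve`, `fakeSynthesize`, the keyword-overlap scoring) is illustrative only, not part of the LlamaIndex API:

```typescript
// Illustrative sketch of the retrieve-then-synthesize flow a query
// engine performs internally. All names here are stand-ins.
type ScoredChunk = { text: string; score: number };

// Stand-in retriever: rank stored chunks by naive keyword overlap.
function fakeRetrieve(query: string, chunks: string[]): ScoredChunk[] {
  const terms = query.toLowerCase().split(/\s+/);
  return chunks
    .map((text) => ({
      text,
      score: terms.filter((t) => text.toLowerCase().includes(t)).length,
    }))
    .filter((c) => c.score > 0)
    .sort((a, b) => b.score - a.score);
}

// Stand-in synthesizer: a real engine would prompt an LLM with the
// retrieved context; here we just join the top chunks.
function fakeSynthesize(query: string, retrieved: ScoredChunk[]): string {
  return `Answer to "${query}" based on: ${retrieved
    .map((c) => c.text)
    .join(" ")}`;
}

const chunks = [
  "LlamaIndex is a data framework.",
  "Bananas are yellow.",
];
const hits = fakeRetrieve("What is LlamaIndex?", chunks);
const answer = fakeSynthesize("What is LlamaIndex?", hits);
```

A real query engine replaces both stand-ins with embedding-based retrieval and LLM-backed synthesis, but the two-phase shape is the same.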

BaseQueryEngine

Abstract base class for all query engines.

import { BaseQueryEngine } from "@llamaindex/core/query-engine";

Methods

query
method
Query the engine with a streaming or non-streaming response.

Non-streaming:
query(params: NonStreamingQueryParams): Promise<EngineResponse>

Streaming:
query(params: StreamingQueryParams): Promise<AsyncIterable<EngineResponse>>
retrieve
method
Retrieve relevant nodes without generating a response.

retrieve(query: QueryType): Promise<NodeWithScore[]>

QueryBundle

A query object that can carry precomputed embeddings alongside the query content.

type QueryBundle = {
  query: MessageContent;
  customEmbeddings?: string[];
  embeddings?: number[];
};

Usage Examples

Basic Query

import { VectorStoreIndex } from "llamaindex";
import { Document } from "@llamaindex/core/schema";

const documents = [
  new Document({ text: "LlamaIndex is a data framework for LLM applications." }),
  new Document({ text: "It provides tools for ingestion, indexing, and querying." })
];

const index = await VectorStoreIndex.fromDocuments(documents);
const queryEngine = index.asQueryEngine();

const response = await queryEngine.query({
  query: "What is LlamaIndex?"
});

console.log(response.response);
console.log(response.sourceNodes); // Nodes used to generate response

Streaming Query

const queryEngine = index.asQueryEngine();

const stream = await queryEngine.query({
  query: "What is LlamaIndex?",
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.response);
}

Retrieve Only

const nodes = await queryEngine.retrieve("LlamaIndex");

nodes.forEach(nodeWithScore => {
  console.log(`Score: ${nodeWithScore.score}`);
  console.log(`Text: ${nodeWithScore.node.text}`);
});
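Retrieved nodes can be filtered by score before further use. A minimal self-contained sketch over the `NodeWithScore` shape (the `0.7` cutoff is an arbitrary example value, not a library default):

```typescript
// Minimal stand-in for the NodeWithScore shape used above.
type ScoredNode = { node: { text: string }; score: number };

// Keep only nodes at or above a similarity threshold.
function filterByScore(nodes: ScoredNode[], minScore = 0.7): ScoredNode[] {
  return nodes.filter((n) => n.score >= minScore);
}

const retrieved: ScoredNode[] = [
  { node: { text: "highly relevant" }, score: 0.91 },
  { node: { text: "marginal" }, score: 0.42 },
];
const kept = filterByScore(retrieved);
```

Dropping low-scoring nodes before synthesis keeps weak context out of the prompt.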

QueryBundle with Custom Embeddings

import { OpenAIEmbedding } from "@llamaindex/openai";

const embedModel = new OpenAIEmbedding();
const queryEmbedding = await embedModel.getTextEmbedding("What is LlamaIndex?");

const response = await queryEngine.query({
  query: {
    query: "What is LlamaIndex?",
    embeddings: queryEmbedding
  }
});

Advanced Query Engines

RetrieverQueryEngine

Query engine that uses a retriever and response synthesizer.

import { RetrieverQueryEngine } from "llamaindex";

const queryEngine = new RetrieverQueryEngine({
  retriever: index.asRetriever(),
  responseSynthesizer: responseSynthesizer
});

SubQuestionQueryEngine

Breaks down complex queries into sub-questions.

import { SubQuestionQueryEngine } from "llamaindex";

const queryEngine = new SubQuestionQueryEngine({
  queryEngineTools: [tool1, tool2],
  responseSynthesizer: responseSynthesizer
});

const response = await queryEngine.query({
  query: "Compare feature A and feature B"
});

RouterQueryEngine

Routes queries to appropriate query engines based on content.

import { RouterQueryEngine } from "llamaindex";

const queryEngine = new RouterQueryEngine({
  selector: selector,
  queryEngineTools: [docEngine, codeEngine]
});

Response Synthesis

Query engines use response synthesizers to generate answers:

import { ResponseSynthesizer, CompactAndRefine } from "llamaindex";

const synthesizer = new ResponseSynthesizer({
  responseBuilder: new CompactAndRefine(),
  streaming: true
});

const queryEngine = index.asQueryEngine({
  responseSynthesizer: synthesizer
});

Query Events

Query engines emit events during execution:

import { Settings } from "llamaindex";

Settings.callbackManager.on("query-start", (event) => {
  console.log("Query started:", event.query);
});

Settings.callbackManager.on("query-end", (event) => {
  console.log("Query completed:", event.response);
});

const response = await queryEngine.query({ query: "What is LlamaIndex?" });

Customization

Custom Query Engine

import { BaseQueryEngine } from "@llamaindex/core/query-engine";
import { EngineResponse, type NodeWithScore } from "@llamaindex/core/schema";

class CustomQueryEngine extends BaseQueryEngine {
  async _query(query: string, stream?: boolean): Promise<EngineResponse> {
    // Custom query logic
    const nodes = await this.customRetrieve(query);
    const response = await this.customSynthesize(nodes);
    
    return {
      response: response,
      sourceNodes: nodes,
      metadata: {}
    };
  }
  
  private async customRetrieve(query: string) {
    // Custom retrieval logic
    return [];
  }
  
  private async customSynthesize(nodes: NodeWithScore[]) {
    // Custom synthesis logic
    return "Generated response";
  }
}

Best Practices

  1. Use streaming for long responses: Improves perceived latency
  2. Inspect source nodes: Verify response quality by checking retrieved sources
  3. Configure retrieval parameters: Adjust similarityTopK and the similarity cutoff for better results
  4. Handle errors gracefully: Implement error handling for failed queries
  5. Cache embeddings: Reuse QueryBundle with embeddings for repeated queries
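As a sketch of point 5, query embeddings can be memoized in a plain Map and reused. `computeEmbedding` below is a toy stand-in for a real (normally async) model call such as OpenAIEmbedding's getTextEmbedding; only the caching pattern is the point:

```typescript
// Cache query embeddings so repeated queries skip recomputation.
const embeddingCache = new Map<string, number[]>();
let embeddingCalls = 0;

// Toy deterministic stand-in for a real embedding call.
function computeEmbedding(text: string): number[] {
  embeddingCalls++;
  return [...text].map((c) => c.charCodeAt(0) / 255);
}

function getCachedEmbedding(text: string): number[] {
  const cached = embeddingCache.get(text);
  if (cached) return cached;
  const embedding = computeEmbedding(text);
  embeddingCache.set(text, embedding);
  return embedding;
}

const first = getCachedEmbedding("What is LlamaIndex?");
const second = getCachedEmbedding("What is LlamaIndex?");
// The cached vector can then be passed as QueryBundle.embeddings.
```

The same pattern applies with a real async embedding model; the cache lookup simply happens before the awaited call.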
