
Overview

Azure AI Search provides enterprise-grade vector search with hybrid search capabilities, combining vector similarity with full-text search and semantic ranking.

Installation

npm install @llamaindex/azure @azure/search-documents @azure/identity

Basic Usage

import { AzureAISearchVectorStore, IndexManagement } from "@llamaindex/azure";
import { VectorStoreIndex, Document } from "llamaindex";

const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "my-index",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS,
  embeddingDimensionality: 1536
});

const documents = [
  new Document({ text: "LlamaIndex is a data framework." }),
  new Document({ text: "Azure AI Search provides vector search." })
];

const index = await VectorStoreIndex.fromDocuments(documents, {
  storageContext: { vectorStore }
});

const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "What is Azure AI Search?"
});

Constructor Options

Authentication

endpoint (string)
Azure AI Search endpoint (defaults to the AZURE_AI_SEARCH_ENDPOINT environment variable).

key (string)
Azure AI Search admin key (defaults to the AZURE_AI_SEARCH_KEY environment variable).

credential (AzureKeyCredential | DefaultAzureCredential | ManagedIdentityCredential)
Azure credential object (alternative to key).

Index Configuration

indexName (string, required)
Name of the search index.

indexManagement (IndexManagement, default: "NoValidation")
Index validation strategy:
  • IndexManagement.NO_VALIDATION - No validation
  • IndexManagement.VALIDATE_INDEX - Validate that the index exists
  • IndexManagement.CREATE_IF_NOT_EXISTS - Create the index if it does not exist

embeddingDimensionality (number, default: 1536)
Number of dimensions in the embedding vectors.

vectorAlgorithmType (KnownVectorSearchAlgorithmKind, default: "ExhaustiveKnn")
Vector search algorithm: ExhaustiveKnn or Hnsw.

compressionType (KnownVectorSearchCompressionKind)
Vector compression: BinaryQuantization or ScalarQuantization.

Field Configuration

idFieldKey (string, default: "id")
Field name for document IDs.

chunkFieldKey (string, default: "chunk")
Field name for text content.

embeddingFieldKey (string, default: "embedding")
Field name for embedding vectors.

metadataStringFieldKey (string, default: "metadata")
Field name for the metadata JSON string.

docIdFieldKey (string, default: "doc_id")
Field name for document reference IDs.

hiddenFieldKeys (string[], default: [])
Fields to hide from retrieval results.

filterableMetadataFieldKeys (Array | Map)
Metadata fields that can be filtered. Accepts:
  • an array of field names: ["author", "category"]
  • a map of metadata key to index field name: {author: "author", topic: "theme"}
  • a map with explicit types: {author: ["author", MetadataIndexFieldType.STRING]}

Configuration

Environment Variables

AZURE_AI_SEARCH_ENDPOINT=https://your-service.search.windows.net
AZURE_AI_SEARCH_KEY=your-admin-key
AZURE_SEARCH_API_VERSION=2024-09-01-preview
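
Before constructing the store, it can help to fail fast when a required variable is missing rather than pass undefined through to the client. A minimal sketch (the requireEnv helper is hypothetical, not part of @llamaindex/azure):

```typescript
// Hypothetical helper: read a required environment variable or fail fast
// with a clear message instead of passing `undefined` to the vector store.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example usage:
// const endpoint = requireEnv("AZURE_AI_SEARCH_ENDPOINT");
// const key = requireEnv("AZURE_AI_SEARCH_KEY");
```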

Using Azure Identity

import { DefaultAzureCredential } from "@azure/identity";
import { AzureAISearchVectorStore, IndexManagement } from "@llamaindex/azure";

const credential = new DefaultAzureCredential();

const vectorStore = new AzureAISearchVectorStore({
  endpoint: "https://your-service.search.windows.net",
  credential,
  indexName: "my-index",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS
});

Using Managed Identity

import { ManagedIdentityCredential } from "@azure/identity";

const credential = new ManagedIdentityCredential(
  process.env.AZURE_CLIENT_ID
);

const vectorStore = new AzureAISearchVectorStore({
  endpoint: "https://your-service.search.windows.net",
  credential,
  indexName: "my-index",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS
});

Index Management

Auto-Create Index

import { AzureAISearchVectorStore, IndexManagement, MetadataIndexFieldType } from "@llamaindex/azure";

const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "documents",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS,
  embeddingDimensionality: 1536,
  filterableMetadataFieldKeys: {
    author: "author",
    category: ["category", MetadataIndexFieldType.STRING],
    year: ["year", MetadataIndexFieldType.INT32]
  }
});

Vector Algorithm Configuration

import { KnownVectorSearchAlgorithmKind, KnownVectorSearchCompressionKind } from "@azure/search-documents";

// HNSW with binary quantization
const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "hnsw-index",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS,
  vectorAlgorithmType: KnownVectorSearchAlgorithmKind.Hnsw,
  compressionType: KnownVectorSearchCompressionKind.BinaryQuantization,
  embeddingDimensionality: 1536
});

Query Modes

Azure AI Search supports multiple query modes:

Vector Search (Default)

import { VectorStoreQueryMode } from "@llamaindex/core/vector-store";

const retriever = index.asRetriever({
  similarityTopK: 5
});
const nodes = await retriever.retrieve("query text");

Hybrid Search

Combines vector and full-text search:

const response = await index.asQueryEngine().query({
  query: "What is Azure AI Search?",
  queryStr: "Azure AI Search",  // Required for hybrid search
  mode: VectorStoreQueryMode.HYBRID
});

Semantic Hybrid Search

Adds semantic ranking to hybrid search:

const response = await index.asQueryEngine().query({
  query: "What is Azure AI Search?",
  queryStr: "Azure AI Search",
  mode: VectorStoreQueryMode.SEMANTIC_HYBRID
});

Sparse Search

Full-text search only:

const response = await index.asQueryEngine().query({
  query: "What is Azure AI Search?",
  queryStr: "Azure AI Search",
  mode: VectorStoreQueryMode.SPARSE
});

Metadata Filtering

import { MetadataFilters, FilterCondition, FilterOperator } from "@llamaindex/core/vector-store";
import { MetadataIndexFieldType } from "@llamaindex/azure";

const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "filtered-docs",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS,
  filterableMetadataFieldKeys: {
    category: "category",
    author: "author",
    tags: ["tags", MetadataIndexFieldType.COLLECTION]
  }
});

const documents = [
  new Document({
    text: "Doc 1",
    metadata: { category: "tech", author: "John", tags: ["ai", "ml"] }
  }),
  new Document({
    text: "Doc 2",
    metadata: { category: "science", author: "Jane" }
  })
];

const index = await VectorStoreIndex.fromDocuments(documents, {
  storageContext: { vectorStore }
});

const retriever = index.asRetriever({
  filters: new MetadataFilters({
    filters: [
      { key: "category", value: "tech", operator: FilterOperator.EQ },
      { key: "tags", value: ["ai"], operator: FilterOperator.IN }
    ],
    condition: FilterCondition.AND
  })
});

const nodes = await retriever.retrieve("query");

Supported Filter Operators

Azure AI Search supports:
  • EQ - Equal
  • IN - Value in array

Managing Data

Add Documents

const newDoc = new Document({
  text: "New content",
  metadata: { source: "api" }
});
await index.insert(newDoc);

Delete by Document ID

// refDocId is the ID of the source Document whose nodes should be removed
await vectorStore.delete(refDocId);

Get Nodes

// Get specific nodes by ID
const nodes = await vectorStore.getNodes(["id1", "id2"]);

// Get nodes with filters
const filteredNodes = await vectorStore.getNodes(
  undefined,
  new MetadataFilters({
    filters: [{ key: "category", value: "tech", operator: FilterOperator.EQ }]
  })
);

// Get with limit
const limitedNodes = await vectorStore.getNodes(undefined, undefined, 10);

Complete Example

import { AzureAISearchVectorStore, IndexManagement, MetadataIndexFieldType } from "@llamaindex/azure";
import { VectorStoreIndex, Document, Settings } from "llamaindex";
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";
import { KnownVectorSearchAlgorithmKind, KnownAnalyzerNames } from "@azure/search-documents";
import { VectorStoreQueryMode, MetadataFilters, FilterOperator } from "@llamaindex/core/vector-store";

// Configure settings
Settings.llm = new OpenAI({ model: "gpt-4" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });

// Create vector store with full configuration
const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "technical-docs",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS,
  embeddingDimensionality: 1536,
  vectorAlgorithmType: KnownVectorSearchAlgorithmKind.Hnsw,
  languageAnalyzer: KnownAnalyzerNames.EnLucene,
  hiddenFieldKeys: ["embedding"],
  filterableMetadataFieldKeys: {
    author: "author",
    category: ["category", MetadataIndexFieldType.STRING],
    year: ["year", MetadataIndexFieldType.INT32]
  }
});

// Load documents
const documents = [
  new Document({
    text: "Azure AI Search provides vector search capabilities...",
    metadata: { author: "Microsoft", category: "cloud", year: 2024 }
  }),
  new Document({
    text: "LlamaIndex integrates with Azure services...",
    metadata: { author: "LlamaIndex", category: "integration", year: 2024 }
  })
];

// Build index
const index = await VectorStoreIndex.fromDocuments(documents, {
  storageContext: { vectorStore }
});

// Query with hybrid search and filters
const response = await index.asQueryEngine().query({
  query: "Azure vector search",
  queryStr: "Azure vector search",
  mode: VectorStoreQueryMode.HYBRID,
  filters: new MetadataFilters({
    filters: [
      { key: "category", value: "cloud", operator: FilterOperator.EQ }
    ]
  })
});

console.log(response.response);

Best Practices

  1. Use HNSW for production: Better performance than exhaustive KNN
  2. Enable compression: Reduce storage costs with quantization
  3. Index only necessary metadata: Minimize index size
  4. Use hybrid search: Combine vector and text for better results
  5. Monitor costs: Track search units and storage usage
  6. Implement retry logic: Handle transient failures
  7. Use managed identity: More secure than API keys
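
Retry logic (item 6) can be as simple as a generic backoff wrapper around store calls. A sketch using only standard JavaScript; withRetry is a hypothetical helper, not something provided by @llamaindex/azure:

```typescript
// Hypothetical retry helper: retries an async operation with exponential
// backoff, useful around transient Azure AI Search failures (throttling,
// brief network errors). Rethrows the last error once attempts are exhausted.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff: 200ms, 400ms, 800ms, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```

Usage might look like `await withRetry(() => vectorStore.getNodes(["id1"]))`. For production workloads, also consider the retry options built into the Azure SDK client pipeline.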

Troubleshooting

Index Not Found

// Use auto-create
const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "my-index",
  indexManagement: IndexManagement.CREATE_IF_NOT_EXISTS
});

Authentication Failed

Verify your credentials:

import { SearchIndexClient, AzureKeyCredential } from "@azure/search-documents";

try {
  const client = new SearchIndexClient(
    process.env.AZURE_AI_SEARCH_ENDPOINT!,
    new AzureKeyCredential(process.env.AZURE_AI_SEARCH_KEY!)
  );
  const indexes = await client.listIndexes();
  console.log("Connected successfully");
} catch (error) {
  console.error("Authentication error:", error);
}

Dimension Mismatch

Ensure embedding dimensions match:

import { OpenAIEmbedding } from "@llamaindex/openai";

// text-embedding-3-small: 1536 dimensions
const embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small"
});

// Azure index must match
const vectorStore = new AzureAISearchVectorStore({
  endpoint: process.env.AZURE_AI_SEARCH_ENDPOINT,
  key: process.env.AZURE_AI_SEARCH_KEY,
  indexName: "my-index",
  embeddingDimensionality: 1536
});
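
To catch a mismatch before ingesting anything, you can embed a probe string at startup and compare its length against the configured dimensionality. A sketch; checkDimensions is a hypothetical helper, not part of the library:

```typescript
// Hypothetical guard: verify an embedding's length against the value you
// pass as embeddingDimensionality before building the index.
function checkDimensions(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(
      `Embedding dimension mismatch: model returned ${vector.length}, ` +
        `index expects ${expected}`
    );
  }
}

// Example usage (assuming the embedModel configured above):
// const probe = await embedModel.getTextEmbedding("dimension probe");
// checkDimensions(probe, 1536);
```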
