Azure AI Search

Azure AI Search is a fully managed, cloud-hosted service that connects your data to AI. The service unifies access to enterprise and web content so agents and LLMs can use context, chat history, and multi-source signals to produce reliable, grounded answers.

Classic Search

Traditional search with full-text, vector, and hybrid queries

Agentic Retrieval

LLM-assisted multi-query retrieval for agent workflows

AI Enrichment

Extract and structure content with AI processing

Enterprise Ready

Security, compliance, and scale for production workloads
Common use cases range from classic search in traditional applications to agentic retrieval in modern retrieval-augmented generation (RAG) workflows, making the service a fit for both enterprise and consumer scenarios.

Key Capabilities

When you create a search service, you unlock:
  • Classic search for single requests
  • Agentic retrieval for parallel, iterative, LLM-assisted search
  • Full-text search with BM25 ranking
  • Vector search with similarity matching
  • Hybrid search combining text and vectors
  • Multimodal queries over text and images
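The BM25 ranking behind full-text search combines term rarity with length-normalized term frequency. A toy sketch of the per-term score (illustrative only, not the service's implementation; k1 and b are the conventional default parameters):

```python
import math

def bm25_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Toy BM25 term score: idf times saturated, length-normalized term frequency."""
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    norm_tf = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm_tf

# A term appearing 3 times in an average-length document scores higher
# when it is rare in the corpus than when it is common.
rare = bm25_score(tf=3, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=5)
common = bm25_score(tf=3, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=800)
```

The saturation in `norm_tf` is why repeating a keyword many times yields diminishing returns rather than a linearly higher score.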

Ground AI Responses

Provide agents and chatbots with accurate, context-aware responses grounded in your data.

Multi-Source Access

Connect to Azure Blob Storage, Cosmos DB, SharePoint, OneLake, and more.

Intelligent Processing

Enrich content with AI skills for chunking, embedding, and transformation.

Hybrid Search

Combine full-text and vector search to balance precision and recall.

Multimodal Search

Query content containing both text and images in a single pipeline.

Enterprise Security

Implement document-level access control, private networks, and compliance.

Classic Search

Classic search is an index-first retrieval model for predictable, low-latency queries.

How It Works

1. Create an Index

Define the schema with fields, data types, and attributes.
{
  "name": "products-index",
  "fields": [
    {"name": "id", "type": "Edm.String", "key": true},
    {"name": "title", "type": "Edm.String", "searchable": true},
    {"name": "description", "type": "Edm.String", "searchable": true},
    {"name": "category", "type": "Edm.String", "filterable": true},
    {"name": "price", "type": "Edm.Double", "filterable": true},
    {"name": "vector", "type": "Collection(Edm.Single)", "dimensions": 1536}
  ]
}
2. Load Content

Use push or pull methods to populate the index.

Push Method (direct upload):
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://your-service.search.windows.net",
    index_name="products-index",
    credential=AzureKeyCredential("your-admin-key")
)

documents = [
    {
        "id": "1",
        "title": "Azure AI Search",
        "description": "Powerful search service",
        "category": "AI Services",
        "price": 0.0
    }
]

result = search_client.upload_documents(documents)
Pull Method (indexer):
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexer,
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection
)

indexer_client = SearchIndexerClient(
    endpoint="https://your-service.search.windows.net",
    credential=AzureKeyCredential("your-admin-key")
)

# Create and register the data source
data_source = SearchIndexerDataSourceConnection(
    name="myblob-datasource",
    type="azureblob",
    connection_string="DefaultEndpointsProtocol=https;...",
    container=SearchIndexerDataContainer(name="documents")
)
indexer_client.create_or_update_data_source_connection(data_source)

# Create indexer
indexer = SearchIndexer(
    name="myblob-indexer",
    data_source_name="myblob-datasource",
    target_index_name="products-index"
)

indexer_client.create_or_update_indexer(indexer)
3. Query the Index

Execute searches with various query types.
from azure.search.documents.models import VectorizedQuery

# Full-text search
results = search_client.search(
    search_text="machine learning",
    select=["title", "description"],
    top=10
)

# Vector search (query_embedding comes from your embedding model)
results = search_client.search(
    vector_queries=[VectorizedQuery(
        vector=query_embedding,
        k_nearest_neighbors=5,
        fields="vector"
    )]
)

# Hybrid search
results = search_client.search(
    search_text="AI services",
    vector_queries=[VectorizedQuery(
        vector=query_embedding,
        k_nearest_neighbors=5,
        fields="vector"
    )],
    top=10
)
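Under the hood, hybrid queries merge the text and vector result lists with Reciprocal Rank Fusion (RRF). A minimal sketch of the fusion step (illustrative, not the service code; k=60 is the conventional constant):

```python
def rrf_merge(rankings, k=60):
    """Merge ranked ID lists with Reciprocal Rank Fusion: score = sum of 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

text_hits = ["doc2", "doc1", "doc5"]    # BM25 ranking
vector_hits = ["doc1", "doc3", "doc2"]  # nearest-neighbor ranking
merged = rrf_merge([text_hits, vector_hits])
# Documents found by both rankings (doc1, doc2) rise to the top.
```

Because RRF uses only ranks, it sidesteps the problem that BM25 scores and vector similarities live on incompatible scales.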

Agentic Retrieval

Agentic retrieval is a multi-query pipeline designed for complex agent-to-agent workflows.

Knowledge Bases

A knowledge base represents a complete domain of knowledge:
from azure.search.documents.indexes.models import (
    KnowledgeBase,
    KnowledgeSource
)

# Create knowledge base
kb = KnowledgeBase(
    name="company-knowledge",
    description="Corporate documentation and policies",
    knowledge_sources=[
        KnowledgeSource(
            name="sharepoint-docs",
            type="sharepoint",
            connection_string="...",
            site_url="https://company.sharepoint.com"
        ),
        KnowledgeSource(
            name="azure-storage",
            type="azureblob",
            connection_string="...",
            container_name="documents"
        )
    ],
    reasoning_effort="medium",  # low, medium, high
    include_citations=True
)

kb_client.create_or_update(kb)

Query Flow

1. Planning

LLM analyzes the query and creates a retrieval plan.
2. Decomposition

Break complex queries into focused subqueries.
3. Parallel Retrieval

Execute subqueries across multiple knowledge sources simultaneously.
4. Semantic Reranking

Apply semantic understanding to improve result quality.
5. Results Merging

Combine and deduplicate results from all sources.
6. Response Generation

Return answer, sources, and activity log optimized for agents.
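The fan-out and merge portion of this flow can be sketched in plain Python. Everything here — the source names, the `retrieve` stand-in, and the dedupe-by-text rule — is illustrative, not the service API:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve(source, subquery):
    """Stand-in for one knowledge-source lookup; returns (doc_id, text) hits."""
    corpus = {
        "sharepoint-docs": [("sp-1", "Return policy: 30 days"), ("sp-2", "Shipping FAQ")],
        "azure-storage": [("bl-1", "Return policy: 30 days"), ("bl-2", "Warranty terms")],
    }
    return [hit for hit in corpus[source] if subquery.lower() in hit[1].lower()]

def parallel_retrieve(sources, subqueries):
    """Run every (source, subquery) pair concurrently, then deduplicate by text."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(retrieve, s, q) for s in sources for q in subqueries]
        results = [hit for f in futures for hit in f.result()]
    seen, merged = set(), []
    for doc_id, text in results:
        if text not in seen:
            seen.add(text)
            merged.append((doc_id, text))
    return merged

hits = parallel_retrieve(["sharepoint-docs", "azure-storage"], ["return policy"])
```

The same policy document stored in both SharePoint and Blob Storage surfaces only once in the merged results.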

Agent Integration

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

project_client = AIProjectClient(
    endpoint="https://your-project.services.ai.azure.com",
    credential=DefaultAzureCredential()
)

# Query knowledge base through agent
agent = project_client.agents.create(
    model="gpt-4",
    instructions="Answer questions using company knowledge.",
    tools=[{
        "type": "knowledge_base",
        "knowledge_base_id": kb.id
    }]
)

thread = project_client.agents.create_thread()
message = project_client.agents.create_message(
    thread.id,
    "user",
    "What is our return policy?"
)

run = project_client.agents.create_run(thread.id, agent.id)
response = project_client.agents.wait_for_run(thread.id, run.id)

# Response includes:
# - Grounded answer
# - Source citations
# - Activity log
# - Confidence scores

AI Enrichment

AI enrichment uses skills to extract and transform content during indexing:

Built-in Skills

Text Skills

  • Text splitting (chunking)
  • Language detection
  • Key phrase extraction
  • Entity recognition
  • Sentiment analysis
  • PII detection

Vision Skills

  • OCR (text extraction)
  • Image analysis
  • Object detection
  • Brand detection
  • Face detection
  • Handwriting recognition

AI Skills

  • Azure OpenAI embeddings
  • Multimodal embeddings
  • Text translation
  • Custom models

Utility Skills

  • Conditional logic
  • Document extraction
  • Shaper (structure data)
  • Merge fields

Skillset Example

from azure.search.documents.indexes.models import (
    SearchIndexerSkillset,
    SplitSkill,
    AzureOpenAIEmbeddingSkill,
    EntityRecognitionSkill,
    InputFieldMappingEntry,
    OutputFieldMappingEntry
)

skillset = SearchIndexerSkillset(
    name="document-enrichment",
    description="Extract and vectorize content",
    skills=[
        # Split text into overlapping chunks
        SplitSkill(
            context="/document",
            text_split_mode="pages",
            maximum_page_length=2000,
            page_overlap_length=500,
            inputs=[InputFieldMappingEntry(name="text", source="/document/content")],
            outputs=[OutputFieldMappingEntry(name="textItems", target_name="chunks")]
        ),
        # Generate an embedding for each chunk
        AzureOpenAIEmbeddingSkill(
            context="/document/chunks/*",
            resource_uri="https://your-openai.openai.azure.com",
            deployment_id="text-embedding-ada-002",
            inputs=[InputFieldMappingEntry(name="text", source="/document/chunks/*")],
            outputs=[OutputFieldMappingEntry(name="embedding", target_name="vector")]
        ),
        # Extract named entities from the full document
        EntityRecognitionSkill(
            context="/document",
            categories=["Person", "Organization", "Location"],
            inputs=[InputFieldMappingEntry(name="text", source="/document/content")],
            outputs=[OutputFieldMappingEntry(name="entities", target_name="entities")]
        )
    ]
)

indexer_client.create_or_update_skillset(skillset)
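The pages split mode used above can be approximated in plain Python: fixed-size windows that share an overlap region so boundary sentences land in two neighboring chunks (the real skill also tries to respect sentence boundaries):

```python
def split_pages(text, maximum_page_length=2000, page_overlap_length=500):
    """Approximate pages-mode chunking: fixed windows with a trailing overlap."""
    step = maximum_page_length - page_overlap_length
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + maximum_page_length])
        if start + maximum_page_length >= len(text):
            break
    return chunks

doc = "x" * 5000
chunks = split_pages(doc)  # 3 chunks; each pair shares a 500-character overlap
```

Overlap trades a little extra storage and embedding cost for better recall: a fact straddling a chunk boundary is retrievable from either side.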

Integrated Vectorization

Automate embedding generation during indexing:
# Configure vectorizer
from azure.search.documents.indexes.models import (
    AzureOpenAIParameters,
    AzureOpenAIVectorizer,
    HnswAlgorithmConfiguration,
    SearchableField,
    SearchField,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchProfile
)

vectorizer = AzureOpenAIVectorizer(
    name="my-vectorizer",
    azure_open_ai_parameters=AzureOpenAIParameters(
        resource_uri="https://your-openai.openai.azure.com",
        deployment_id="text-embedding-ada-002",
        api_key="your-key"
    )
)

# Add to index
index = SearchIndex(
    name="auto-vectorized-index",
    fields=[
        SimpleField(name="id", type="Edm.String", key=True),
        SearchableField(name="content", type="Edm.String"),
        SearchField(
            name="content_vector",
            type="Collection(Edm.Single)",
            searchable=True,
            vector_search_dimensions=1536,
            vector_search_profile_name="my-profile"
        )
    ],
    vector_search=VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="my-hnsw")],
        profiles=[VectorSearchProfile(
            name="my-profile",
            algorithm_configuration_name="my-hnsw",
            vectorizer_name="my-vectorizer"
        )],
        vectorizers=[vectorizer]
    )
)
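Once content is vectorized, queries rank documents by similarity between the query embedding and stored vectors, typically cosine similarity. A minimal sketch of the metric (the service answers such queries with approximate nearest-neighbor structures like HNSW rather than exhaustive scans):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1.0, 0.0]
close = [0.9, 0.1]  # nearly the same direction as the query
far = [0.0, 1.0]    # orthogonal to the query
```

Because cosine ignores magnitude, two texts with similar meaning score close to 1.0 even if their raw embedding norms differ.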

Security Features

Document-Level Security

Implement fine-grained access control:
# Index documents with security fields
documents = [
    {
        "id": "doc1",
        "content": "Confidential information",
        "security_filter": ["group1", "user123"]
    }
]

search_client.upload_documents(documents)

# Query with security filter
user_groups = ["group1", "group2"]
filter_expression = " or ".join(
    [f"security_filter/any(g: g eq '{group}')" for group in user_groups]
)

results = search_client.search(
    search_text="confidential",
    filter=filter_expression
)
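For the two example groups, the comprehension above flattens into a single OData filter string:

```python
user_groups = ["group1", "group2"]

# Same construction as above: one any() clause per group, joined with "or".
filter_expression = " or ".join(
    f"security_filter/any(g: g eq '{group}')" for group in user_groups
)
print(filter_expression)
# security_filter/any(g: g eq 'group1') or security_filter/any(g: g eq 'group2')
```

A document is returned only if its `security_filter` collection contains at least one of the caller's groups, so trimming happens inside the query rather than in application code.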

Network Security

Connect to search service over private network:
  • Azure Private Link integration
  • No public internet exposure
  • Network traffic stays on Azure backbone
  • Compatible with VNet peering

Monitoring and Optimization

Search Analytics

Track usage patterns and optimize:
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

logs_client = LogsQueryClient(DefaultAzureCredential())

# Query search logs
query = """
AzureDiagnostics
| where ResourceType == "SEARCHSERVICES"
| where OperationName == "Query.Search"
| summarize 
    QueryCount = count(),
    AvgDuration = avg(DurationMs),
    AvgResultCount = avg(ResultCount)
  by SearchText = Query_s
| order by QueryCount desc
| take 20
"""

response = logs_client.query_workspace(
    workspace_id="your-workspace-id",
    query=query,
    timespan=timedelta(days=7)
)

Performance Tuning

Improve search result quality:
  • Scoring Profiles: Boost fields or apply functions
  • Synonym Maps: Handle terminology variations
  • Custom Analyzers: Language-specific tokenization
  • Semantic Ranking: Deep learning re-ranking
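Conceptually, a scoring profile adds weighted per-field contributions on top of the base text score. A toy illustration of that weighting (not the service's actual scoring code; field names and weights are made up):

```python
def profile_score(field_scores, field_weights):
    """Sum each field's text score multiplied by its profile weight (default 1.0)."""
    return sum(
        score * field_weights.get(field, 1.0)
        for field, score in field_scores.items()
    )

# A title match counts three times as much as a description match.
score = profile_score(
    field_scores={"title": 0.5, "description": 0.5},
    field_weights={"title": 3.0, "description": 1.0},
)
```

This is why boosting `title` promotes documents whose match is in the title without filtering out body-only matches entirely.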
Optimize for throughput and storage:
  • Replicas: Handle more queries per second
  • Partitions: Store more documents
  • Auto-scaling: Adjust capacity based on load
  • Index Optimization: Reduce field count and analyzers

Pricing Tiers

Tier               Storage  Replicas  Partitions  Use Case
Free               50 MB    1         1           Development and testing
Basic              2 GB     3         1           Small production workloads
Standard S1        25 GB    12        12          Most production scenarios
Standard S2        100 GB   12        12          Larger datasets
Standard S3        200 GB   12        12          High-volume queries
Storage Optimized  1-2 TB   12        12          Large document collections
Pricing is based on search units (replicas × partitions). Free tier includes 10,000 documents and 50 MB storage.
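The search-unit arithmetic is direct; for example (illustrative numbers):

```python
def search_units(replicas, partitions):
    """Billable search units are replicas multiplied by partitions."""
    return replicas * partitions

# An S1 service scaled to 3 replicas and 2 partitions bills for 6 search units.
units = search_units(replicas=3, partitions=2)
```

Because the two dimensions multiply, adding a replica to a heavily partitioned service raises cost faster than on a single-partition one.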

Getting Started

1. Create Search Service

Provision a search service in the Azure portal or via CLI.
2. Define Index Schema

Create an index with fields matching your data structure.
3. Load Data

Use indexers or push API to populate the index.
4. Query and Test

Use Search Explorer or SDK to test queries.
5. Integrate

Add search to your application or agent.

Resources

Quickstart

Create your first search index

RAG Tutorial

Build a RAG application

REST API Reference

Complete API documentation

Vector Search Guide

Implement vector search
