Indexers in Azure AI Search
Indexers are crawlers that extract searchable data from supported Azure data sources and populate search indexes automatically.
What are Indexers?
Indexers provide:
- Automated ingestion: Pull data from supported sources
- Field mapping: Map source fields to index fields
- Change detection: Incremental updates
- Scheduling: Periodic refresh (as often as every 5 minutes)
- AI enrichment: Apply skillsets for transformation
Supported Data Sources
Azure Blob Storage
Index documents from blob containers
Azure Cosmos DB
Index from the NoSQL, MongoDB, and Gremlin APIs
Azure SQL
Index from SQL Database and Managed Instance
SharePoint Online
Index documents and sites (preview)
OneLake
Index from Microsoft Fabric lakehouses
Azure Table Storage
Index from Table Storage
Indexer Workflow
Stages
- Document cracking: Open files and extract content
- Field mapping: Map source to destination fields
- Skillset execution: Apply AI skills (optional)
- Output field mapping: Map skill outputs to index fields
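The optional enrichment stages can be sketched as the part of an indexer definition that names a skillset and maps its outputs to index fields. This is a minimal sketch: the helper `enrichment_fragment`, the skillset name, the enrichment path, and the index field are all hypothetical, and the property names assume the REST-style JSON definition of an indexer.

```python
import json


def enrichment_fragment(skillset_name, output_mappings):
    """Build the indexer properties that attach a skillset.

    output_mappings maps an enrichment path (skill output) to an index field.
    """
    return {
        "skillsetName": skillset_name,
        "outputFieldMappings": [
            {"sourceFieldName": path, "targetFieldName": field}
            for path, field in output_mappings.items()
        ],
    }


fragment = enrichment_fragment(
    "demo-skillset",  # hypothetical skillset name
    {"/document/content/keyphrases": "keyphrases"},  # hypothetical path and field
)
print(json.dumps(fragment, indent=2))
```

The fragment would be merged into a full indexer definition alongside the data source and target index names.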
Create an Indexer
1. Create Data Source
2. Create Indexer
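The two steps above can be sketched as REST request bodies built in Python. This is an illustration under assumptions, not the service's definitive contract: the helper names, the resource names, the placeholder connection string, and the `2024-07-01` API version are chosen for the example, and the blob data source type is used only because Blob Storage appears in the list of supported sources.

```python
import json

API_VERSION = "2024-07-01"  # assumed; use the version your service supports


def data_source_definition(name, connection_string, container):
    """Step 1: describe where the indexer reads from (blob storage here)."""
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }


def indexer_definition(name, data_source_name, target_index_name):
    """Step 2: connect a data source to an existing index."""
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
    }


def resource_url(service, collection, name):
    """URL a PUT request would target to create or update the resource."""
    return (
        f"https://{service}.search.windows.net/"
        f"{collection}/{name}?api-version={API_VERSION}"
    )


# All names below are hypothetical placeholders.
source = data_source_definition("docs-source", "<connection-string>", "documents")
indexer = indexer_definition("docs-indexer", source["name"], "docs-index")

print(resource_url("my-service", "datasources", source["name"]))
print(json.dumps(indexer, indent=2))
```

Each body would be sent with an `api-key` header or a bearer token; the sketch stops short of the network call so it can run without a live service.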
Scheduling
Run indexers on a schedule:
- Minimum: PT5M (5 minutes)
- Maximum: P1D (1 day)
- Format: ISO 8601 duration
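The limits above can be checked before a schedule is submitted. The following sketch parses a small subset of ISO 8601 durations (days, hours, minutes, seconds) and rejects intervals outside the 5-minute to 1-day range; `parse_interval` and `validate_interval` are hypothetical helpers, and the `{"interval": ...}` shape assumes the schedule is expressed as JSON.

```python
import re
from datetime import timedelta

# Subset of ISO 8601 durations: P[nD][T[nH][nM][nS]]
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)

MIN_INTERVAL = timedelta(minutes=5)  # PT5M
MAX_INTERVAL = timedelta(days=1)     # P1D


def parse_interval(text):
    """Convert a duration such as 'PT5M' or 'P1D' into a timedelta."""
    match = _DURATION.match(text)
    if not match or text.endswith("T"):
        raise ValueError(f"not a supported ISO 8601 duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        raise ValueError(f"duration has no components: {text!r}")
    return timedelta(**parts)


def validate_interval(text):
    """Return a schedule fragment if the interval is within the allowed range."""
    interval = parse_interval(text)
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError(f"{text!r} is outside the PT5M to P1D range")
    return {"interval": text}


print(validate_interval("PT2H"))
```

Validating locally keeps an out-of-range interval from reaching the service, where it would otherwise surface as a request error.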
Field Mappings
Map source fields to index fields. Built-in mapping functions include:
- base64Encode / base64Decode
- extractTokenAtPosition
- jsonArrayToStringCollection
- urlEncode / urlDecode
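A field mapping can be sketched as a small JSON object that names the source field, the target field, and optionally a mapping function. The helper `field_mapping` is hypothetical, the field names are illustrative, and the set of allowed functions mirrors only the names listed in this section; the service may support others.

```python
import json

# Function names taken from the list in this section.
KNOWN_FUNCTIONS = {
    "base64Encode", "base64Decode", "extractTokenAtPosition",
    "jsonArrayToStringCollection", "urlEncode", "urlDecode",
}


def field_mapping(source, target, function=None, parameters=None):
    """Build one entry for an indexer's fieldMappings array."""
    mapping = {"sourceFieldName": source, "targetFieldName": target}
    if function is not None:
        if function not in KNOWN_FUNCTIONS:
            raise ValueError(f"unknown mapping function: {function}")
        mapping["mappingFunction"] = {"name": function}
        if parameters:
            mapping["mappingFunction"]["parameters"] = parameters
    return mapping


field_mappings = [
    # Encode a blob path so it is safe to use as a document key.
    field_mapping("metadata_storage_path", "id", "base64Encode"),
    # Take the second space-delimited token of a hypothetical "fullName" field.
    field_mapping("fullName", "lastName", "extractTokenAtPosition",
                  {"delimiter": " ", "position": 1}),
]
print(json.dumps(field_mappings, indent=2))
```

The resulting list would be placed under the `fieldMappings` property of an indexer definition.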
AI Enrichment with Skillsets
Apply AI transformations during indexing by referencing a skillset in the indexer definition.
Monitoring
Track indexer execution:
- Status: Success, Failed, InProgress
- Execution history: Past runs and outcomes
- Error details: Failed document information
- Metrics: Documents processed, latency
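The items above can be pulled out of an indexer status response with a small helper. This sketch assumes the response is JSON with a `lastResult` object carrying `status`, `itemsProcessed`, `itemsFailed`, `errors`, `startTime`, and `endTime`; the sample payload is made up for illustration and `summarize_status` is a hypothetical helper.

```python
from datetime import datetime


def _parse_time(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_status(status):
    """Reduce a status response to the fields worth alerting on."""
    last = status.get("lastResult") or {}
    summary = {
        "indexer_status": status.get("status"),
        "last_run_status": last.get("status"),
        "items_processed": last.get("itemsProcessed", 0),
        "items_failed": last.get("itemsFailed", 0),
        "errors": [e.get("errorMessage") for e in last.get("errors", [])],
        "duration_seconds": None,
    }
    if last.get("startTime") and last.get("endTime"):
        elapsed = _parse_time(last["endTime"]) - _parse_time(last["startTime"])
        summary["duration_seconds"] = elapsed.total_seconds()
    return summary


# Made-up sample response, not captured from a real service.
sample = {
    "status": "running",
    "lastResult": {
        "status": "success",
        "itemsProcessed": 120,
        "itemsFailed": 1,
        "errors": [{"key": "doc-17", "errorMessage": "Could not parse document"}],
        "startTime": "2024-05-01T10:00:00Z",
        "endTime": "2024-05-01T10:00:42Z",
    },
}
summary = summarize_status(sample)
print(summary)
```

A scheduled job could call such a helper and raise an alert whenever `items_failed` is non-zero or the last run did not succeed.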
Change Detection
Indexers detect and process only changed documents:
- Azure SQL: High water mark change detection
- Cosmos DB: _ts timestamp
- Blob Storage: Last modified date
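High water mark detection can be illustrated in two parts: the policy object attached to a data source definition, and a toy simulation of how a stored mark selects only newer rows. The helper names and sample rows are hypothetical, and the simulation is a simplified model of the idea rather than the service's actual implementation.

```python
HIGH_WATER_MARK = "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy"


def high_water_mark_policy(column):
    """Policy fragment for a data source's dataChangeDetectionPolicy property."""
    return {"@odata.type": HIGH_WATER_MARK, "highWaterMarkColumnName": column}


def changed_since(rows, column, mark):
    """Return rows whose column value exceeds the mark, plus the new mark."""
    changed = [row for row in rows if mark is None or row[column] > mark]
    new_mark = max((row[column] for row in changed), default=mark)
    return changed, new_mark


# Cosmos DB documents carry a _ts timestamp that can act as the high water mark.
policy = high_water_mark_policy("_ts")

rows = [  # made-up documents
    {"id": "a", "_ts": 100},
    {"id": "b", "_ts": 150},
    {"id": "c", "_ts": 200},
]
first_run, mark = changed_since(rows, "_ts", None)   # initial run sees everything
rows.append({"id": "d", "_ts": 250})
second_run, mark = changed_since(rows, "_ts", mark)  # later run sees only "d"
print([r["id"] for r in second_run], mark)
```

The column used as the mark must increase whenever a row changes, otherwise updates would be missed by this scheme.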
Best Practices
Batch Size
Adjust batch size based on document complexity and size
Error Handling
Configure maxFailedItems and maxFailedItemsPerBatch tolerances
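Batch size and failure tolerances can be sketched together as the `parameters` object of an indexer definition. The helper `indexer_parameters` and the example values are hypothetical; the sketch assumes that a tolerance of -1 means no limit and 0 means stop on the first failure, which should be confirmed against the service documentation.

```python
import json


def indexer_parameters(batch_size=None, max_failed_items=0,
                       max_failed_items_per_batch=0):
    """Build the parameters fragment of an indexer definition."""
    for label, value in (("maxFailedItems", max_failed_items),
                         ("maxFailedItemsPerBatch", max_failed_items_per_batch)):
        if value < -1:
            raise ValueError(f"{label} must be -1 (assumed: no limit) or >= 0")
    params = {
        "maxFailedItems": max_failed_items,
        "maxFailedItemsPerBatch": max_failed_items_per_batch,
    }
    if batch_size is not None:
        if batch_size < 1:
            raise ValueError("batchSize must be at least 1")
        params["batchSize"] = batch_size
    return params


# Smaller batches for large documents, with a modest failure tolerance.
params = indexer_parameters(batch_size=10, max_failed_items=5,
                            max_failed_items_per_batch=2)
print(json.dumps({"parameters": params}, indent=2))
```

Leaving `batch_size` unset keeps whatever default the service applies for the data source type.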
Scheduling
Balance freshness needs with resource utilization
Monitoring
Set up alerts for indexer failures
Next Steps
Skillsets
Add AI enrichment
Blob Indexing
Index from blob storage