GraphRAG integrates with Azure OpenAI to provide enterprise-grade LLM capabilities. This guide walks you through setting up Azure OpenAI for use with GraphRAG.

Prerequisites

Before starting, ensure you have:
  • An active Azure subscription
  • Access to Azure OpenAI Service
  • Deployed models for chat completion and embeddings
  • GraphRAG installed (pip install graphrag)

Azure OpenAI setup

1. Deploy Azure OpenAI models

In the Azure Portal, deploy the required models:
  1. Navigate to your Azure OpenAI resource
  2. Go to “Model deployments”
  3. Deploy a chat model (e.g., gpt-4o, gpt-4-turbo)
  4. Deploy an embedding model (e.g., text-embedding-3-small)
Note your deployment names; you’ll need them for configuration.
2. Gather connection information

Collect the following from your Azure OpenAI resource:
  • Endpoint URL: https://YOUR-RESOURCE-NAME.openai.azure.com/
  • API Key: Found under “Keys and Endpoint”
  • API Version: Use 2024-02-15-preview or later
  • Deployment names: The names you gave your models
3. Initialize GraphRAG project

graphrag init --root ./my-project

Configuration

Environment variables

The recommended approach is to store sensitive information in environment variables.
1. Edit .env file

Open the .env file created by graphrag init:
.env
GRAPHRAG_API_KEY=your-azure-api-key-here
GRAPHRAG_API_BASE=https://your-resource-name.openai.azure.com/
GRAPHRAG_API_VERSION=2024-02-15-preview
Never commit your .env file to version control. Keep your API keys secure.
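Before running the pipeline, it can help to confirm the required variables are actually set. A minimal pre-flight sketch (a hypothetical helper, not part of GraphRAG):

```python
# Hypothetical pre-flight check (not part of GraphRAG): confirm the
# Azure variables from .env are present before kicking off an index run.
REQUIRED = ["GRAPHRAG_API_KEY", "GRAPHRAG_API_BASE", "GRAPHRAG_API_VERSION"]

def missing_vars(env):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]

# In practice you would pass os.environ; a plain dict stands in here.
print(missing_vars({"GRAPHRAG_API_KEY": "abc123"}))
# -> ['GRAPHRAG_API_BASE', 'GRAPHRAG_API_VERSION']
```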
2. Configure settings.yaml

Edit settings.yaml to use Azure OpenAI:
settings.yaml
name: "my-azure-project"

llm:
  type: chat
  model_provider: azure
  model: gpt-4o  # Your chat deployment name
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}
  deployment_name: gpt-4o  # Must match your Azure deployment

embedding:
  type: embedding
  model_provider: azure
  model: text-embedding-3-small  # Your embedding deployment name
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}
  deployment_name: text-embedding-3-small  # Must match your Azure deployment
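The `${...}` values above are environment-variable references that GraphRAG resolves when it loads the file. The substitution behaves roughly like shell-style expansion; a stdlib illustration of the idea (an analogy, not GraphRAG's actual loader):

```python
import os

# Pretend this value came from .env; GraphRAG reads it from the environment.
os.environ["GRAPHRAG_API_BASE"] = "https://my-resource.openai.azure.com/"

# ${VAR} placeholders like those in settings.yaml expand against the
# process environment, shell-style.
raw = "api_base: ${GRAPHRAG_API_BASE}"
print(os.path.expandvars(raw))
# -> api_base: https://my-resource.openai.azure.com/
```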

Rate limiting

Azure OpenAI has token-per-minute (TPM) and requests-per-minute (RPM) limits. Configure these in your settings:
settings.yaml
llm:
  type: chat
  model_provider: azure
  model: gpt-4o
  deployment_name: gpt-4o
  # Rate limits (adjust based on your Azure quota)
  requests_per_minute: 60
  tokens_per_minute: 80000
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}

embedding:
  type: embedding
  model_provider: azure
  model: text-embedding-3-small
  deployment_name: text-embedding-3-small
  # Rate limits for embeddings
  requests_per_minute: 60
  tokens_per_minute: 150000
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}
Check your Azure OpenAI quota in the Azure Portal under your resource’s “Quotas” section.
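Whichever limit is exhausted first caps your effective throughput. A back-of-the-envelope check, using the example quota above and a hypothetical average request size:

```python
def max_effective_rpm(rpm_limit, tpm_limit, avg_tokens_per_request):
    """Effective requests/minute: the request quota or the token quota,
    whichever is exhausted first."""
    token_bound = tpm_limit // avg_tokens_per_request
    return min(rpm_limit, token_bound)

# With 60 RPM and 80,000 TPM, requests averaging ~2,000 tokens hit the
# token quota first: 80000 // 2000 = 40 requests/minute.
print(max_effective_rpm(60, 80_000, 2_000))  # -> 40
```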

Model configurations

Chat models

Different Azure OpenAI chat models have different capabilities and costs:
Latest and most capable model:
llm:
  type: chat
  model_provider: azure
  model: gpt-4o
  deployment_name: gpt-4o
  max_tokens: 4000
  temperature: 0.0
  top_p: 1.0
Best for: Complex reasoning, high-quality extractions

Embedding models

Balanced performance and cost:
embedding:
  type: embedding
  model_provider: azure
  model: text-embedding-3-small
  deployment_name: text-embedding-3-small
Dimensions: 1536
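text-embedding-3-small produces 1536-dimensional vectors, which retrieval typically compares by cosine similarity. A dependency-free illustration of that comparison, using toy 3-dimensional vectors:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy vectors (real text-embedding-3-small vectors have 1536 components).
print(round(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 3))  # -> 1.0
```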

Multiple model configurations

You can use different models for different workflows:
settings.yaml
# Default chat model for most workflows
llm:
  type: chat
  model_provider: azure
  model: gpt-4o
  deployment_name: gpt-4o
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}

# Override for specific workflows
entity_extraction:
  llm:
    type: chat
    model_provider: azure
    model: gpt-4-turbo
    deployment_name: gpt-4-turbo
    # Use same API settings
    api_base: ${GRAPHRAG_API_BASE}
    api_key: ${GRAPHRAG_API_KEY}
    api_version: ${GRAPHRAG_API_VERSION}

summarize_descriptions:
  llm:
    type: chat
    model_provider: azure
    model: gpt-35-turbo
    deployment_name: gpt-35-turbo-16k
    # Cost optimization for summaries
    api_base: ${GRAPHRAG_API_BASE}
    api_key: ${GRAPHRAG_API_KEY}
    api_version: ${GRAPHRAG_API_VERSION}

Azure AI Search integration

For production deployments, use Azure AI Search as your vector store:
settings.yaml
vector_store:
  type: azure_ai_search
  api_key: ${AZURE_SEARCH_API_KEY}
  url: https://your-search-service.search.windows.net
  audience: https://search.azure.com
  
  # Optional: custom index schema
  index_schema:
    entity_description:
      index_name: graphrag-entities
    community_full_content:
      index_name: graphrag-communities
    text_unit_text:
      index_name: graphrag-text-units
1. Create Azure AI Search resource

In the Azure Portal, create a new Azure AI Search service.
2. Get connection details

  • URL: https://YOUR-SERVICE-NAME.search.windows.net
  • API Key: Found in “Keys” section
3. Add to .env

.env
AZURE_SEARCH_API_KEY=your-search-api-key

Testing your configuration

Verify your Azure setup before running a full index:
1. Run dry run

graphrag index --root ./my-project --dry-run --verbose
This validates your configuration without making API calls.
2. Test with small dataset

Create a small test file:
echo "This is a test document for GraphRAG." > ./my-project/input/test.txt
Run indexing:
graphrag index --root ./my-project --verbose
3. Monitor Azure metrics

Check the Azure Portal to verify:
  • API calls are succeeding
  • Token usage is within limits
  • No throttling errors

Cost optimization

Optimize costs when using Azure OpenAI:

1. Use appropriate models

Development

Use GPT-3.5 Turbo for testing and development

Production

Use GPT-4o or GPT-4 Turbo for final production runs

2. Enable caching

settings.yaml
cache:
  type: file
  base_dir: ./cache
Caching stores LLM responses to avoid redundant API calls.
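Conceptually, a file cache keys each request by a hash of its inputs, so an identical prompt with identical parameters is answered from disk instead of triggering a second billed call. A simplified sketch of the idea (not GraphRAG's internal implementation; names here are illustrative):

```python
import hashlib
import json
import os
import tempfile

CACHE_DIR = tempfile.mkdtemp(prefix="graphrag-cache-demo-")  # illustrative

def cache_key(prompt, model, params):
    """Hash the full request, so any change to it busts the cache."""
    payload = json.dumps({"prompt": prompt, "model": model, **params},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_call(prompt, model, params, call_llm):
    path = os.path.join(CACHE_DIR, cache_key(prompt, model, params))
    if os.path.exists(path):
        with open(path) as f:
            return f.read()      # cache hit: no API call, no cost
    response = call_llm(prompt)  # cache miss: one real call
    with open(path, "w") as f:
        f.write(response)
    return response
```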

3. Optimize chunking

settings.yaml
chunking:
  size: 300  # Larger chunks = fewer LLM calls
  overlap: 100
  encoding_model: cl100k_base
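Because chunks overlap, the number of chunks (and therefore extraction calls) scales with document length divided by the stride (size minus overlap). A rough estimate, assuming sizes are measured in tokens:

```python
import math

def estimate_chunks(total_tokens, size, overlap):
    """Approximate chunk count for a sliding window of `size` with `overlap`."""
    if total_tokens <= size:
        return 1
    stride = size - overlap
    return 1 + math.ceil((total_tokens - size) / stride)

# A 10,000-token document with the settings above (size=300, overlap=100):
print(estimate_chunks(10_000, 300, 100))  # -> 50
```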

4. Batch processing

settings.yaml
entity_extraction:
  max_gleanings: 0  # Reduce to 0-1 for cost savings
Reducing max_gleanings can decrease extraction quality. Test with your data to find the right balance.

Authentication methods

API key authentication

The standard method shown above:
.env
GRAPHRAG_API_KEY=your-azure-api-key

Azure AD / Entra ID authentication

For enhanced security, use Azure AD:
settings.yaml
llm:
  type: chat
  model_provider: azure
  model: gpt-4o
  deployment_name: gpt-4o
  api_base: ${GRAPHRAG_API_BASE}
  api_version: ${GRAPHRAG_API_VERSION}
  # Use Azure AD token
  authentication_type: azure_ad
Azure AD authentication requires the azure-identity Python package: pip install azure-identity

Troubleshooting

Common issues

401 Unauthorized
Cause: Invalid API key or authentication failure
Solution:
  • Verify your API key in the Azure Portal
  • Ensure the key is correctly set in .env
  • Check that api_base matches your resource endpoint
404 Not Found
Cause: Incorrect deployment name or endpoint
Solution:
  • Verify deployment_name matches your Azure deployment exactly
  • Check that api_base is your resource endpoint (https://YOUR-RESOURCE-NAME.openai.azure.com/), with no deployment path appended
  • Ensure the model is deployed in the same region as your endpoint
429 Too Many Requests
Cause: Exceeding Azure OpenAI quota
Solution:
  • Reduce requests_per_minute and tokens_per_minute in config
  • Request a quota increase in the Azure Portal
  • Implement retry logic with exponential backoff
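The retry suggestion can be sketched as a generic exponential-backoff wrapper (GraphRAG applies its own retry policy; this only illustrates the pattern, with RuntimeError standing in for a throttling error):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a throttled call, doubling the wait (plus jitter) each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for a 429 throttling error
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```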
Model not found
Cause: Model not deployed or incorrect API version
Solution:
  • Deploy the model in the Azure Portal
  • Use API version 2024-02-15-preview or later
  • Match the model parameter to your deployment name
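To sanity-check the endpoint, deployment name, and API version together, it can help to assemble the request URL the Azure OpenAI REST API expects from your configured values and eyeball it against the Azure Portal:

```python
def chat_completions_url(api_base, deployment_name, api_version):
    """Build the Azure OpenAI chat-completions URL from the config values."""
    base = api_base.rstrip("/")
    return (f"{base}/openai/deployments/{deployment_name}"
            f"/chat/completions?api-version={api_version}")

print(chat_completions_url(
    "https://my-resource.openai.azure.com/", "gpt-4o", "2024-02-15-preview"))
# -> https://my-resource.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-02-15-preview
```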

Validation checklist

  • Azure OpenAI resource is created and active
  • Chat model (e.g., gpt-4o) is deployed
  • Embedding model (e.g., text-embedding-3-small) is deployed
  • API key is copied to .env file
  • api_base URL is correct
  • deployment_name matches Azure deployment exactly
  • model_provider is set to azure
  • Rate limits match your Azure quota
  • Dry run completes without errors

Example complete configuration

Here’s a complete working configuration for Azure:
name: "azure-graphrag-project"

logging:
  directory: "output/logs"
  filename: "app.log"

llm:
  type: chat
  model_provider: azure
  model: gpt-4o
  deployment_name: gpt-4o
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}
  requests_per_minute: 60
  tokens_per_minute: 80000
  max_tokens: 4000
  temperature: 0.0
  top_p: 1.0

embedding:
  type: embedding
  model_provider: azure
  model: text-embedding-3-small
  deployment_name: text-embedding-3-small
  api_base: ${GRAPHRAG_API_BASE}
  api_key: ${GRAPHRAG_API_KEY}
  api_version: ${GRAPHRAG_API_VERSION}
  requests_per_minute: 60
  tokens_per_minute: 150000

chunking:
  size: 300
  overlap: 100
  encoding_model: cl100k_base

cache:
  type: file
  base_dir: ./cache

snapshots:
  graphml: true

storage:
  type: file
  base_dir: ./output

vector_store:
  type: azure_ai_search
  api_key: ${AZURE_SEARCH_API_KEY}
  url: ${AZURE_SEARCH_URL}

Next steps

Configuration reference

Explore all configuration options

CLI usage

Learn GraphRAG commands

Best practices

Optimize your implementation

Migration guide

Upgrade between versions
