Azure deployment

This guide walks you through deploying GraphRAG using Azure services, including Azure OpenAI for language models and Azure Storage for scalable data management.

Why Azure?

Azure provides enterprise-grade features for GraphRAG deployments:

Managed OpenAI - Enterprise SLAs, with keyless authentication available through managed identities
Scalable storage - Blob Storage and Cosmos DB integration
Security - Managed identities, VNet integration, private endpoints
Compliance - Meet regulatory requirements with Azure’s certifications
Cost management - Detailed billing and budget controls

Prerequisites

Azure subscription

Ensure you have an active Azure subscription with permission to create resources and role assignments.

Azure CLI

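Install the Azure CLI. The command below uses Homebrew on macOS; for other platforms, follow the Azure CLI installation documentation: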

brew install azure-cli

Authenticate with your Azure account:

az login
az account set --subscription "Your Subscription Name"
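The commands in the rest of this guide place every resource in a resource group named graphrag-resources, which has to exist before anything is created in it. A minimal sketch for creating it only when it is missing, assuming the eastus location used later in this guide; ensure_resource_group is a hypothetical helper name, not an Azure CLI command:

```shell
# Sketch: create the resource group only if it does not exist yet.
# ensure_resource_group is a hypothetical helper, not part of the Azure CLI.
ensure_resource_group() {
  rg_name="$1"
  rg_location="$2"
  # `az group exists` prints "true" or "false".
  if [ "$(az group exists --name "$rg_name")" = "true" ]; then
    echo "Resource group $rg_name already exists"
  else
    az group create --name "$rg_name" --location "$rg_location" --output none
    echo "Created resource group $rg_name in $rg_location"
  fi
}

# Usage (requires a signed-in Azure CLI):
#   ensure_resource_group graphrag-resources eastus
```

Checking first keeps the step safe to re-run from a setup script.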

Set up Azure OpenAI

Create Azure OpenAI resource

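Create an Azure OpenAI service instance in the graphrag-resources resource group, which must already exist. The --custom-domain value becomes part of the endpoint URL and must be globally unique: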

az cognitiveservices account create \
  --name graphrag-openai \
  --resource-group graphrag-resources \
  --location eastus \
  --kind OpenAI \
  --sku S0 \
  --custom-domain graphrag-openai

Deploy models

Deploy the required models (chat and embeddings):

# Deploy GPT-4 for chat
az cognitiveservices account deployment create \
  --name graphrag-openai \
  --resource-group graphrag-resources \
  --deployment-name gpt-4-deployment \
  --model-name gpt-4 \
  --model-version "0613" \
  --model-format OpenAI \
  --sku-capacity 10 \
  --sku-name "Standard"

# Deploy text-embedding-3-small for embeddings
az cognitiveservices account deployment create \
  --name graphrag-openai \
  --resource-group graphrag-resources \
  --deployment-name embedding-deployment \
  --model-name text-embedding-3-small \
  --model-version "1" \
  --model-format OpenAI \
  --sku-capacity 10 \
  --sku-name "Standard"

Retrieve endpoint and key

Get your Azure OpenAI endpoint and API key:

# Get endpoint
az cognitiveservices account show \
  --name graphrag-openai \
  --resource-group graphrag-resources \
  --query "properties.endpoint" \
  --output tsv

# Get API key
az cognitiveservices account keys list \
  --name graphrag-openai \
  --resource-group graphrag-resources \
  --query "key1" \
  --output tsv

Configure Azure Storage

Create storage account

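Create an Azure Storage account for your data. Storage account names must be globally unique and may contain only 3 to 24 lowercase letters and digits, so replace graphragstorage with a name of your own: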

az storage account create \
  --name graphragstorage \
  --resource-group graphrag-resources \
  --location eastus \
  --sku Standard_LRS \
  --kind StorageV2
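Storage account names are restricted to 3 to 24 lowercase letters and digits, so a name such as graphrag-storage is rejected. A small sketch that checks the format locally before calling Azure; is_valid_storage_account_name is a hypothetical helper, and it does not check global uniqueness:

```shell
# Sketch: validate the format of a storage account name locally.
# is_valid_storage_account_name is a hypothetical helper name.
is_valid_storage_account_name() {
  name="$1"
  length=${#name}
  # Names must be between 3 and 24 characters long.
  if [ "$length" -lt 3 ] || [ "$length" -gt 24 ]; then
    return 1
  fi
  # Strip every allowed character; anything left over is invalid.
  leftover=$(printf '%s' "$name" | LC_ALL=C tr -d 'a-z0-9')
  [ -z "$leftover" ]
}

# Usage:
#   is_valid_storage_account_name graphragstorage && echo "format ok"
```

To check whether a well-formed name is still available, az storage account check-name --name can be run against the candidate.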

Create containers

Create containers for input, output, and cache data:

# Get connection string
CONNECTION_STRING=$(az storage account show-connection-string \
  --name graphragstorage \
  --resource-group graphrag-resources \
  --query "connectionString" \
  --output tsv)

# Create containers
az storage container create \
  --name graphrag-input \
  --connection-string "$CONNECTION_STRING"

az storage container create \
  --name graphrag-output \
  --connection-string "$CONNECTION_STRING"

# Only needed if you use the optional blob cache in settings.yaml
az storage container create \
  --name graphrag-cache \
  --connection-string "$CONNECTION_STRING"

Configure GraphRAG for Azure

Update environment variables

Create or update your .env file:

.env

GRAPHRAG_API_KEY=your-azure-openai-api-key
AZURE_STORAGE_CONNECTION_STRING=your-storage-connection-string
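Because the .env file holds two secrets, it helps to generate it with restrictive permissions rather than paste values by hand. A minimal sketch, assuming the two variable names shown above; write_env_file is a hypothetical helper name:

```shell
# Sketch: write the two secrets to an env file readable only by the owner.
# write_env_file is a hypothetical helper name.
write_env_file() {
  env_path="$1"
  api_key="$2"
  storage_connection="$3"
  # Use a restrictive umask in a subshell so a newly created file is owner-only.
  (
    umask 077
    printf 'GRAPHRAG_API_KEY=%s\nAZURE_STORAGE_CONNECTION_STRING=%s\n' \
      "$api_key" "$storage_connection" > "$env_path"
  )
  # Also tighten permissions in case the file already existed.
  chmod 600 "$env_path"
}

# Usage, with values captured from the Azure CLI commands earlier in this guide:
#   write_env_file .env "$AZURE_OPENAI_KEY" "$CONNECTION_STRING"
```

AZURE_OPENAI_KEY in the usage comment is a placeholder for wherever you stored the key1 value. Keep the resulting file out of version control.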

Configure settings.yaml

Update settings.yaml with Azure-specific configuration:

settings.yaml

# Azure OpenAI Configuration
completion_models:
  default_completion_model:
    type: chat
    model_provider: azure
    model: gpt-4
    deployment_name: gpt-4-deployment
    api_base: https://graphrag-openai.openai.azure.com
    api_version: 2024-02-15-preview
    api_key: ${GRAPHRAG_API_KEY}

embedding_models:
  default_embedding_model:
    type: embedding
    model_provider: azure
    model: text-embedding-3-small
    deployment_name: embedding-deployment
    api_base: https://graphrag-openai.openai.azure.com
    api_version: 2024-02-15-preview
    api_key: ${GRAPHRAG_API_KEY}

# Azure Blob Storage for Input
input:
  storage:
    type: blob
    connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
    container_name: graphrag-input
  type: text
  file_pattern: .*\.txt$

# Azure Blob Storage for Output
output:
  type: blob
  connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
  container_name: graphrag-output

# Optional: Azure Blob Storage for Cache
cache:
  type: json
  storage:
    type: blob
    connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
    container_name: graphrag-cache
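The configuration above reads GRAPHRAG_API_KEY and AZURE_STORAGE_CONNECTION_STRING through ${...} references, so a missing variable may only surface as an authentication error partway through a run. A preflight sketch that lists unset or empty variables before indexing starts; missing_env_vars is a hypothetical helper, and it only sees variables set in the current shell, not values that exist solely in the .env file:

```shell
# Sketch: print the names of required variables that are unset or empty.
# missing_env_vars is a hypothetical helper name.
missing_env_vars() {
  missing=""
  for var_name in "$@"; do
    # Indirect lookup of the variable named by $var_name (POSIX-compatible).
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
      missing="$missing $var_name"
    fi
  done
  # Trim the leading space before printing.
  echo "${missing# }"
}

# Usage:
#   missing=$(missing_env_vars GRAPHRAG_API_KEY AZURE_STORAGE_CONNECTION_STRING)
#   [ -z "$missing" ] || { echo "Missing: $missing" >&2; exit 1; }
```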

Using managed identity (recommended)

For production deployments, use Azure Managed Identity instead of API keys:

Create managed identity

Create a user-assigned managed identity:

az identity create \
  --name graphrag-identity \
  --resource-group graphrag-resources

Grant permissions

Grant the managed identity access to the Azure OpenAI resource and the storage account:

# Get principal ID
PRINCIPAL_ID=$(az identity show \
  --name graphrag-identity \
  --resource-group graphrag-resources \
  --query "principalId" \
  --output tsv)

# Assign Cognitive Services User role
az role assignment create \
  --role "Cognitive Services User" \
  --assignee "$PRINCIPAL_ID" \
  --scope "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.CognitiveServices/accounts/graphrag-openai"

# Assign Storage Blob Data Contributor role
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee "$PRINCIPAL_ID" \
  --scope "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.Storage/storageAccounts/graphragstorage"
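The --scope values above contain a YOUR_SUBSCRIPTION_ID placeholder that is easy to mistype. A sketch that builds both resource IDs from their parts so the subscription ID can come from az account show; openai_scope and storage_scope are hypothetical helper names:

```shell
# Sketch: build the resource IDs used as role assignment scopes.
# openai_scope and storage_scope are hypothetical helper names.
# Arguments: subscription ID, resource group, resource name.
openai_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.CognitiveServices/accounts/%s\n' \
    "$1" "$2" "$3"
}

storage_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$1" "$2" "$3"
}

# Usage (requires a signed-in Azure CLI):
#   SUBSCRIPTION_ID=$(az account show --query id --output tsv)
#   az role assignment create \
#     --role "Cognitive Services User" \
#     --assignee "$PRINCIPAL_ID" \
#     --scope "$(openai_scope "$SUBSCRIPTION_ID" graphrag-resources graphrag-openai)"
```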

Update configuration

Modify settings.yaml to use managed identity. The completion model is shown here; make the same change under embedding_models:

completion_models:
  default_completion_model:
    type: chat
    model_provider: azure
    model: gpt-4
    deployment_name: gpt-4-deployment
    api_base: https://graphrag-openai.openai.azure.com
    api_version: 2024-02-15-preview
    auth_method: azure_managed_identity  # Use managed identity
    # Remove api_key line

Authenticate Azure CLI

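When running on an Azure resource that has the identity attached, such as a VM or a container, sign in to the Azure CLI with that identity. A user-assigned identity typically also needs its client ID passed to the command; check az login --help for the option your CLI version supports.

az login --identity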

Deploy to Azure Container Instances

Run GraphRAG in Azure Container Instances for scheduled indexing:

Create Dockerfile

Dockerfile

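FROM python:3.11-slim

WORKDIR /app

# Install GraphRAG without keeping pip's download cache in the image
RUN pip install --no-cache-dir graphrag

# Copy configuration only. Do not copy .env into the image:
# secrets are supplied as environment variables at run time.
COPY settings.yaml .

CMD ["graphrag", "index", "--root", "/app"]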

Build and push image

Build and push to Azure Container Registry:

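# Create container registry
# The admin user is enabled so that az acr credential show works in the next step
az acr create \
  --name graphragregistry \
  --resource-group graphrag-resources \
  --sku Basic \
  --admin-enabled true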

# Build and push
az acr build \
  --registry graphragregistry \
  --image graphrag:latest .

Deploy to ACI

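Create a container instance. The restart policy is set to Never so that the indexing job runs once instead of restarting after it exits, and the secrets are passed as secure environment variables so that they are not exposed in the container's properties:

az container create \
  --resource-group graphrag-resources \
  --name graphrag-indexer \
  --image graphragregistry.azurecr.io/graphrag:latest \
  --os-type Linux \
  --cpu 2 \
  --memory 4 \
  --restart-policy Never \
  --registry-login-server graphragregistry.azurecr.io \
  --registry-username "$(az acr credential show --name graphragregistry --query username -o tsv)" \
  --registry-password "$(az acr credential show --name graphragregistry --query "passwords[0].value" -o tsv)" \
  --secure-environment-variables \
    GRAPHRAG_API_KEY="$GRAPHRAG_API_KEY" \
    AZURE_STORAGE_CONNECTION_STRING="$CONNECTION_STRING"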

Optional: Azure Cosmos DB storage

To store output in Azure Cosmos DB instead of Blob Storage, create an account and point the output section at it:

Create Cosmos DB account

az cosmosdb create \
  --name graphrag-cosmos \
  --resource-group graphrag-resources \
  --kind GlobalDocumentDB

Configure in settings.yaml

output:
  type: cosmosdb
  connection_string: ${COSMOS_CONNECTION_STRING}
  database_name: graphrag
  container_name: output
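The output section above expects a COSMOS_CONNECTION_STRING variable, which this guide does not otherwise set. One way to obtain it is a sketch around az cosmosdb keys list; cosmos_connection_string is a hypothetical helper name, and the query assumes the first connection string in the returned list is the primary read-write one:

```shell
# Sketch: print the first connection string of a Cosmos DB account.
# cosmos_connection_string is a hypothetical helper name.
# Arguments: account name, resource group.
cosmos_connection_string() {
  az cosmosdb keys list \
    --name "$1" \
    --resource-group "$2" \
    --type connection-strings \
    --query "connectionStrings[0].connectionString" \
    --output tsv
}

# Usage (requires a signed-in Azure CLI):
#   COSMOS_CONNECTION_STRING=$(cosmos_connection_string graphrag-cosmos graphrag-resources)
#   export COSMOS_CONNECTION_STRING
```

Add the value to the .env file alongside the other secrets rather than writing it into settings.yaml.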

Cost optimization

Model selection

Choose cost-effective models:

Use gpt-3.5-turbo instead of gpt-4 for initial testing
Use text-embedding-3-small instead of text-embedding-3-large

Rate limiting

Configure rate limits to control costs:

completion_models:
  default_completion_model:
    rate_limit:
      requests_per_period: 60
      period_in_seconds: 60
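With requests_per_period set to 60 and period_in_seconds set to 60, the model is called at most about once per second on average, which also sets a floor on how long an indexing run takes. A rough arithmetic sketch of that floor; estimate_min_seconds is a hypothetical helper, and the estimate ignores token limits, retries, and request latency:

```shell
# Sketch: lower bound on run time implied by a request rate limit.
# estimate_min_seconds is a hypothetical helper name.
# Arguments: total requests, requests_per_period, period_in_seconds.
estimate_min_seconds() {
  total_requests="$1"
  requests_per_period="$2"
  period_in_seconds="$3"
  # total / (requests per second), rounded up with integer arithmetic.
  echo $(( (total_requests * period_in_seconds + requests_per_period - 1) / requests_per_period ))
}

# Usage: 6000 requests at 60 requests per 60 seconds
estimate_min_seconds 6000 60 60   # prints 6000 (seconds, i.e. 100 minutes)
```

Lowering the limit caps spend per minute but lengthens the run proportionally, so it is worth estimating both before indexing a large corpus.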

Monitoring and logging

Enable diagnostics

Enable diagnostic logging for Azure OpenAI:

az monitor diagnostic-settings create \
  --name graphrag-diagnostics \
  --resource "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.CognitiveServices/accounts/graphrag-openai" \
  --logs '[{"category": "RequestResponse", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]' \
  --workspace YOUR_LOG_ANALYTICS_WORKSPACE_ID

Set up alerts

Create alerts for cost and performance:

az monitor metrics alert create \
  --name high-token-usage \
  --resource-group graphrag-resources \
  --scopes "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.CognitiveServices/accounts/graphrag-openai" \
  --condition "total ProcessedPromptTokens > 1000000" \
  --description "Alert when token usage exceeds threshold"

Security best practices

Use managed identities

Avoid storing credentials; use Azure Managed Identity instead

Private endpoints

Configure private endpoints for Azure services

Network security

Implement VNet integration and firewall rules

Key rotation

Automate API key rotation using Key Vault

Next steps

Multi-lingual support

Deploy GraphRAG for multiple languages

Enterprise knowledge

Enterprise deployment patterns

Configuration reference

Complete configuration guide

Azure documentation

Azure OpenAI documentation
