Overview
Azure support in Metaflow includes:
- Azure Blob Storage: Scalable object storage for artifacts and data
- Azure Key Vault: Secure secrets and credential management
- Azure Container Registry (ACR): Container image storage
- Kubernetes: Compute execution on Azure Kubernetes Service (AKS)
Azure compute is provided via Kubernetes. See the Kubernetes documentation for compute configuration.
Setup
Prerequisites
- Azure account with active subscription
- Azure CLI installed and configured
- Metaflow installed:
  pip install metaflow
- Azure SDK packages:
  pip install azure-storage-blob azure-identity azure-keyvault-secrets
Authentication
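Metaflow uses Azure DefaultAzureCredential, which supports multiple authentication methods, including service principal credentials in environment variables, managed identity, and Azure CLI login.
Azure Blob Storage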
Configure Metaflow to use Azure Blob Storage as the datastore.
Configuration
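As a minimal sketch, the datastore settings can be supplied as environment variables before Metaflow is imported. The variable names are the ones listed in the Configuration Reference on this page; the helper name azure_datastore_env and the account, container, and prefix values are placeholders, and the endpoint pattern assumes the public Azure cloud.

```python
import os


def azure_datastore_env(account: str, container: str, prefix: str) -> dict:
    """Build the settings that point Metaflow's datastore at Azure Blob Storage."""
    return {
        "METAFLOW_DEFAULT_DATASTORE": "azure",
        # Sysroot is "<container>/<prefix>" with no leading or trailing slash.
        "METAFLOW_DATASTORE_SYSROOT_AZURE": f"{container}/{prefix}",
        # Assumption: public-cloud endpoint pattern for the storage account.
        "METAFLOW_AZURE_STORAGE_BLOB_SERVICE_ENDPOINT": (
            f"https://{account}.blob.core.windows.net"
        ),
    }


settings = azure_datastore_env("account", "mycontainer", "metaflow")
# Export the settings so that a later `import metaflow` in this process sees them.
os.environ.update(settings)
```

The same keys can usually be kept in a Metaflow configuration file instead; environment variables are used here only because they are easy to show in a few lines.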
Storage Account Setup
Blob Storage Path Format
Metaflow uses the path format container/prefix for Azure Blob Storage:
- mycontainer/metaflow: Container “mycontainer” with prefix “metaflow”
- production/workflows: Container “production” with prefix “workflows”
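The container/prefix convention can be illustrated with a small helper. This is only a sketch of the convention; split_blob_path is a hypothetical name, not Metaflow's own parser.

```python
def split_blob_path(path: str) -> tuple:
    """Split "container/prefix" into (container, prefix).

    Everything before the first slash is the container; the rest is the
    prefix, which may itself contain further slashes.
    """
    container, _, prefix = path.partition("/")
    return container, prefix


print(split_blob_path("mycontainer/metaflow"))
print(split_blob_path("production/workflows/team-a"))
```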
Using Blob Storage in Code
Metaflow automatically handles artifact storage: any value assigned to self in a step is persisted to the configured datastore.
Direct Blob Access
For direct blob operations, use the Azure SDK.
Azure Key Vault
Securely manage secrets and credentials using Azure Key Vault.
Configuration
Key Vault Setup
Using Secrets
Metaflow’s @secrets decorator integrates with Azure Key Vault.
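A hedged sketch of how the decorator's sources argument might be assembled. The provider type string "az-key-vault" is an assumption about how Metaflow registers the Azure provider, and key_vault_sources is a hypothetical helper. Simple names such as my-api-key rely on METAFLOW_AZURE_KEY_VAULT_PREFIX being set.

```python
def key_vault_sources(*secret_ids: str) -> list:
    """Build a `sources` list for @secrets, one entry per Key Vault secret."""
    # Assumption: "az-key-vault" is the provider type for Azure Key Vault.
    return [{"type": "az-key-vault", "id": secret_id} for secret_id in secret_ids]


SOURCES = key_vault_sources("my-api-key", "DatabasePassword")
print(SOURCES)

# Sketch of use inside a flow (needs metaflow installed, so left as a comment):
#
#   from metaflow import FlowSpec, secrets, step
#
#   class SecretFlow(FlowSpec):
#       @secrets(sources=SOURCES)
#       @step
#       def start(self):
#           # Fetched secrets are exposed as environment variables in this step.
#           self.next(self.end)
#
#       @step
#       def end(self):
#           pass
```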
Secret Naming
Azure Key Vault secret names must follow these rules:
- 1-127 characters
- Start with a letter
- Contain only alphanumeric characters and hyphens
- Examples: my-api-key, DatabasePassword, api-key-v2
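The naming rules above translate directly into a regular expression. The following is a sketch for checking names locally before calling Key Vault; is_valid_secret_name is a hypothetical helper name.

```python
import re

# 1-127 characters: the first is a letter, the rest are letters, digits, or hyphens.
SECRET_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,126}")


def is_valid_secret_name(name: str) -> bool:
    """Return True if `name` satisfies the secret naming rules listed above."""
    return SECRET_NAME_RE.fullmatch(name) is not None


for candidate in ("my-api-key", "DatabasePassword", "api-key-v2", "api_key", "2fast"):
    print(candidate, is_valid_secret_name(candidate))
```

Underscores are a common cause of rejected names, since they are valid in environment variable names but not in Key Vault secret names.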
Secret ID Formats
Metaflow supports multiple secret ID formats:
1. Simple Name (Requires Prefix)
2. Name with Version
3. Full URL
4. Full URL with Version
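To make the four formats concrete, here is a sketch that classifies a secret ID by shape. It assumes that a short versioned ID is written as name/version and that full URLs follow Key Vault's https://<vault>.vault.azure.net/secrets/<name>[/<version>] layout. classify_secret_id is a hypothetical helper, and the version string in the example is a placeholder.

```python
from urllib.parse import urlparse


def classify_secret_id(secret_id: str) -> str:
    """Return which of the four secret ID formats a string looks like."""
    if secret_id.startswith("https://"):
        # Expected URL path: /secrets/<name> or /secrets/<name>/<version>.
        parts = [p for p in urlparse(secret_id).path.split("/") if p]
        if len(parts) == 2 and parts[0] == "secrets":
            return "full URL"
        if len(parts) == 3 and parts[0] == "secrets":
            return "full URL with version"
        raise ValueError(f"unrecognized Key Vault URL: {secret_id}")
    # Assumed convention for short IDs: "<name>" or "<name>/<version>".
    name, sep, version = secret_id.partition("/")
    if not sep:
        return "simple name"
    if name and version and "/" not in version:
        return "name with version"
    raise ValueError(f"unrecognized secret ID: {secret_id}")


VERSION = "0123456789abcdef0123456789abcdef"  # placeholder version identifier
print(classify_secret_id("my-api-key"))
print(classify_secret_id(f"my-api-key/{VERSION}"))
print(classify_secret_id("https://my-kv.vault.azure.net/secrets/my-api-key"))
print(classify_secret_id(f"https://my-kv.vault.azure.net/secrets/my-api-key/{VERSION}"))
```

Only the first two formats depend on METAFLOW_AZURE_KEY_VAULT_PREFIX, since a full URL already names the vault.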
Custom Environment Variable Names
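Key Vault names allow hyphens, which environment variable names do not, so a secret usually needs an explicit variable name. The sketch below assumes that the @secrets source accepts an options entry with an env_var_name key and that the provider type is "az-key-vault". Both are assumptions to check against the Metaflow version in use, and key_vault_source is a hypothetical helper.

```python
def key_vault_source(secret_id: str, env_var_name: str) -> dict:
    """Build one @secrets source that is exposed under a chosen variable name."""
    # isidentifier() rejects hyphens, spaces, and names starting with a digit.
    if not env_var_name.isidentifier():
        raise ValueError(f"not a usable environment variable name: {env_var_name!r}")
    return {
        "type": "az-key-vault",  # assumed provider type for Azure Key Vault
        "id": secret_id,
        "options": {"env_var_name": env_var_name},  # assumed option key
    }


source = key_vault_source("my-api-key", "MY_API_KEY")
print(source)
```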
Azure Container Registry
Use Azure Container Registry for custom Docker images.
Setup
Using ACR Images
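A small sketch of how an ACR image reference is typically formed and handed to a step. The registry, repository, and tag values are placeholders and acr_image is a hypothetical helper; the <registry>.azurecr.io login server pattern applies to the public Azure cloud.

```python
def acr_image(registry: str, repository: str, tag: str = "latest") -> str:
    """Return a fully qualified image reference on an ACR login server."""
    return f"{registry}.azurecr.io/{repository}:{tag}"


IMAGE = acr_image("myregistry", "metaflow-custom", "1.0")
print(IMAGE)

# Sketch of use in a flow (needs metaflow and a configured cluster, so left as a comment):
#
#   from metaflow import kubernetes, step
#
#   @kubernetes(image=IMAGE)
#   @step
#   def train(self):
#       ...
```

The cluster also needs permission to pull from the registry, which is covered under Container Registry Access.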
Kubernetes Compute
For compute execution on Azure, use Azure Kubernetes Service (AKS) with Metaflow. See the Kubernetes documentation for detailed AKS setup and configuration.
Azure RBAC Permissions
Required Azure role assignments:
Storage Access
- Microsoft.Storage/storageAccounts/blobServices/containers/read
- Microsoft.Storage/storageAccounts/blobServices/containers/write
- Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action
Key Vault Access
Container Registry Access
Best Practices
Storage Optimization
- Use appropriate storage tier (Hot, Cool, Archive)
- Implement lifecycle management policies
- Enable soft delete for recovery
- Use managed identities instead of connection strings
- Monitor storage costs with Azure Cost Management
Security
- Always use Azure RBAC over access keys
- Enable storage account firewall rules
- Use private endpoints for storage access
- Rotate Key Vault secrets regularly
- Enable Azure Defender for storage
Performance
- Choose a storage account in the same region as compute
- Use appropriate workload type setting
- Enable blob versioning for important data
- Use blob index tags for efficient queries
Monitoring
- Enable diagnostic logging for storage
- Monitor Key Vault access logs
- Set up alerts for authentication failures
- Track storage metrics and capacity
Troubleshooting
Authentication Errors
Problem: ClientAuthenticationError when accessing storage or Key Vault
Solutions:
- Verify Azure CLI login: az account show
- Check service principal credentials are set correctly
- Ensure managed identity is enabled on Azure resource
- Verify RBAC role assignments
- Check Azure AD token hasn’t expired
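Two of the checks above can be sketched in Python. The first only inspects the environment variables that azure-identity reads for a service principal with a client secret. The second tries to obtain a storage token through DefaultAzureCredential and needs the azure-identity package. The function names are hypothetical.

```python
import os

# Variables read by azure-identity for a service principal with a client secret.
SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_vars(env=None) -> list:
    """Return the service principal variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in SERVICE_PRINCIPAL_VARS if not env.get(name)]


def can_get_storage_token() -> bool:
    """Try to obtain an Azure Storage token via DefaultAzureCredential."""
    # Imported lazily so the check above works without the Azure SDK installed.
    from azure.core.exceptions import ClientAuthenticationError
    from azure.identity import DefaultAzureCredential

    try:
        DefaultAzureCredential().get_token("https://storage.azure.com/.default")
        return True
    except ClientAuthenticationError:
        return False


print(missing_service_principal_vars({"AZURE_TENANT_ID": "example-tenant"}))
```

A token being issued only shows that authentication works; access can still be denied if the identity lacks the RBAC role assignments.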
Blob Storage Access Denied
Problem: Cannot read or write blobs
Solutions:
- Verify Storage Blob Data Contributor role is assigned
- Check storage account firewall rules
- Ensure container name is correct
- Verify path format (no leading/trailing slashes)
- Check if storage account requires private endpoint access
Key Vault Secret Not Found
Problem: Secret retrieval fails
Solutions:
- Verify secret name follows naming rules
- Check Key Vault access policy or RBAC permissions
- Ensure METAFLOW_AZURE_KEY_VAULT_PREFIX is set correctly
- Verify Key Vault firewall allows access
- Check secret hasn’t been deleted (check soft-delete)
Invalid Path Format
Problem: ValueError when parsing blob path
Solutions:
- Remove https:// prefix from path
- Remove leading and trailing slashes
- Use format: container/prefix not container/prefix/
- Check for consecutive slashes in path
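The checks in this list can be run locally before Metaflow sees the path. This is a sketch that mirrors the list; it is not Metaflow's own validation, and blob_path_problems is a hypothetical helper.

```python
def blob_path_problems(path: str) -> list:
    """List the formatting problems found in a "container/prefix" path."""
    problems = []
    if "://" in path:
        problems.append("remove the URL scheme such as https://")
    if path.startswith("/"):
        problems.append("remove the leading slash")
    if path.endswith("/"):
        problems.append("remove the trailing slash")
    # Ignore the scheme separator so "https://" is not reported twice.
    without_scheme = path.split("://", 1)[-1]
    if "//" in without_scheme:
        problems.append("remove consecutive slashes")
    return problems


for candidate in ("mycontainer/metaflow", "mycontainer/metaflow/", "/a//b"):
    print(candidate, blob_path_problems(candidate))
```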
Configuration Reference
Environment Variables
| Variable | Description | Example |
|---|---|---|
| METAFLOW_DEFAULT_DATASTORE | Set to “azure” | azure |
| METAFLOW_DATASTORE_SYSROOT_AZURE | Container and path | mycontainer/metaflow |
| METAFLOW_AZURE_STORAGE_BLOB_SERVICE_ENDPOINT | Custom endpoint URL | https://account.blob.core.windows.net |
| METAFLOW_AZURE_STORAGE_WORKLOAD_TYPE | Workload optimization | default, highCpu, highMemory |
| METAFLOW_AZURE_KEY_VAULT_PREFIX | Key Vault URL | https://my-kv.vault.azure.net |
| METAFLOW_DEFAULT_AZURE_CLIENT_PROVIDER | Auth provider | azure-default |
Next Steps
- Kubernetes on AKS: Set up compute on Azure Kubernetes Service
- Argo Workflows: Deploy production workflows on Azure
- Multi-Cloud Overview: Compare cloud platform features
- Secrets Management: Advanced secrets management patterns
