
Overview

vLLora provides unified access to multiple AI providers through a single OpenAI-compatible API. Each provider is implemented as a Rust module with support for streaming, tool calling, and provider-specific capabilities such as vision and multi-modal input.

Supported Providers

OpenAI

GPT-3.5, GPT-4, GPT-4o series with Azure OpenAI support

Anthropic

Claude 3 Opus, Sonnet, and Haiku models

Google Gemini

Gemini Pro and Ultra with Vertex AI integration

AWS Bedrock

Multi-model access including Claude, Llama, and Titan

Provider Architecture

All providers implement a common ModelInstance trait defined in llm/src/types/instance.rs:
#[async_trait]
pub trait ModelInstance: Send + Sync {
    async fn execute(
        &self,
        messages: Vec<Message>,
        sender: Option<Sender<ModelEvent>>,
    ) -> LLMResult<ChatCompletionMessageWithFinishReason>;

    async fn execute_stream(
        &self,
        messages: Vec<Message>,
    ) -> LLMResult<Pin<Box<dyn Stream<Item = Result<ChatCompletionChunk, ModelError>> + Send>>>;
}
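To illustrate the shape of this contract without the async machinery, here is a synchronous, dependency-free simplification: every provider exposes one call that turns a message history into a completion. The `EchoModel` type and the `String`-based messages are illustrative stand-ins, not vLLora types; the real trait is async and also supports streaming.

```rust
// Simplified, synchronous analogue of the ModelInstance trait above.
// (Illustrative only: the real trait is async and uses richer types.)
trait ModelInstance {
    fn execute(&self, messages: Vec<String>) -> Result<String, String>;
}

// A toy provider that just returns the last message it was given.
struct EchoModel;

impl ModelInstance for EchoModel {
    fn execute(&self, messages: Vec<String>) -> Result<String, String> {
        messages
            .last()
            .cloned()
            .ok_or_else(|| "empty message history".to_string())
    }
}

fn main() {
    let model = EchoModel;
    assert_eq!(model.execute(vec!["hi".into()]).unwrap(), "hi");
    assert!(model.execute(vec![]).is_err());
}
```

Because every provider implements the same trait, the gateway can dispatch to any of them behind a single dynamic interface.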

OpenAI Provider

Implementation

Located in llm/src/provider/openai/mod.rs:
pub fn openai_client(
    credentials: Option<&ApiKeyCredentials>,
    endpoint: Option<&str>,
) -> Result<Client<OpenAIConfig>, ModelError> {
    let api_key = if let Some(credentials) = credentials {
        credentials.api_key.clone()
    } else {
        std::env::var("VLLORA_OPENAI_API_KEY")
            .map_err(|_| AuthorizationError::InvalidApiKey)?
    };

    let mut config = OpenAIConfig::new();
    config = config.with_api_key(api_key);

    if let Some(endpoint) = endpoint {
        config = config.with_api_base(endpoint);
    }

    Ok(Client::with_config(config))
}
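The key-resolution pattern above (explicit credentials first, environment variable fallback) can be exercised in isolation. This is a hypothetical simplification using plain `String` errors rather than vLLora's `ModelError`:

```rust
use std::env;

// Simplified sketch of the credential fallback shown above: prefer
// explicitly supplied credentials, otherwise read an environment
// variable (VLLORA_OPENAI_API_KEY in the real provider).
fn resolve_api_key(explicit: Option<&str>, env_var: &str) -> Result<String, String> {
    match explicit {
        Some(key) => Ok(key.to_string()),
        None => env::var(env_var).map_err(|_| format!("{} is not set", env_var)),
    }
}

fn main() {
    // Explicit credentials always win.
    assert_eq!(
        resolve_api_key(Some("sk-test"), "VLLORA_OPENAI_API_KEY").unwrap(),
        "sk-test"
    );
    // With no credentials and no environment variable, resolution fails.
    assert!(resolve_api_key(None, "VLLORA_SURELY_UNSET_VAR").is_err());
}
```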

Azure OpenAI Support

vLLora automatically detects and handles Azure endpoints:
pub fn is_azure_endpoint(endpoint: &str) -> bool {
    endpoint.contains("azure.com")
}

pub fn azure_openai_client(
    api_key: String,
    endpoint: &str,
    deployment_id: &str,
) -> Client<AzureConfig> {
    let azure_config = AzureConfig::new()
        .with_api_base(endpoint)
        .with_api_version("2024-10-21".to_string())
        .with_api_key(api_key)
        .with_deployment_id(deployment_id.to_string());

    Client::with_config(azure_config)
}
The OpenAI provider supports all standard features: streaming, function calling, vision, and JSON mode.
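The detection check can be exercised standalone. This sketch reproduces the substring test from `is_azure_endpoint` and shows which endpoints it classifies as Azure:

```rust
// Same substring heuristic as the is_azure_endpoint function above:
// any endpoint containing "azure.com" routes to the Azure client.
fn is_azure_endpoint(endpoint: &str) -> bool {
    endpoint.contains("azure.com")
}

fn main() {
    assert!(is_azure_endpoint("https://my-resource.openai.azure.com"));
    assert!(!is_azure_endpoint("https://api.openai.com/v1"));
}
```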

Anthropic Provider

Claude Models

Implemented in llm/src/provider/anthropic.rs using the clust SDK:
pub fn anthropic_client(
    credentials: Option<&ApiKeyCredentials>,
) -> Result<clust::Client, ModelError> {
    let api_key = if let Some(credentials) = credentials {
        credentials.api_key.clone()
    } else {
        std::env::var("VLLORA_ANTHROPIC_API_KEY")
            .map_err(|_| AuthorizationError::InvalidApiKey)?
    };
    let client = clust::Client::from_api_key(clust::ApiKey::new(api_key));
    Ok(client)
}

Message Conversion

Anthropic’s Messages API requires conversion from OpenAI format:
// System messages are extracted and sent separately
let system_prompt = messages
    .iter()
    .find(|m| matches!(m.message_type, MessageType::System))
    .map(|m| SystemPrompt::new(m.content_str()));
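The same split can be shown self-contained. The `Message` and `MessageType` definitions below are minimal stand-ins for vLLora's types, just to demonstrate extracting the system prompt while keeping the remaining conversational turns:

```rust
// Minimal stand-ins for the message types referenced above
// (illustrative only; vLLora's real types carry more fields).
#[derive(PartialEq)]
enum MessageType {
    System,
    User,
    Assistant,
}

struct Message {
    message_type: MessageType,
    content: String,
}

// Pull out the first system message (sent separately to Anthropic)
// and keep the rest of the conversation in order.
fn split_system(messages: &[Message]) -> (Option<&str>, Vec<&Message>) {
    let system = messages
        .iter()
        .find(|m| m.message_type == MessageType::System)
        .map(|m| m.content.as_str());
    let rest = messages
        .iter()
        .filter(|m| m.message_type != MessageType::System)
        .collect();
    (system, rest)
}

fn main() {
    let msgs = vec![
        Message { message_type: MessageType::System, content: "You are terse.".into() },
        Message { message_type: MessageType::User, content: "Hi".into() },
    ];
    let (system, rest) = split_system(&msgs);
    assert_eq!(system, Some("You are terse."));
    assert_eq!(rest.len(), 1);
}
```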

Tracing Integration

Every Anthropic call is traced:
use vllora_telemetry::create_model_span;

create_model_span!(
    operation_name: "anthropic_chat_completion",
    model: self.params.model_name.clone(),
    provider: "anthropic",
    // Additional attributes...
);

Google Gemini Provider

Vertex AI Integration

Implemented in llm/src/provider/gemini/:
pub struct GeminiModel {
    pub client: GeminiClient,
    pub params: GeminiModelParams,
    pub execution_options: ExecutionOptions,
    pub tools: HashMap<String, Arc<Box<dyn Tool>>>,
    pub credentials_ident: CredentialsIdent,
}

Multi-Modal Support

Gemini provider supports text and image inputs:
// Image content handling
ImageContentBlock {
    image: ImageContentSource::Base64 {
        media_type: "image/jpeg".to_string(),
        data: base64_data,
    }
}
Gemini models support both direct API access and Vertex AI endpoints for enterprise customers.

AWS Bedrock Provider

Multi-Model Support

Bedrock provides access to multiple model families:
// From llm/src/provider/bedrock/mod.rs
pub struct BedrockModel {
    pub client: Client,
    pub execution_options: ExecutionOptions,
    params: BedrockModelParams,
    pub tools: HashMap<String, Arc<Box<dyn VlloraTool>>>,
    pub model_name: String,
    pub credentials_ident: CredentialsIdent,
}

AWS Credentials

Bedrock supports multiple authentication methods:
BedrockCredentials::IAM(IAMCredentials {
    access_key_id: "...".to_string(),
    secret_access_key: "...".to_string(),
    region: Some("us-east-1".to_string()),
    session_token: None,
})

Converse API

Bedrock uses the unified Converse API:
use aws_sdk_bedrockruntime::types::{
    ContentBlock, ConversationRole, Message,
    InferenceConfiguration, ToolConfiguration
};

let response = client
    .converse()
    .model_id(model_name)
    .messages(message)
    .set_system(system_prompts)
    .set_tool_config(tool_config)
    .send()
    .await?;

Provider Selection

vLLora determines which provider to use based on:
  1. Model Name Pattern: gpt-4 → OpenAI, claude-3 → Anthropic, etc.
  2. Explicit Provider: Specified in routing configuration
  3. Endpoint URL: Azure endpoints automatically route to Azure OpenAI
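Rule 1 can be sketched as a prefix match on the model name. The prefixes and return strings below are illustrative; the real router also honors explicit provider configuration and endpoint URLs, which override this heuristic:

```rust
// Illustrative prefix-based routing (rule 1 above). Unrecognized
// models fall through to the proxy path for custom providers.
fn provider_for_model(model: &str) -> &'static str {
    if model.starts_with("gpt-") {
        "openai"
    } else if model.starts_with("claude-") {
        "anthropic"
    } else if model.starts_with("gemini-") {
        "gemini"
    } else {
        "proxy"
    }
}

fn main() {
    assert_eq!(provider_for_model("gpt-4o"), "openai");
    assert_eq!(provider_for_model("claude-3-opus"), "anthropic");
    assert_eq!(provider_for_model("gemini-pro"), "gemini");
    assert_eq!(provider_for_model("my-custom-model"), "proxy");
}
```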

Provider Enum

From llm/src/types/provider.rs:
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "lowercase")]
pub enum InferenceModelProvider {
    OpenAI,
    Anthropic,
    Gemini,
    Bedrock,
    #[serde(alias = "vertex-ai")]
    VertexAI,
    Proxy(String),
}

Credential Management

Per-Project Credentials

Each project can have its own credentials for each provider:
// From core/src/metadata/services/provider_credential.rs
pub struct ProviderCredential {
    pub id: Uuid,
    pub project_id: Uuid,
    pub provider_id: Uuid,
    pub credentials: EncryptedCredentials,
    pub created_at: NaiveDateTime,
    pub updated_at: NaiveDateTime,
}

Credential Resolution

The ProviderKeyResolver retrieves the appropriate credentials:
pub trait ProviderKeyResolver {
    fn resolve_key(
        &self,
        project_id: Uuid,
        provider: InferenceModelProvider,
    ) -> Result<Option<Credentials>, CredentialError>;
}
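A resolver implementation could be as simple as a map keyed by project and provider. This is a hypothetical in-memory sketch: project IDs and providers are plain strings here instead of the `Uuid` and `InferenceModelProvider` types used in vLLora:

```rust
use std::collections::HashMap;

// Hypothetical in-memory resolver, simplified from the trait above.
// Keys are (project, provider) pairs; values are raw API keys.
struct InMemoryResolver {
    keys: HashMap<(String, String), String>,
}

impl InMemoryResolver {
    // Returns None when the project has no credentials for this provider,
    // mirroring the Option in the real resolve_key signature.
    fn resolve_key(&self, project_id: &str, provider: &str) -> Option<&String> {
        self.keys.get(&(project_id.to_string(), provider.to_string()))
    }
}

fn main() {
    let mut keys = HashMap::new();
    keys.insert(("proj-1".to_string(), "openai".to_string()), "sk-abc".to_string());
    let resolver = InMemoryResolver { keys };
    assert_eq!(resolver.resolve_key("proj-1", "openai"), Some(&"sk-abc".to_string()));
    assert!(resolver.resolve_key("proj-1", "anthropic").is_none());
}
```

A production resolver would instead query the encrypted per-project credential store and decrypt on demand.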

Model Pricing

vLLora includes pricing information for accurate cost tracking:
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct CompletionModelPrice {
    pub per_input_token: f64,
    pub per_output_token: f64,
    pub per_cached_input_token: Option<f64>,
    pub per_cached_input_write_token: Option<f64>,
    pub valid_from: Option<NaiveDate>,
}
Pricing data is embedded in gateway/models_data.json for fast startup and offline operation.
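Cost tracking with these fields is a weighted sum over token counts. This sketch uses a trimmed-down version of the struct and placeholder per-token rates, not real prices; cached input tokens fall back to the regular input rate when no cached price is set:

```rust
// Trimmed-down pricing struct (subset of the fields above) with a
// cost function over token usage. Rates below are placeholders.
struct CompletionModelPrice {
    per_input_token: f64,
    per_output_token: f64,
    per_cached_input_token: Option<f64>,
}

fn cost(price: &CompletionModelPrice, input: u64, cached_input: u64, output: u64) -> f64 {
    // Cached input tokens bill at the cached rate when one exists.
    let cached_rate = price.per_cached_input_token.unwrap_or(price.per_input_token);
    (input as f64) * price.per_input_token
        + (cached_input as f64) * cached_rate
        + (output as f64) * price.per_output_token
}

fn main() {
    let p = CompletionModelPrice {
        per_input_token: 0.000_002,
        per_output_token: 0.000_008,
        per_cached_input_token: Some(0.000_001),
    };
    // 1000 input + 500 cached + 200 output tokens.
    let c = cost(&p, 1_000, 500, 200);
    assert!((c - 0.0041).abs() < 1e-12);
}
```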

Adding Custom Providers

vLLora supports custom provider proxies:
InferenceModelProvider::Proxy("my-custom-provider".to_string())
Custom providers can implement OpenAI-compatible endpoints and will be routed through the proxy system.
