Overview

Airi integrates with numerous LLM providers through the @xsai package ecosystem, offering a unified interface for text generation, chat completions, embeddings, and more. The system supports both cloud-based and local model providers.
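The value of the unified interface is that every provider factory returns an object with the same shape, so downstream code never branches on which vendor is behind it. A deliberately simplified TypeScript sketch of that contract (the real @xsai types are richer; the names below are illustrative only):

```typescript
// Simplified sketch of the provider contract used throughout this page.
// Real @xsai provider handles carry more fields; this shows the common shape.
interface ChatModelHandle {
  model: string
  baseURL: string
  apiKey?: string
}

interface ChatProvider {
  chatModel: (id: string) => ChatModelHandle
}

// A toy factory mirroring the createOpenAI(apiKey, baseUrl) pattern below.
function createToyProvider(apiKey: string, baseURL: string): ChatProvider {
  return {
    chatModel: id => ({ model: id, baseURL, apiKey }),
  }
}

const provider = createToyProvider('sk-demo', 'https://api.example.com/v1')
console.log(provider.chatModel('gpt-4-turbo').model) // 'gpt-4-turbo'
```

Because only the handle changes, swapping OpenAI for Ollama (or any provider below) is a one-line change at the call site.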

Supported Providers

Airi supports 20+ LLM providers out of the box:

Major Cloud Providers

  • OpenAI - GPT-4, GPT-4 Turbo, GPT-3.5
  • Anthropic - Claude Sonnet, Claude Opus, Claude Haiku
  • Google - Gemini Pro, Gemini Ultra
  • Groq - Fast inference for open models
  • DeepSeek - DeepSeek-V3 and DeepSeek-R1 models
  • Mistral AI - Mistral Large, Mistral Medium

Aggregator Services

  • OpenRouter - Access to 100+ models
  • Together AI - Open model hosting
  • Fireworks AI - Fast inference platform
  • Novita AI - Multi-model platform

Specialized Providers

  • Perplexity AI - Search-augmented models
  • Cerebras - Ultra-fast inference
  • Minimax - Multimodal AI
  • xAI - Grok models

Chinese Providers

  • Aliyun - Alibaba Cloud models
  • 302.AI - Multi-model aggregator
  • Moonshot AI - Kimi long-context models
  • ModelScope - Alibaba model hub

Local/Self-Hosted

  • Ollama - Run models locally
  • OpenAI-Compatible - Any OpenAI API-compatible server

Provider Configuration

OpenAI

import { createOpenAI } from '@xsai-ext/providers/create'

// Configure provider
const config = {
  apiKey: 'sk-...', // Your OpenAI API key
  baseUrl: 'https://api.openai.com/v1' // Optional, defaults to OpenAI
}

// Create provider instance
const provider = createOpenAI(config.apiKey, config.baseUrl)

// Use with streaming
import { streamText } from '@xsai/stream-text'

const stream = await streamText({
  provider: provider.chatModel('gpt-4-turbo'),
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
})

for await (const chunk of stream.textStream) {
  console.log(chunk)
}

Anthropic (Claude)

import { createChatProvider } from '@xsai-ext/shared-providers'

const config = {
  apiKey: 'sk-ant-...', // Your Anthropic API key
  baseUrl: 'https://api.anthropic.com/v1/' // Optional
}

// Create provider with special headers
const provider = createChatProvider({
  apiKey: config.apiKey,
  baseURL: config.baseUrl,
  fetch: async (input, init = {}) => {
    init.headers = {
      ...init.headers,
      // Required when calling the Anthropic API directly from a browser
      'anthropic-dangerous-direct-browser-access': 'true'
    }
    return fetch(input, init)
  }
})

// Available models
const models = [
  'claude-sonnet-4-5-20250929',  // Smartest for complex tasks
  'claude-haiku-4-5-20251001',   // Fastest with near-frontier intelligence
  'claude-opus-4-1-20250805'     // Specialized reasoning
]

Google Gemini

import { createGoogleGenerativeAI } from '@xsai-ext/providers/create'

const config = {
  apiKey: 'AIza...', // Your Google AI API key
  baseUrl: 'https://generativelanguage.googleapis.com/v1beta'
}

const provider = createGoogleGenerativeAI(config.apiKey, config.baseUrl)

// Use Gemini models
const response = await streamText({
  provider: provider.chatModel('gemini-2.0-flash-exp'),
  messages: [{ role: 'user', content: 'Explain quantum computing' }]
})

Ollama (Local Models)

import { createOllama } from '@xsai-ext/providers/create'

const config = {
  baseUrl: 'http://localhost:11434/v1/',  // Ollama server URL
  headers: {} // Optional custom headers
}

const provider = createOllama('', config.baseUrl) // Ollama needs no API key, so pass an empty string

// List available models
const models = await provider.listModels()
console.log(models)

// Use a local model
const stream = await streamText({
  provider: provider.chatModel('llama3.2:3b'),
  messages: [{ role: 'user', content: 'Hello' }]
})

OpenAI-Compatible Servers

import { createOpenAI } from '@xsai-ext/providers/create'

// Works with:
// - LM Studio
// - LocalAI
// - vLLM
// - Text Generation WebUI
// - Any OpenAI-compatible API

const provider = createOpenAI(
  'not-needed',  // API key (some servers don't need it)
  'http://localhost:1234/v1'  // Your server URL
)

const response = await streamText({
  provider: provider.chatModel('model-name'),
  messages: [{ role: 'user', content: 'Test message' }]
})

Using Providers in Airi

Provider Store

import { useProvidersStore } from '@proj-airi/stage-ui/stores/providers'
import { storeToRefs } from 'pinia'

const providersStore = useProvidersStore()
const { allChatProvidersMetadata } = storeToRefs(providersStore)

// Get all available providers
const providers = allChatProvidersMetadata.value
// Returns array of provider metadata with id, name, icon, tasks, etc.

// Get provider instance
const provider = await providersStore.getProviderInstance('openai')

// Get provider configuration
const config = providersStore.getProviderConfig('openai')

// Save provider configuration
await providersStore.saveProviderConfig('openai', {
  apiKey: 'sk-...',
  baseUrl: 'https://api.openai.com/v1'
})

// Validate provider
const validation = await providersStore.validateProvider('openai')
if (validation.valid) {
  console.log('Provider is configured correctly')
}

Model Management

const providersStore = useProvidersStore()

// Fetch available models for a provider
await providersStore.fetchModelsForProvider('openai')

// Get cached models
const models = providersStore.getModelsForProvider('openai')
console.log(models)
// [{ id: 'gpt-4-turbo', name: 'GPT-4 Turbo', provider: 'openai', ... }]

// Check if provider supports model listing
const metadata = providersStore.getProviderMetadata('openai')
if (metadata?.capabilities.listModels) {
  // Provider supports dynamic model listing
}

// Check loading state
if (providersStore.isLoadingModels['openai']) {
  console.log('Loading models...')
}

// Check for errors
const error = providersStore.modelLoadError['openai']
if (error) {
  console.error('Failed to load models:', error)
}

Chat Integration

import { streamText } from '@xsai/stream-text'
import { useProvidersStore } from '@proj-airi/stage-ui/stores/providers'

const providersStore = useProvidersStore()

// Get configured provider and model
const providerId = 'openai'
const modelId = 'gpt-4-turbo'

const provider = await providersStore.getProviderInstance(providerId)
const chatModel = provider.chatModel(modelId)

// Stream chat response
const response = await streamText({
  provider: chatModel,
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Tell me a joke' }
  ],
  temperature: 0.7,
  maxTokens: 500
})

// Process stream
for await (const chunk of response.textStream) {
  console.log(chunk)
}

// Or get full text
const fullText = await response.text

Provider Features

Text Generation

import { generateText } from '@xsai/generate-text'

const result = await generateText({
  provider: provider.chatModel('gpt-4-turbo'),
  messages: [{ role: 'user', content: 'What is 2+2?' }],
  temperature: 0,
  maxTokens: 100
})

console.log(result.text)
console.log(result.usage) // Token usage

Streaming

import { streamText } from '@xsai/stream-text'

const stream = await streamText({
  provider: provider.chatModel('gpt-4-turbo'),
  messages: [{ role: 'user', content: 'Count to 10' }]
})

// Stream chunks
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk)
}

// Or get deltas
for await (const delta of stream.deltaStream) {
  console.log(delta) // Individual token or chunk
}

Embeddings

import { embed } from '@xsai/embed'

const embeddings = await embed({
  provider: provider.textEmbeddingModel('text-embedding-3-small'),
  values: [
    'Hello world',
    'Goodbye world'
  ]
})

console.log(embeddings.embeddings)
// [[0.123, -0.456, ...], [0.789, -0.012, ...]]
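Embedding vectors are typically compared with cosine similarity: vectors pointing the same direction score near 1, orthogonal vectors near 0. A small self-contained helper (not part of @xsai; shown for illustration):

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]     // dot product
    normA += a[i] * a[i]   // squared magnitude of a
    normB += b[i] * b[i]   // squared magnitude of b
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

console.log(cosineSimilarity([1, 0], [1, 0])) // 1
console.log(cosineSimilarity([1, 0], [0, 1])) // 0
```

Comparing the two embeddings above this way is the basis of semantic search and retrieval.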

Function Calling / Tools

import { streamText } from '@xsai/stream-text'
import { tool } from '@xsai/tool'

const weatherTool = tool({
  name: 'get_weather',
  description: 'Get the current weather for a location',
  parameters: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name'
      }
    },
    required: ['location']
  },
  execute: async ({ location }) => {
    // Fetch weather data
    return { temperature: 72, condition: 'sunny' }
  }
})

const response = await streamText({
  provider: provider.chatModel('gpt-4-turbo'),
  messages: [{ role: 'user', content: 'What is the weather in Tokyo?' }],
  tools: { get_weather: weatherTool }
})

Local Model Setup

Installing Ollama

# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows
# Download from https://ollama.com/download

Running Models

# Pull a model
ollama pull llama3.2:3b

# List available models
ollama list

# Run Ollama server (usually runs automatically)
ollama serve

Recommended Models

// Small & Fast (1-3B parameters)
const fastModels = [
  'llama3.2:3b',      // Best small model
  'phi3:3.8b',        // Microsoft's efficient model
  'gemma2:2b'         // Google's compact model
]

// Medium (7-13B parameters)
const mediumModels = [
  'llama3.1:8b',      // Balanced performance
  'mistral:7b',       // Strong reasoning
  'qwen2.5:7b'        // Multilingual
]

// Large (30B+ parameters)
const largeModels = [
  'llama3.1:70b',     // Near GPT-4 performance
  'qwen2.5:72b',      // Excellent coding
  'deepseek-coder-v2:16b'  // Best for code
]
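Which tier fits depends mostly on available RAM/VRAM. A rough rule-of-thumb picker (the thresholds are approximate assumptions, roughly 1 GB per billion parameters at 4-bit quantization plus headroom, not official Ollama requirements):

```typescript
// Pick a model tier from available memory in GB.
// Thresholds are rough assumptions; tune them for your hardware.
function pickModelTier(availableGb: number): 'small' | 'medium' | 'large' {
  if (availableGb >= 48) return 'large'  // 30B+ parameter models
  if (availableGb >= 8) return 'medium'  // 7-13B parameter models
  return 'small'                         // 1-3B parameter models
}

console.log(pickModelTier(16)) // 'medium'
```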

Provider Validation

Airi includes built-in provider validation:

import { providerOpenAI } from '@proj-airi/stage-ui/libs/providers/providers/openai'

const provider = providerOpenAI

// Check if validation is required
if (provider.validationRequiredWhen(config)) {
  // Run validators
  for (const validator of provider.validators.validateConfig) {
    const result = await validator.validator(config)
    
    if (!result.valid) {
      console.error('Validation failed:', result.reason)
    }
  }
  
  // Validate provider connectivity
  for (const validator of provider.validators.validateProvider) {
    const result = await validator.validator(config)
    
    if (!result.valid) {
      console.error('Connection failed:', result.reason)
    }
  }
}

Best Practices

  1. API Key Security: Never hardcode API keys, use environment variables
  2. Rate Limiting: Implement retry logic with exponential backoff
  3. Error Handling: Handle network errors and API rate limits gracefully
  4. Streaming: Use streaming for better UX with long responses
  5. Model Selection: Choose appropriate model size for your use case
  6. Cost Optimization: Monitor token usage and cache when possible
  7. Local Fallback: Configure Ollama as fallback for privacy-sensitive scenarios
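The local-fallback practice can be sketched as a simple fallback chain: try each configured provider in order and fall through on failure. The attempt functions below are placeholders for real calls such as generateText with a cloud model first and an Ollama model last; the chaining logic is the point:

```typescript
// Try each provider attempt in order; return the first successful result.
// Each entry is a placeholder for a real call, e.g.
// () => generateText({ provider: cloudModel, messages }).
async function generateWithFallback<T>(
  attempts: Array<() => Promise<T>>,
): Promise<T> {
  let lastError: unknown
  for (const attempt of attempts) {
    try {
      return await attempt()
    } catch (error) {
      lastError = error // remember the failure and try the next provider
    }
  }
  throw lastError // every provider failed
}

// Example with fakes: cloud provider down, local Ollama succeeds.
generateWithFallback([
  async () => { throw new Error('cloud unavailable') },
  async () => 'local response',
]).then(result => console.log(result)) // 'local response'
```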

Troubleshooting

Connection Failed

// Check provider configuration
const config = providersStore.getProviderConfig('openai')
console.log('Config:', config)

// Verify API key
if (!config?.apiKey || config.apiKey.trim() === '') {
  console.error('API key is missing')
}

// Test connectivity
try {
  const response = await fetch(`${config.baseUrl}/models`, {
    headers: {
      'Authorization': `Bearer ${config.apiKey}`
    }
  })
  console.log('Status:', response.status)
} catch (error) {
  console.error('Network error:', error)
}

Ollama Not Responding

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Restart Ollama
ollama serve

# Check logs
journalctl -u ollama -f  # Linux
# or check system logs on macOS/Windows

Model Not Found

// List available models first
const models = await provider.listModels()
console.log('Available models:', models.map(m => m.id))

// Verify model ID matches exactly
const modelId = 'gpt-4-turbo'  // Must match provider's model ID format
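When an exact ID doesn't match, suggesting the closest known one makes the error actionable. A hypothetical helper (not part of the provider store) using Levenshtein edit distance:

```typescript
// Levenshtein edit distance between two strings.
function editDistance(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0)),
  )
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      )
    }
  }
  return dp[a.length][b.length]
}

// Suggest the known model ID closest to what the user typed.
function suggestModel(typed: string, known: string[]): string | undefined {
  return [...known].sort((x, y) => editDistance(typed, x) - editDistance(typed, y))[0]
}

console.log(suggestModel('gpt4-turbo', ['gpt-4-turbo', 'gpt-3.5-turbo']))
// 'gpt-4-turbo'
```

Feeding the IDs from listModels() into a helper like this turns "model not found" into "did you mean gpt-4-turbo?".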

Rate Limit Errors

import { generateText } from '@xsai/generate-text'

// Implement retry with backoff
async function generateWithRetry(options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await generateText(options)
    } catch (error) {
      if ((error.status === 429 || /rate limit/i.test(error.message ?? '')) && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000  // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay))
        continue
      }
      throw error
    }
  }
}
