Overview

HAI Build Code Generator supports multiple LLM providers, giving you flexibility to use the best models for your workflow. Configure API keys, base URLs, and model-specific settings for each provider.

Supported Providers

HAI Build integrates with the following LLM providers:

  • Anthropic - Claude models with prompt caching
  • OpenAI - GPT models, including GPT-4 and reasoning models
  • OpenRouter - access to multiple providers through one API
  • Google Gemini - Gemini models via the native API
  • AWS Bedrock - Claude and other models via AWS
  • Vertex AI - Google Cloud AI models
  • Azure OpenAI - OpenAI models via Azure
  • Ollama - local model execution
  • Groq - ultra-fast inference
  • Mistral - Mistral AI models
  • DeepSeek - DeepSeek reasoning models
  • Cerebras - high-performance inference

Additional providers: Together AI, Fireworks, Hugging Face, LiteLLM, LM Studio, SambaNova, xAI, Qwen, Moonshot, Nebius, SAP AI Core, and more.

Provider Configuration

Anthropic (Claude)

Configure Claude models with support for prompt caching and extended thinking.
  • apiKey (string, required) - Your Anthropic API key from console.anthropic.com
  • baseUrl (string, default: "https://api.anthropic.com") - Custom base URL for the Anthropic API (optional)
  • thinkingBudgetTokens (number, default: 0) - Token budget for extended thinking mode (reasoning models)

Supported Models

  • claude-3-5-sonnet-20241022 - Latest Claude 3.5 Sonnet
  • claude-3-5-haiku-20241022 - Fast and efficient
  • claude-3-opus-20240229 - Most capable model
  • claude-sonnet-4-20250514 - Claude 4 Sonnet

Features

Anthropic models support prompt caching to reduce costs on repeated context:
// System prompts are automatically cached
system: [{
  text: systemPrompt,
  type: "text",
  cache_control: { type: "ephemeral" }
}]
Enable reasoning capabilities with thinking budget:
thinking: {
  type: "enabled",
  budget_tokens: 10000
}
Thinking is not compatible with temperature, top_p, or top_k modifications.
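Because thinking conflicts with those sampling parameters, request construction can drop them whenever a budget is set. A minimal sketch, assuming a request shape modeled on the Anthropic Messages API (the `applyThinking` helper is illustrative, not part of HAI Build):

```typescript
// Illustrative helper: strip sampling params when extended thinking is enabled.
// The field names mirror the Anthropic Messages API; the helper itself is a sketch.
interface SamplingParams {
  temperature?: number;
  top_p?: number;
  top_k?: number;
  thinking?: { type: "enabled"; budget_tokens: number };
}

function applyThinking(params: SamplingParams, budgetTokens: number): SamplingParams {
  if (budgetTokens <= 0) return params; // no thinking requested; leave params as-is
  // Thinking is incompatible with temperature/top_p/top_k, so drop them.
  const { temperature, top_p, top_k, ...rest } = params;
  return { ...rest, thinking: { type: "enabled", budget_tokens: budgetTokens } };
}
```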
Some models support extended context windows:
  • Add -1m suffix to model ID
  • Requires beta header: anthropic-beta: context-1m-2025-08-07
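Putting the two steps together, a small helper can derive the extended-context model ID and the required beta header. A sketch only; the function name is hypothetical:

```typescript
// Illustrative helper: enable the 1M-token context beta for a supported model.
// The "-1m" suffix and beta header value come from the documentation above.
function withExtendedContext(modelId: string): { modelId: string; headers: Record<string, string> } {
  return {
    modelId: `${modelId}-1m`,
    headers: { "anthropic-beta": "context-1m-2025-08-07" },
  };
}
```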

Example Configuration

{
  "provider": "anthropic",
  "apiKey": "sk-ant-...",
  "modelId": "claude-3-5-sonnet-20241022",
  "thinkingBudgetTokens": 5000
}

Advanced Provider Settings

Custom Headers

Add custom HTTP headers for API requests:
{
  openAiHeaders: {
    "X-Custom-Header": "value",
    "Authorization": "Bearer custom-token"
  }
}

Proxy Support

HAI Build respects system proxy settings. All providers use a configured fetch implementation with proxy support:
import { fetch } from "@/shared/net"

const client = new Anthropic({
  apiKey: apiKey,
  fetch, // Proxy-aware fetch
})
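One conventional way such a fetch implementation discovers the system proxy is through environment variables. A minimal sketch of that lookup, assuming the usual HTTPS_PROXY/HTTP_PROXY precedence (not HAI Build's actual implementation):

```typescript
// Illustrative sketch: resolve a proxy URL from conventional environment
// variables, preferring HTTPS_PROXY over HTTP_PROXY, with lowercase fallbacks.
function resolveProxyUrl(env: Record<string, string | undefined>): string | undefined {
  return env.HTTPS_PROXY ?? env.https_proxy ?? env.HTTP_PROXY ?? env.http_proxy;
}
```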

Retry Logic

All providers implement automatic retry with exponential backoff:
@withRetry()
async *createMessage(...) {
  // API call with automatic retry on transient failures
}
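The backoff behind such a decorator is typically exponential with a cap. A sketch under those assumptions (the names, base delay, and cap below are illustrative, not HAI Build's actual values):

```typescript
// Illustrative sketch of exponential backoff: the delay doubles each
// attempt and is capped at a maximum.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Illustrative retry wrapper: retries the call, sleeping backoffDelayMs
// between attempts, and rethrows the last error when attempts run out.
async function withRetrySketch<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Real code would retry only transient failures (e.g. 429, 5xx).
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
  throw lastError;
}
```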

Model Information

Each provider exposes model metadata:
interface ModelInfo {
  maxTokens: number              // Maximum output tokens
  contextWindow: number          // Input context size
  supportsPromptCache: boolean   // Prompt caching support
  supportsReasoning: boolean     // Reasoning capabilities
  inputPrice: number             // Price per 1M input tokens
  outputPrice: number            // Price per 1M output tokens
  cacheWritesPrice?: number      // Cache write cost
  cacheReadsPrice?: number       // Cache read cost
}
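Since `inputPrice` and `outputPrice` are quoted per 1M tokens, a request's cost follows directly. A sketch of that arithmetic (the helper is illustrative, not part of HAI Build):

```typescript
// Illustrative sketch: estimate request cost from ModelInfo-style pricing,
// where prices are quoted in USD per 1M tokens.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPrice: number, // USD per 1M input tokens
  outputPrice: number, // USD per 1M output tokens
): number {
  return (inputTokens / 1_000_000) * inputPrice + (outputTokens / 1_000_000) * outputPrice;
}
```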

Cost Optimization

Prompt Caching

Providers that support prompt caching (Anthropic, Gemini) can significantly reduce costs:
  • System prompts are automatically cached
  • Cache breakpoints minimize redundant processing
  • Costs are split between immediate and ongoing storage
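The split between write and read pricing is what makes caching pay off on repeated context: the first request pays the (higher) cache-write rate, and later requests pay the (much lower) cache-read rate instead of the full input price. A sketch of that comparison, with illustrative numbers (helper names and prices are not HAI Build's):

```typescript
// Illustrative sketch comparing cached vs. uncached input cost for a
// repeated system prompt. Prices follow the ModelInfo fields above
// (USD per 1M tokens).
function cachedInputCostUsd(
  tokens: number,
  reuses: number, // later requests that hit the cache
  cacheWritesPrice: number,
  cacheReadsPrice: number,
): number {
  const perM = tokens / 1_000_000;
  // First request writes the cache; subsequent requests read it.
  return perM * cacheWritesPrice + reuses * perM * cacheReadsPrice;
}

function uncachedInputCostUsd(tokens: number, requests: number, inputPrice: number): number {
  return (tokens / 1_000_000) * inputPrice * requests;
}
```

With a read price well below the input price, the cached path undercuts the uncached one as soon as the prompt is reused once or twice.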

Reasoning Budgets

Control reasoning token usage:
// Anthropic
thinkingBudgetTokens: 5000

// OpenAI
reasoningEffort: "medium"  // or "low", "high", "xhigh"

// Gemini
thinking: {
  level: ThinkingLevel.LOW,
  budget_tokens: 3000
}

Troubleshooting

Error: API key is required
Solution: Ensure your API key is correctly set:
  • Check for typos
  • Verify the key is active
  • Confirm it has the necessary permissions
Error: 429 Too Many Requests
Solution: Automatic retry handles most rate limits. For persistent issues:
  • Upgrade your provider tier
  • Implement request throttling
  • Consider using OpenRouter for automatic fallback
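Request throttling can be as simple as spacing calls by a minimum interval. A minimal sketch of the scheduling step (illustrative only; production code would use a token bucket or the provider's rate-limit headers):

```typescript
// Illustrative sketch: pure scheduling step for a fixed-interval throttle.
// Given the current time and the earliest allowed send time, it returns how
// long to wait now and when the request after this one may be sent.
function scheduleNext(
  nowMs: number,
  nextAllowedMs: number,
  intervalMs: number,
): { waitMs: number; nextAllowedMs: number } {
  const waitMs = Math.max(0, nextAllowedMs - nowMs);
  return { waitMs, nextAllowedMs: Math.max(nowMs, nextAllowedMs) + intervalMs };
}
```

A wrapper would call `scheduleNext` before each request, sleep for `waitMs`, and carry the returned `nextAllowedMs` into the next call.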
Error: Connection fails to Azure endpoint
Solution: Verify your configuration:
  • Ensure baseUrl includes the full Azure domain
  • Check that azureApiVersion is current
  • For Azure Identity, verify permissions in the Azure Portal
Error: Model ID not recognized
Solution:
  • Check the model ID spelling and format
  • Verify the model is available in your region
  • Ensure you have access to the model tier

Next Steps

  • Settings - Configure extension settings
  • Telemetry - Set up monitoring and analytics
