Model Overview
48 Models
Comprehensive coverage of major AI providers
19 Providers
From OpenAI and Anthropic to specialized providers
Real Pricing
Actual costs per 1M tokens, updated regularly
OpenAI Models
OpenAI models use two primary encodings:
o200k_base (newer, more efficient) for the GPT-4o family, and cl100k_base for the GPT-4 and GPT-3.5 families.
GPT-4o Family
The family includes GPT-4o and GPT-4o Mini. Shared characteristics:
- Latest encoding technology (o200k_base)
- 128K token context window
- Balanced cost and performance
- Multimodal capabilities
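The encoding split above can be captured in a small lookup. This is an illustrative sketch (the model IDs and the `encodingFor` helper are assumptions for this example, not part of the tool's API):

```javascript
// Hypothetical model → encoding lookup based on the families listed above.
const ENCODINGS = {
  "gpt-4o": "o200k_base",
  "gpt-4o-mini": "o200k_base",
  "gpt-4-turbo": "cl100k_base",
  "gpt-4": "cl100k_base",
  "gpt-3.5-turbo": "cl100k_base",
};

function encodingFor(model) {
  const enc = ENCODINGS[model];
  if (!enc) throw new Error(`Unknown model: ${model}`);
  return enc;
}

console.log(encodingFor("gpt-4o")); // "o200k_base"
```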
GPT-4 & GPT-3.5 Families
| Model | Context Limit | Input Cost (per 1M) | Output Cost (per 1M) | Encoding |
|---|---|---|---|---|
| GPT-4 Turbo | 128,000 | $10.00 | $30.00 | cl100k_base |
| GPT-4 | 8,192 | $30.00 | $60.00 | cl100k_base |
| GPT-3.5 Turbo | 16,385 | $0.50 | $1.50 | cl100k_base |
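Since all prices on this page are quoted per 1M tokens, estimating a request's cost is a single formula. A minimal sketch (the `requestCost` helper is illustrative, not part of this tool):

```javascript
// Cost estimate: prices are per 1M tokens, so divide token counts by 1e6.
function requestCost(inputTokens, outputTokens, inputPrice, outputPrice) {
  return (inputTokens / 1e6) * inputPrice + (outputTokens / 1e6) * outputPrice;
}

// GPT-4 Turbo from the table above: $10.00 input / $30.00 output per 1M.
const cost = requestCost(50_000, 10_000, 10.0, 30.0);
console.log(cost.toFixed(2)); // "0.80"
```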
Anthropic Models
Claude 3.5 & Claude 3 Family
Claude 3.5 Sonnet
Specifications:
- Context: 200,000 tokens
- Input: $3.00 per 1M
- Output: $15.00 per 1M
- Token Ratio: 1.1x
Claude 3 Opus
Specifications:
- Context: 200,000 tokens
- Input: $15.00 per 1M
- Output: $75.00 per 1M
- Token Ratio: 1.1x
Claude 3 Sonnet
Specifications:
- Context: 200,000 tokens
- Input: $3.00 per 1M
- Output: $15.00 per 1M
- Token Ratio: 1.1x
Claude 3 Haiku
Specifications:
- Context: 200,000 tokens
- Input: $0.25 per 1M
- Output: $1.25 per 1M
- Token Ratio: 1.1x
Google Models
Gemini 1.5 Series
The series includes Gemini 1.5 Pro and Gemini 1.5 Flash. Ideal for:
- Processing entire codebases
- Long document analysis
- Multi-document reasoning
Meta Models
Llama 3.1 Series (Latest)
Llama 3.1 405B - Flagship Model
- Largest open-source model
- Competitive with GPT-4
- Token-efficient (5% fewer tokens)
Llama 3.1 70B - Sweet Spot
- Best balance of cost and capability
- 87% cheaper than GPT-4o
- 131K context window
Llama 3.1 8B - Ultra Efficient
- Lowest cost option
- Surprisingly capable
- Same 131K context as larger variants
Llama 3 Series (Previous Generation)
| Model | Context | Input Cost | Output Cost | Best For |
|---|---|---|---|---|
| Llama 3 70B | 8,192 | $0.70 | $0.80 | Legacy applications |
| Llama 3 8B | 8,192 | $0.05 | $0.05 | Budget workloads |
Llama 3.1 models offer significantly larger context (131K vs 8K) at similar or better pricing. Upgrade if possible.
Mistral AI Models
Mistral Large
128K Context | $6.00 per 1M
Premier model from Mistral AI
- European AI provider
- Strong multilingual support
- Token ratio: 1.02x
Mistral Nemo
128K Context | $0.15 per 1M
Fast and affordable
- Same pricing for input/output
- Large context window
- Token ratio: 1.02x
Mixtral 8x7B
32K Context | $0.24 per 1M
Mixture of Experts architecture
- Efficient sparse activation
- Good for diverse tasks
- Token ratio: 1.02x
Mixtral 8x22B
65K Context | $0.65 per 1M
Larger MoE model
- More parameters
- Better performance
- Token ratio: 1.02x
Cohere Models
Command R+ & Command R
- Context: 128,000 tokens
- Input: $2.50 per 1M
- Output: $10.00 per 1M
- Optimized for RAG (Retrieval Augmented Generation)
Specialized Providers
Alibaba (Qwen Models)
Qwen 2.5 & Qwen 2 Series
| Model | Context | Input | Output | Notes |
|---|---|---|---|---|
| Qwen2.5 72B | 131,072 | $0.35 | $0.40 | Latest version |
| Qwen2 72B | 131,072 | $0.35 | $0.40 | Stable release |
- Strong multilingual (especially Chinese)
- Token efficient
- Competitive pricing
DeepSeek
DeepSeek V2.5 & V2
- Context: 131,072 tokens
- Input: $0.14 per 1M tokens
- Output: $0.28 per 1M tokens
- Excellent value proposition
- Chinese AI research lab
01.AI (Yi Models)
Yi Large & Yi 1.5 34B
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Yi Large | 32,768 | $0.60 | $0.60 | 0.97 |
| Yi 1.5 34B | 32,768 | $0.30 | $0.30 | 0.97 |
- Founded by Kai-Fu Lee
- Competitive performance
- Mid-tier pricing
Microsoft (Phi Models)
Phi-3.5 & Phi-3 Series
Small but capable models optimized for efficiency:
| Model | Context | Input | Output |
|---|---|---|---|
| Phi-3.5 Mini | 131,072 | $0.15 | $0.60 |
| Phi-3 Medium | 131,072 | $1.00 | $1.00 |
| Phi-3 Mini | 131,072 | $0.15 | $0.60 |
- Small model size
- Large context window
- Good for edge deployment
AI21 Labs (Jamba Models)
Jamba 1.5 Large & Mini
Hybrid SSM-Transformer architecture:
Standout Feature: 256K token context window at competitive pricing!
| Model | Context | Input | Output |
|---|---|---|---|
| Jamba 1.5 Large | 262,144 | $0.50 | $0.70 |
| Jamba 1.5 Mini | 262,144 | $0.10 | $0.10 |
xAI (Grok Models)
Grok-2 & Grok-2 Mini
From Elon Musk’s xAI:
| Model | Context | Input | Output |
|---|---|---|---|
| Grok-2 | 131,072 | $2.00 | $10.00 |
| Grok-2 Mini | 131,072 | $0.15 | $0.60 |
Other Providers
Reka (Reka Core & Flash)
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Reka Core | 131,072 | $10.00 | $25.00 | 0.99 |
| Reka Flash | 131,072 | $0.15 | $0.60 | 0.99 |
Amazon (Titan Text)
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Titan Text Premier | 32,000 | $0.50 | $1.50 | 1.04 |
| Titan Text Express | 8,000 | $0.13 | $0.17 | 1.04 |
Perplexity (Llama Sonar)
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Llama 3.1 Sonar Large | 131,072 | $1.00 | $1.00 | 0.95 |
| Llama 3.1 Sonar Small | 131,072 | $0.20 | $0.20 | 0.95 |
IBM (Granite)
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Granite 3 8B | 131,072 | $0.055 | $0.055 | 0.96 |
| Granite 3 2B | 131,072 | $0.025 | $0.025 | 0.96 |
Nous Research (Hermes)
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Hermes 3 405B | 131,072 | $2.70 | $2.70 | 0.95 |
| Hermes 3 70B | 131,072 | $0.35 | $0.40 | 0.95 |
Snowflake (Arctic)
- Context: 4,096 tokens
- Input: $0.24 per 1M
- Output: $0.24 per 1M
- Token Ratio: 1.06
NVIDIA (Nemotron)
| Model | Context | Input | Output | Token Ratio |
|---|---|---|---|---|
| Nemotron 70B | 131,072 | $0.35 | $0.40 | 0.98 |
| Nemotron Mini | 131,072 | $0.15 | $0.60 | 0.98 |
Model Selection Guide
High Accuracy Tasks:
- GPT-4o, Claude 3.5 Sonnet, Claude 3 Opus
- Gemini 1.5 Pro, Llama 3.1 405B
Cost-Effective Tasks:
- GPT-4o Mini, Gemini 1.5 Flash
- Llama 3.1 8B, DeepSeek V2.5
- Granite 3 2B (lowest cost)
Long-Context Tasks:
- Gemini 1.5 Pro (2M tokens)
- Jamba 1.5 (256K tokens)
- Claude 3 family (200K tokens)
Speed-Critical Tasks:
- GPT-3.5 Turbo, Claude 3 Haiku
- Mistral Nemo, Gemini 1.5 Flash
Pricing Comparison
Budget Options (< $0.20 per 1M input tokens)
- Granite 3 2B ($0.025), Llama 3 8B ($0.05), Granite 3 8B ($0.055), Jamba 1.5 Mini ($0.10)
- Titan Text Express ($0.13), DeepSeek V2.5 ($0.14)
- Phi-3.5 Mini, Mistral Nemo, Grok-2 Mini, Reka Flash ($0.15 each)
Premium Options (> $5.00 per 1M input tokens)
- GPT-4 ($30.00), Claude 3 Opus ($15.00)
- GPT-4 Turbo ($10.00), Reka Core ($10.00)
Token Ratio Reference
Token ratio indicates how many tokens a model uses compared to GPT (baseline 1.0). Lower is more efficient.
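Applying the ratio to a GPT-baseline count is a one-line estimate. A minimal sketch (the `estimateTokens` helper is illustrative, not part of this tool):

```javascript
// Estimate a model's token count from a GPT-baseline count and its ratio.
// E.g. Claude models above list a 1.1x ratio: text that tokenizes to
// 1,000 tokens under GPT's tokenizer is estimated at ~1,100 tokens.
function estimateTokens(gptTokens, tokenRatio) {
  return Math.round(gptTokens * tokenRatio);
}

console.log(estimateTokens(1000, 1.1));  // 1100 (Claude family, 1.1x)
console.log(estimateTokens(1000, 0.95)); // 950  (Hermes 3, 0.95x)
```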
Model Data Structure
All model data comes from models-config.js:
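For reference, an entry might look like the sketch below, built from the fields shown throughout this page. The field names here are assumptions for illustration, not the actual keys used in models-config.js:

```javascript
// Illustrative entry shape only — key names are hypothetical.
// Values come from the Claude 3.5 Sonnet specs listed above.
const exampleEntry = {
  id: "claude-3-5-sonnet",
  provider: "Anthropic",
  contextLimit: 200000,     // tokens
  inputCostPer1M: 3.0,      // USD per 1M input tokens
  outputCostPer1M: 15.0,    // USD per 1M output tokens
  tokenRatio: 1.1,          // tokens used relative to GPT baseline
};

console.log(`${exampleEntry.id}: ${exampleEntry.contextLimit} tokens`);
```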
External Resources
Artificial Analysis
Independent benchmarks and detailed model comparisons
OpenAI Tokenizer
Official OpenAI tokenization playground
Tiktoken Library
Open source tokenization library used by this tool
Model Pricing Updates
Track pricing changes across providers
Next Steps
How to Use
Learn how to analyze tokens with Tokenizador
Understanding Tokenization
Deep dive into tokenization concepts