Tokenizador supports 48 AI language models from 19 different providers. Each model has unique characteristics, pricing, and context limits.

Model Overview

48 Models

Comprehensive coverage of major AI providers

19 Providers

From OpenAI and Anthropic to specialized providers

Real Pricing

Actual costs per 1M tokens, updated regularly

OpenAI Models

OpenAI models use two primary encodings: o200k_base (newer, more efficient) for GPT-4o family, and cl100k_base for GPT-4 and GPT-3.5 families.

GPT-4o Family

// From models-config.js:83-92
{
  name: 'GPT-4o',
  company: 'OpenAI',
  encoding: 'o200k_base',
  contextLimit: 128000,
  inputCost: 2.50,    // $ per 1M tokens
  outputCost: 10.00,
  url: 'https://artificialanalysis.ai/models/gpt-4o',
  tokenRatio: 1.0
}
Key Features:
  • Latest encoding technology (o200k_base)
  • 128K token context window
  • Balanced cost and performance
  • Multimodal capabilities

GPT-4 Family

| Model | Context Limit | Input Cost | Output Cost | Encoding |
| --- | --- | --- | --- | --- |
| GPT-4 Turbo | 128,000 | $10.00 | $30.00 | cl100k_base |
| GPT-4 | 8,192 | $30.00 | $60.00 | cl100k_base |
| GPT-3.5 Turbo | 16,385 | $0.50 | $1.50 | cl100k_base |
GPT-3.5 Turbo offers the best value for simple tasks, with a 60x lower input cost than GPT-4 ($0.50 vs $30.00 per 1M tokens).
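As a quick sanity check on these numbers, per-request cost is simply tokens × price ÷ 1,000,000. A minimal sketch (the `estimateCost` helper and model objects are illustrative, not part of Tokenizador's API; prices come from the table above):

```javascript
// Estimate request cost in dollars from per-1M-token prices.
function estimateCost(model, inputTokens, outputTokens) {
  return (inputTokens * model.inputCost + outputTokens * model.outputCost) / 1_000_000;
}

// Prices from the GPT table above.
const gpt35 = { inputCost: 0.50, outputCost: 1.50 };
const gpt4  = { inputCost: 30.00, outputCost: 60.00 };

// A 2,000-token prompt with a 500-token reply:
const cheap  = estimateCost(gpt35, 2000, 500); // $0.00175
const pricey = estimateCost(gpt4, 2000, 500);  // $0.09
```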

Anthropic Models

Claude 3.5 & Claude 3 Family

// Claude models from models-config.js
// Note: tokenRatio of 1.1 indicates ~10% more tokens than OpenAI models

Claude 3.5 Sonnet

Specifications:
  • Context: 200,000 tokens
  • Input: $3.00 per 1M
  • Output: $15.00 per 1M
  • Token Ratio: 1.1x
Latest and most capable Claude model with massive context window.

Claude 3 Opus

Specifications:
  • Context: 200,000 tokens
  • Input: $15.00 per 1M
  • Output: $75.00 per 1M
  • Token Ratio: 1.1x
Most powerful Claude 3 model for complex tasks.

Claude 3 Sonnet

Specifications:
  • Context: 200,000 tokens
  • Input: $3.00 per 1M
  • Output: $15.00 per 1M
  • Token Ratio: 1.1x
Balanced performance and cost for most use cases.

Claude 3 Haiku

Specifications:
  • Context: 200,000 tokens
  • Input: $0.25 per 1M
  • Output: $1.25 per 1M
  • Token Ratio: 1.1x
Fastest and most affordable Claude model.
Claude models typically generate ~10% more tokens than GPT models for the same text due to different tokenization algorithms. Factor this into cost calculations.
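That adjustment can be sketched as follows (the `adjustedTokens` helper is illustrative, not Tokenizador's API; the $3.00 input price and 1.1 ratio are Claude 3.5 Sonnet's values from above):

```javascript
// Scale a GPT-baseline token count by a model's tokenRatio before costing.
function adjustedTokens(gptTokens, tokenRatio) {
  return Math.round(gptTokens * tokenRatio);
}

// 10,000 GPT tokens costed as Claude 3.5 Sonnet input ($3.00 per 1M, ratio 1.1):
const claudeTokens = adjustedTokens(10_000, 1.1);          // 11,000 tokens
const claudeInputCost = (claudeTokens * 3.00) / 1_000_000; // $0.033
```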

Google Models

Gemini 1.5 Series

// models-config.js:173-182
{
  name: 'Gemini 1.5 Pro',
  company: 'Google',
  encoding: 'cl100k_base',  // Approximation
  contextLimit: 2097152,    // 2M tokens!
  inputCost: 1.25,
  outputCost: 5.00,
  tokenRatio: 1.05
}
Standout Feature: Largest context window available - over 2 million tokens!
Perfect for:
  • Processing entire codebases
  • Long document analysis
  • Multi-document reasoning

Meta Models

Llama 3.1 Series (Latest)

{
  name: 'Llama 3.1 405B',
  contextLimit: 131072,  // 131K tokens
  inputCost: 2.70,
  outputCost: 2.70,      // Same price for input/output
  tokenRatio: 0.95       // 5% fewer tokens than GPT
}
  • Largest open-source model
  • Competitive with GPT-4
  • Token-efficient (5% fewer tokens)
{
  name: 'Llama 3.1 70B',
  contextLimit: 131072,
  inputCost: 0.35,
  outputCost: 0.40,
  tokenRatio: 0.95
}
  • Best balance of cost and capability
  • 86% cheaper input cost than GPT-4o ($0.35 vs $2.50)
  • 131K context window
{
  name: 'Llama 3.1 8B',
  contextLimit: 131072,
  inputCost: 0.055,
  outputCost: 0.055,
  tokenRatio: 0.95
}
  • Lowest cost option
  • Surprisingly capable
  • Same 131K context as larger variants

Llama 3 Series (Previous Generation)

| Model | Context | Input Cost | Output Cost | Best For |
| --- | --- | --- | --- | --- |
| Llama 3 70B | 8,192 | $0.70 | $0.80 | Legacy applications |
| Llama 3 8B | 8,192 | $0.05 | $0.05 | Budget workloads |
Llama 3.1 models offer significantly larger context (131K vs 8K) at similar or better pricing. Upgrade if possible.

Mistral AI Models

Mistral Large

128K Context | $2.00 input / $6.00 output per 1M tokens
Premier model from Mistral AI
  • European AI provider
  • Strong multilingual support
  • Token ratio: 1.02x

Mistral Nemo

128K Context | $0.15 input / $0.15 output per 1M tokens
Fast and affordable
  • Same pricing for input/output
  • Large context window
  • Token ratio: 1.02x

Mixtral 8x7B

32K Context | $0.24 input / $0.24 output per 1M tokens
Mixture of Experts architecture
  • Efficient sparse activation
  • Good for diverse tasks
  • Token ratio: 1.02x

Mixtral 8x22B

65K Context | $0.65 input / $0.65 output per 1M tokens
Larger MoE model
  • More parameters
  • Better performance
  • Token ratio: 1.02x

Cohere Models

// Cohere Command models - tokenRatio: 0.98 (slightly more efficient)
  • Context: 128,000 tokens
  • Input: $2.50 per 1M
  • Output: $10.00 per 1M
  • Optimized for RAG (Retrieval Augmented Generation)

Specialized Providers

Alibaba (Qwen Models)

// Token ratio: 0.92 (8% more efficient than GPT)
| Model | Context | Input | Output | Notes |
| --- | --- | --- | --- | --- |
| Qwen2.5 72B | 131,072 | $0.35 | $0.40 | Latest version |
| Qwen2 72B | 131,072 | $0.35 | $0.40 | Stable release |
  • Strong multilingual (especially Chinese)
  • Token efficient
  • Competitive pricing

DeepSeek

// Token ratio: 0.93 (7% more efficient)
  • Context: 131,072 tokens
  • Input: $0.14 per 1M tokens
  • Output: $0.28 per 1M tokens
  • Excellent value proposition
  • Chinese AI research lab

01.AI (Yi Models)

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Yi Large | 32,768 | $0.60 | $0.60 | 0.97 |
| Yi 1.5 34B | 32,768 | $0.30 | $0.30 | 0.97 |
  • Founded by Kai-Fu Lee
  • Competitive performance
  • Mid-tier pricing

Microsoft (Phi Models)

Small but capable models optimized for efficiency:
// Token ratio: 1.03 (3% more tokens than GPT)
| Model | Context | Input | Output |
| --- | --- | --- | --- |
| Phi-3.5 Mini | 131,072 | $0.15 | $0.60 |
| Phi-3 Medium | 131,072 | $1.00 | $1.00 |
| Phi-3 Mini | 131,072 | $0.15 | $0.60 |
  • Small model size
  • Large context window
  • Good for edge deployment

AI21 Labs (Jamba Models)

Hybrid SSM-Transformer architecture:
// Token ratio: 0.94 (6% more efficient)
| Model | Context | Input | Output |
| --- | --- | --- | --- |
| Jamba 1.5 Large | 262,144 | $0.50 | $0.70 |
| Jamba 1.5 Mini | 262,144 | $0.10 | $0.10 |
Standout Feature: 256K token context window at competitive pricing!

xAI (Grok Models)

From Elon Musk’s xAI:
// Token ratio: 1.01 (nearly identical to GPT)
| Model | Context | Input | Output |
| --- | --- | --- | --- |
| Grok-2 | 131,072 | $2.00 | $10.00 |
| Grok-2 Mini | 131,072 | $0.15 | $0.60 |

Other Providers

Reka AI

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Reka Core | 131,072 | $10.00 | $25.00 | 0.99 |
| Reka Flash | 131,072 | $0.15 | $0.60 | 0.99 |

Amazon

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Titan Text Premier | 32,000 | $0.50 | $1.50 | 1.04 |
| Titan Text Express | 8,000 | $0.13 | $0.17 | 1.04 |

Perplexity

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Llama 3.1 Sonar Large | 131,072 | $1.00 | $1.00 | 0.95 |
| Llama 3.1 Sonar Small | 131,072 | $0.20 | $0.20 | 0.95 |

IBM

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Granite 3 8B | 131,072 | $0.055 | $0.055 | 0.96 |
| Granite 3 2B | 131,072 | $0.025 | $0.025 | 0.96 |

Nous Research

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Hermes 3 405B | 131,072 | $2.70 | $2.70 | 0.95 |
| Hermes 3 70B | 131,072 | $0.35 | $0.40 | 0.95 |

  • Context: 4,096 tokens
  • Input: $0.24 per 1M
  • Output: $0.24 per 1M
  • Token Ratio: 1.06

NVIDIA

| Model | Context | Input | Output | Token Ratio |
| --- | --- | --- | --- | --- |
| Nemotron 70B | 131,072 | $0.35 | $0.40 | 0.98 |
| Nemotron Mini | 131,072 | $0.15 | $0.60 | 0.98 |

Model Selection Guide

High Accuracy Tasks:
  • GPT-4o, Claude 3.5 Sonnet, Claude 3 Opus
  • Gemini 1.5 Pro, Llama 3.1 405B
Cost-Sensitive:
  • GPT-4o Mini, Gemini 1.5 Flash
  • Llama 3.1 8B, DeepSeek V2.5
  • Granite 3 2B (lowest cost)
Large Context:
  • Gemini 1.5 Pro (2M tokens)
  • Jamba 1.5 (256K tokens)
  • Claude 3 family (200K tokens)
Fast Response:
  • GPT-3.5 Turbo, Claude 3 Haiku
  • Mistral Nemo, Gemini 1.5 Flash
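One way to automate this kind of choice is to filter by required context and then sort by price. A minimal sketch (the `cheapestWithContext` helper and candidate list are illustrative; figures come from this page):

```javascript
// Pick the cheapest model whose context window fits the input size.
const candidates = [
  { name: 'GPT-4o',            contextLimit: 128000,  inputCost: 2.50 },
  { name: 'Claude 3.5 Sonnet', contextLimit: 200000,  inputCost: 3.00 },
  { name: 'Gemini 1.5 Pro',    contextLimit: 2097152, inputCost: 1.25 },
  { name: 'Jamba 1.5 Mini',    contextLimit: 262144,  inputCost: 0.10 },
];

function cheapestWithContext(models, requiredTokens) {
  return models
    .filter(m => m.contextLimit >= requiredTokens)
    .sort((a, b) => a.inputCost - b.inputCost)[0] ?? null;
}

const pick = cheapestWithContext(candidates, 250_000); // Jamba 1.5 Mini
```

For a 250K-token document, Jamba 1.5 Mini wins on price; past 262K tokens only Gemini 1.5 Pro still fits.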

Pricing Comparison

Budget Options (< $0.20 per 1M input tokens)

// Sorted by input cost
const budgetModels = [
  { model: 'Granite 3 2B',      input: 0.025 },
  { model: 'Llama 3 8B',        input: 0.05  },
  { model: 'Llama 3.1 8B',      input: 0.055 },
  { model: 'Gemini 1.5 Flash',  input: 0.075 },
  { model: 'Jamba 1.5 Mini',    input: 0.10  },
  { model: 'DeepSeek V2.5',     input: 0.14  },
  { model: 'GPT-4o Mini',       input: 0.15  },
  { model: 'Command R',         input: 0.15  }
];

Premium Options (> $5.00 per 1M input tokens)

const premiumModels = [
  { model: 'Reka Core',        input: 10.00 },
  { model: 'GPT-4 Turbo',      input: 10.00 },
  { model: 'Claude 3 Opus',    input: 15.00 },
  { model: 'GPT-4',            input: 30.00 }
];

Token Ratio Reference

Token ratio indicates how many tokens a model uses compared to GPT (baseline 1.0). Lower is more efficient.
// From models-config.js - tokenRatio values
const tokenRatios = {
  'Most Efficient': {
    'Qwen': 0.92,      // 8% fewer tokens
    'DeepSeek': 0.93,  // 7% fewer tokens
    'Jamba': 0.94,     // 6% fewer tokens
    'Llama': 0.95      // 5% fewer tokens
  },
  'Standard': {
    'GPT': 1.0,        // Baseline
    'Cohere': 0.98,    // 2% fewer tokens
    'NVIDIA': 0.98     // 2% fewer tokens
  },
  'Less Efficient': {
    'Mistral': 1.02,   // 2% more tokens
    'Microsoft': 1.03, // 3% more tokens
    'Amazon': 1.04,    // 4% more tokens
    'Google': 1.05,    // 5% more tokens
    'Claude': 1.1      // 10% more tokens
  }
};
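These ratios make it easy to project one GPT-baseline count across providers. A small sketch (the `projectTokens` helper is illustrative; ratios are taken from the table above):

```javascript
// Project a GPT-baseline token count across model families by tokenRatio.
function projectTokens(gptTokens, ratios) {
  const out = {};
  for (const [family, ratio] of Object.entries(ratios)) {
    out[family] = Math.round(gptTokens * ratio);
  }
  return out;
}

const projected = projectTokens(10_000, { Qwen: 0.92, GPT: 1.0, Claude: 1.1 });
// → { Qwen: 9200, GPT: 10000, Claude: 11000 }
```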

Model Data Structure

All model data comes from models-config.js:
// Complete model configuration structure
const MODELS_DATA = {
  'model-id': {
    name: 'Display Name',
    company: 'Provider Name',
    encoding: 'o200k_base' | 'cl100k_base',
    contextLimit: 128000,          // Maximum tokens
    inputCost: 2.50,              // $ per 1M input tokens
    outputCost: 10.00,            // $ per 1M output tokens
    url: 'https://...',           // Artificial Analysis link
    tokenRatio: 1.0               // Efficiency vs GPT baseline
  }
};
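Given this structure, the cost-relevant fields compose naturally into a single estimate. A sketch under the field names shown above (the `requestCost` helper is illustrative, not part of Tokenizador; the entry mirrors the GPT-4o values from this page):

```javascript
// Full cost estimate from a MODELS_DATA-style entry: apply tokenRatio
// to a GPT-baseline count, check the context limit, then price it.
const MODELS_DATA = {
  'gpt-4o': { name: 'GPT-4o', contextLimit: 128000, inputCost: 2.50, outputCost: 10.00, tokenRatio: 1.0 },
};

function requestCost(modelId, gptInputTokens, gptOutputTokens) {
  const m = MODELS_DATA[modelId];
  const inTok = Math.round(gptInputTokens * m.tokenRatio);
  const outTok = Math.round(gptOutputTokens * m.tokenRatio);
  if (inTok > m.contextLimit) throw new Error(`Input exceeds ${m.name} context limit`);
  return (inTok * m.inputCost + outTok * m.outputCost) / 1_000_000;
}

const cost = requestCost('gpt-4o', 4000, 1000); // (4000×$2.50 + 1000×$10.00) / 1M = $0.02
```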

External Resources

Artificial Analysis

Independent benchmarks and detailed model comparisons

OpenAI Tokenizer

Official OpenAI tokenization playground

Tiktoken Library

Open source tokenization library used by this tool

Model Pricing Updates

Track pricing changes across providers
Pricing Note: Model prices are updated regularly but may change. Always verify current pricing with the provider before production use.

Next Steps

How to Use

Learn how to analyze tokens with Tokenizador

Understanding Tokenization

Deep dive into tokenization concepts
