Tokens cost money. Tokens are context. Every token matters.
## Why Token Efficiency Matters

### Cost Impact

For high-volume LLM applications, token counts translate directly into cost, latency, and context budget.

### Why LLMs Care About Tokens
- **Pricing**: LLM APIs charge per token, not per byte. More tokens mean higher cost.
- **Context window**: A limited context window means fewer tokens left for conversation history and data.
- **Latency**: More tokens mean longer generation time; fewer tokens mean faster responses.
- **Quality**: Compact data leaves more context for actual content, not syntax.
### Benchmark Data

From the Codec Benchmark Report:

#### Real-World Token Counts
| Data Shape | JSON Tokens | GLYPH Tokens | Savings | Use Case |
|---|---|---|---|---|
| LLM message | 10 | 6 | 40% | Chat messages |
| Tool call | 26 | 15 | 42% | Function calling |
| Conversation (25 msgs) | 264 | 134 | 49% | Agent memory |
| Search results (25 rows) | 456 | 220 | 52% | RAG responses |
| Search results (50 rows) | 919 | 439 | 52% | Batch retrieval |
| Tool results (50 items) | 562 | 214 | 62% | Batch operations |
Average: 50%+ token savings on real-world data
### Data Shape Impact

Token savings scale with data structure complexity: flat objects (simple key-value pairs) see around 33% savings, while nested objects, arrays, and especially tabular data save progressively more.
## How GLYPH Saves Tokens

### Syntax Elimination

#### Token Savings Breakdown
| Eliminated | JSON | GLYPH | Tokens Saved |
|---|---|---|---|
| Key quotes | "key" | key | 2 per key |
| Commas | , | space | 1 per field |
| Boolean literals | true/false | t/f | 2-3 per bool |
| Null literal | null | _ | 3 per null |
| Colon separators | : (quoted) | = (bare) | Subtle savings |
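Taken together, these rules can be sketched as a toy encoder for flat objects. This is illustrative only: `to_glyph_flat` is a hypothetical name, and the real GLYPH library also handles nesting, escaping, and tabular mode.

```python
import json

def to_glyph_flat(obj):
    """Toy encoder for flat objects, following the rules in the table above:
    bare keys, '=' separators, space-delimited fields, t/f booleans, '_' for
    null. Simplification: assumes string values contain no spaces."""
    parts = []
    for key, value in obj.items():
        if value is None:
            encoded = "_"
        elif value is True:
            encoded = "t"
        elif value is False:
            encoded = "f"
        else:
            encoded = str(value)
        parts.append(f"{key}={encoded}")
    return " ".join(parts)

msg = {"role": "user", "content": "hello", "cached": True, "name": None}
print(json.dumps(msg))     # {"role": "user", "content": "hello", "cached": true, "name": null}
print(to_glyph_flat(msg))  # role=user content=hello cached=t name=_
```

Every quoted key, comma, and spelled-out literal in the JSON line disappears in the compact form.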
## When Savings Matter Most

### High-Value Scenarios

#### System Prompts

System prompts are sent with every request, so token savings multiply by request count.

**Example: tool definitions.** A 46% reduction per tool × 10 tools × 1M requests = massive savings.
#### Conversation History

Conversation history grows over time and consumes the context window. For a 25-message conversation:
- JSON: 264 tokens
- GLYPH: 134 tokens
- 49% reduction = 130 tokens freed for new messages
#### Batch Operations

Large datasets see compound savings. For 50 search results:
- JSON: 919 tokens
- GLYPH: 439 tokens
- 52% reduction = 480 tokens saved
- GLYPH tabular: 214 tokens
- 77% reduction vs JSON = 705 tokens saved
#### Multi-Agent Systems

Multiple agents sharing state benefit as well. For an agent trace of 50 steps:
- JSON: 15,510 tokens
- GLYPH: 14,656 tokens
- GLYPH+Pool: 8,090 tokens
- 48% reduction with string deduplication
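The deduplication idea behind GLYPH+Pool can be illustrated with a minimal sketch (`pool_strings` is a hypothetical helper; the actual wire format is not shown here): repeated strings are stored once in a shared pool and referenced by index.

```python
def pool_strings(values):
    """Replace repeated strings with indices into a shared pool,
    so each distinct string is serialized only once."""
    pool, refs, index = [], [], {}
    for v in values:
        if v not in index:
            index[v] = len(pool)
            pool.append(v)
        refs.append(index[v])
    return pool, refs

# Agent traces repeat tool names, roles, and status strings heavily:
pool, refs = pool_strings(["search", "ok", "search", "ok", "fetch", "ok"])
print(pool, refs)  # ['search', 'ok', 'fetch'] [0, 1, 0, 1, 2, 1]
```

The longer the trace, the more each repeated string pays for its single pool entry, which is why the savings appear at 50 steps.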
## Real-World Examples

### Example 1: Tool Call
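The benchmark measures a tool call at 26 JSON tokens versus 15 GLYPH tokens. As an illustration, here is a comparable payload side by side; the payload fields and the GLYPH nesting syntax shown are assumptions, so consult the Format Reference for the exact grammar.

```python
import json

# A typical function-call payload (field names are illustrative).
tool_call = {
    "name": "get_weather",
    "arguments": {"city": "Paris", "units": "metric", "verbose": False},
}

as_json = json.dumps(tool_call)
# Hand-written GLYPH-style rendering of the same payload, applying the
# syntax rules above (bare keys, '=' pairs, f for false); the brace-based
# nesting shown here is an assumption.
as_glyph = "name=get_weather arguments{city=Paris units=metric verbose=f}"

print(len(as_json), len(as_glyph))  # the GLYPH form is markedly shorter
```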
### Example 2: Conversation Turn
From the benchmark report: using abbreviated keys (`r` for role, `c` for content) amplifies savings.

### Example 3: Search Results

For 25 search results, the benchmark compares JSON, regular GLYPH, and GLYPH tabular encodings (see the table above: 456 tokens for JSON vs 220 for GLYPH).
## Token Counting

How to measure token savings in your application:

### Optimization Tips
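To measure actual savings, count tokens for both encodings of the same data. A sketch, with a stdlib-only fallback (the `count_tokens` helper, the sample record, and the hand-written GLYPH string are all illustrative; `tiktoken` is an optional exact path for OpenAI-style tokenizers):

```python
import json

def count_tokens(text):
    """Exact count via tiktoken if installed, otherwise a rough
    ~4-characters-per-token estimate (English-text heuristic)."""
    try:
        import tiktoken  # optional dependency
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        return max(1, len(text) // 4)

record = {"id": 1, "title": "glyph-spec", "score": 0.97}
as_json = json.dumps(record)
# GLYPH rendering written by hand for illustration, per the syntax rules above:
as_glyph = "id=1 title=glyph-spec score=0.97"

savings = 1 - count_tokens(as_glyph) / count_tokens(as_json)
print(f"{savings:.0%} fewer tokens")
```

Run the comparison on your own payloads; savings vary with data shape, so measure before and after switching.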
## Cost-Benefit Analysis
### Adoption Costs

Integration effort is minimal:

- One-line change: `json.dumps()` → `glyph.json_to_glyph()`
- Perfect JSON round-trip (no data loss)
- Existing LLMs can read GLYPH with simple prompting
### Savings at Scale

#### Small Application

Scenario: 10K LLM calls/month, 200 tokens average per call
- JSON: 2M tokens/month
- GLYPH: 1M tokens/month (50% reduction)
- Saved: 1M tokens = $0.50/month
#### Medium Application

Scenario: 1M LLM calls/month, 300 tokens average per call
- JSON: 300M tokens/month
- GLYPH: 150M tokens/month (50% reduction)
- Saved: 150M tokens = $75/month
#### Large Application

Scenario: 100M LLM calls/month, 500 tokens average per call
- JSON: 50B tokens/month
- GLYPH: 25B tokens/month (50% reduction)
- Saved: 25B tokens = $12,500/month
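All three scenarios apply one formula: calls × average tokens per call × reduction × price. A small helper makes the arithmetic explicit (`monthly_savings_usd` is a hypothetical name; the default of $0.50 per 1M tokens is the rate implied by the figures above, so plug in your provider's actual pricing):

```python
def monthly_savings_usd(calls, avg_tokens, reduction=0.50, usd_per_million=0.50):
    """Dollars saved per month from a given token reduction."""
    tokens_saved = calls * avg_tokens * reduction
    return tokens_saved / 1_000_000 * usd_per_million

print(monthly_savings_usd(10_000, 200))       # 0.5     (small app)
print(monthly_savings_usd(1_000_000, 300))    # 75.0    (medium app)
print(monthly_savings_usd(100_000_000, 500))  # 12500.0 (large app)
```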
## Summary

- **Average savings**: 50%+ token reduction on real-world data
- **Cost impact**: halve your LLM API costs for data serialization
- **Context window**: fit 93% more conversation history in the same window
- **Latency**: fewer tokens mean faster generation and lower costs
## Next Steps

- **Format Reference**: learn GLYPH syntax and the type system
- **Streaming Validation**: save even more tokens with early cancellation
- **Tabular Mode**: maximize savings for array data
- **Benchmark Report**: full benchmark methodology and results