Tokens cost money. Tokens are context. Every token matters.
GLYPH eliminates JSON’s redundant syntax, achieving 40-60% token reduction on real-world data. This directly reduces LLM costs and frees up context window space.

Why Token Efficiency Matters

Cost Impact

For high-volume LLM applications:
JSON:     264 tokens/call × 1M calls = 264M tokens
GLYPH:    134 tokens/call × 1M calls = 134M tokens

Savings:  130M tokens (49% reduction)

At $0.50 per 1M tokens:
JSON cost:   $132.00
GLYPH cost:  $67.00
Money saved: $65.00 per million calls

Why LLMs Care About Tokens

Pricing

LLM APIs charge per token, not per byte. More tokens = higher cost.

Context Window

Limited context window means fewer tokens for conversation history and data.

Latency

More tokens = longer generation time. Fewer tokens = faster responses.

Quality

Compact data leaves more context for actual content, not syntax.

Benchmark Data

From the Codec Benchmark Report:

Real-World Token Counts

Data Shape                 JSON Tokens   GLYPH Tokens   Savings   Use Case
LLM message                10            6              40%       Chat messages
Tool call                  26            15             42%       Function calling
Conversation (25 msgs)     264           134            49%       Agent memory
Search results (25 rows)   456           220            52%       RAG responses
Search results (50 rows)   919           439            52%       Batch retrieval
Tool results (50 items)    562           214            62%       Batch operations

Average: 50%+ token savings on real-world data

Data Shape Impact

Token savings scale with data structure complexity:
Simple key-value pairs
JSON (45 tokens):
{"name":"Alice","age":28,"city":"NYC","active":true,"score":94.5}
GLYPH (30 tokens):
{active=t age=28 city=NYC name=Alice score=94.5}
Savings: 33%

How GLYPH Saves Tokens

Syntax Elimination

{
  "action": "search",
  "query": "AI agents",
  "limit": 5
}

Tokens (each quote mark and each piece of punctuation is typically its own token):
{ " action " : " search " , " query " : " AI agents " , " limit " : 5 }

= 22 tokens, most of them quotes, colons, and commas

The same payload in GLYPH drops the quotes and commas entirely:
{action=search query="AI agents" limit=5}

Token Savings Breakdown

Eliminated          JSON         GLYPH      Tokens Saved
Key quotes          "key"        key        2 per key
Commas              ,            (space)    1 per field
Boolean literals    true/false   t/f        2-3 per bool
Null literal        null         _          3 per null
Colon separators    : (quoted)   = (bare)   Subtle savings

Savings multiply across large datasets. A 25-message conversation saves 130 tokens (49%).
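The eliminations in the table can be sketched as a toy encoder. This is an illustration of the rules above, not the real glyph library; the sorted key order and the quote-only-when-spaced rule are assumptions for the sketch:

```python
import json

def to_glyph(value):
    """Toy GLYPH-style encoder illustrating the syntax eliminations above."""
    if value is None:
        return "_"                        # null -> _
    if isinstance(value, bool):           # check bool before int
        return "t" if value else "f"      # true/false -> t/f
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # bare strings when safe; quote only when they contain spaces
        return f'"{value}"' if " " in value else value
    if isinstance(value, list):
        return "[" + " ".join(to_glyph(v) for v in value) + "]"
    if isinstance(value, dict):
        # unquoted keys, '=' instead of ':', spaces instead of commas
        return "{" + " ".join(f"{k}={to_glyph(v)}" for k, v in sorted(value.items())) + "}"
    raise TypeError(f"unsupported type: {type(value)!r}")

data = {"name": "Alice", "age": 28, "city": "NYC", "active": True, "score": 94.5}
print(to_glyph(data))  # {active=t age=28 city=NYC name=Alice score=94.5}

# Character counts differ from token counts but show the same trend
print(len(json.dumps(data, separators=(",", ":"))), "chars as JSON vs",
      len(to_glyph(data)), "chars as GLYPH")
```

Note that it reproduces the simple key-value example from earlier in this page.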

When Savings Matter Most

High-Value Scenarios

System prompts are sent with every request, so token savings multiply by request count.

Example: Tool Definitions
JSON (180 tokens per tool):
{
  "name": "search",
  "description": "Search the web",
  "parameters": {
    "query": {"type": "string", "required": true},
    "limit": {"type": "integer", "minimum": 1, "maximum": 100}
  }
}
GLYPH (98 tokens per tool):
{name=search description="Search the web" parameters={query={type=string required=t} limit={type=integer minimum=1 maximum=100}}}
46% reduction per tool (82 tokens) × 10 tools × 1M requests = 820M tokens saved
Conversation history grows over time and consumes context window.

25-message conversation:
  • JSON: 264 tokens
  • GLYPH: 134 tokens
  • 49% reduction = 130 tokens freed for new messages
Result: Longer conversations without truncation
Large datasets see compound savings.

50 search results:
  • JSON: 919 tokens
  • GLYPH: 439 tokens
  • 52% reduction = 480 tokens saved
With tabular mode:
  • GLYPH tabular: 214 tokens
  • 77% reduction vs JSON = 705 tokens saved
Multiple agents sharing state benefit most.

Agent trace (50 steps):
  • JSON: 15,510 tokens
  • GLYPH: 14,656 tokens
  • GLYPH+Pool: 8,090 tokens
  • 48% reduction with string deduplication
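The "Pool" variant above deduplicates repeated strings. Conceptually, each repeated string is stored once and referenced thereafter; this toy sketch (including the `&N` reference notation) is an assumption about the idea, not GLYPH's actual pool encoding:

```python
def pool_encode(records):
    """Toy string pool: store each repeated string once, reference it by index."""
    pool, encoded = [], []
    index = {}  # string -> position in the pool
    for rec in records:
        row = {}
        for key, val in rec.items():
            if isinstance(val, str):
                if val not in index:
                    index[val] = len(pool)
                    pool.append(val)
                row[key] = f"&{index[val]}"  # reference into the pool
            else:
                row[key] = val
        encoded.append(row)
    return pool, encoded

steps = [
    {"tool": "search", "status": "ok"},
    {"tool": "search", "status": "ok"},
    {"tool": "fetch", "status": "ok"},
]
pool, encoded = pool_encode(steps)
print(pool)     # ['search', 'ok', 'fetch']
print(encoded)  # [{'tool': '&0', 'status': '&1'}, ...]
```

The longer and more repetitive the trace, the more each repeated string pays for its single pool entry, which is why the 50-step agent trace benefits so much.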

Real-World Examples

Example 1: Tool Call

{
  "tool": "search",
  "arguments": {
    "query": "GLYPH documentation",
    "max_results": 10,
    "sources": ["web", "docs"]
  }
}
Savings: 14 tokens (33%)
Impact: Tool calls are frequent in agent workflows. 1,000 tool calls save 14,000 tokens.

Example 2: Conversation Turn

From the benchmark report:
{"messages":[{"role":"user","content":"Hi"},{"role":"assistant","content":"Hello!"}],"model":"gpt-4"}
Savings: 14 tokens (47%)
Using abbreviated keys (r for role, c for content) amplifies savings further.

Example 3: Search Results

25 search results:
[
  {"id": "doc1", "title": "GLYPH Guide", "score": 0.95, "url": "..."},
  {"id": "doc2", "title": "API Docs", "score": 0.89, "url": "..."},
  ... // 23 more
]
JSON: 456 tokens. GLYPH: 220 tokens (52% savings, per the benchmark table above).

Token Counting

How to measure token savings in your application:
import json
import glyph
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # GPT-4 tokenizer

data = {"action": "search", "query": "test", "limit": 5}

# JSON tokens
json_str = json.dumps(data)
json_tokens = len(enc.encode(json_str))

# GLYPH tokens
glyph_str = glyph.json_to_glyph(data)
glyph_tokens = len(enc.encode(glyph_str))

# Calculate savings
savings = (json_tokens - glyph_tokens) / json_tokens * 100
print(f"Savings: {savings:.1f}%")

Optimization Tips

1. Use abbreviated keys

Shorten field names where semantics are clear:
{r=user c="Hi" t=1234567890}  // r=role, c=content, t=timestamp
2. Enable tabular mode

For homogeneous arrays, auto-tabular provides 50-70% savings:
glyph.json_to_glyph(data, tabular=True)
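The intuition behind tabular mode: per-object encoding repeats every key in every row of a homogeneous array, while a tabular layout states the keys once as a header. This sketch is conceptual only; the header notation is an assumption, and GLYPH's actual tabular syntax is covered in Tabular Mode:

```python
# 25 homogeneous rows, as in the "Search results" benchmark shape
rows = [{"id": f"doc{i}", "score": round(0.9 - i * 0.01, 2)} for i in range(25)]

# Per-object encoding: "id=" and "score=" appear in all 25 rows
per_object = " ".join(f'{{id={r["id"]} score={r["score"]}}}' for r in rows)

# Tabular-style encoding: keys appear once, then bare values per row
tabular = "[id score] " + " ".join(f'{r["id"]} {r["score"]}' for r in rows)

print(len(per_object), "chars per-object vs", len(tabular), "chars tabular")
```

The gap widens with more rows and longer key names, which is why 50-row batches show the largest savings.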
3. Remove redundant fields

Only include fields the LLM actually needs:
{status=active}  // not {status=active created=... updated=... version=...}
4. Use compact types

Prefer t/f over string booleans, integers over float when possible:
{active=t count=42}  // not {active="yes" count=42.0}

Cost-Benefit Analysis

Adoption Costs

Minimal integration effort
  • One-line change: json.dumps() → glyph.json_to_glyph()
  • Perfect JSON round-trip (no data loss)
  • Existing LLMs can read GLYPH with simple prompting

Savings at Scale

Scenario: 10K LLM calls/month, 200 tokens avg per call
  • JSON: 2M tokens/month
  • GLYPH: 1M tokens/month (50% reduction)
  • Saved: 1M tokens = $0.50/month
Annual savings: $6
Scenario: 1M LLM calls/month, 300 tokens avg per call
  • JSON: 300M tokens/month
  • GLYPH: 150M tokens/month (50% reduction)
  • Saved: 150M tokens = $75/month
Annual savings: $900
Scenario: 100M LLM calls/month, 500 tokens avg per call
  • JSON: 50B tokens/month
  • GLYPH: 25B tokens/month (50% reduction)
  • Saved: 25B tokens = $12,500/month
Annual savings: $150,000
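All three scenarios above follow from one formula: calls × average tokens × reduction × price per token. A quick sanity check, assuming the $0.50 per 1M tokens rate and 50% reduction used throughout:

```python
def monthly_savings_usd(calls, avg_tokens, reduction=0.50, usd_per_1m_tokens=0.50):
    """Dollar savings per month: tokens saved times price per token."""
    saved_tokens = calls * avg_tokens * reduction
    return saved_tokens / 1_000_000 * usd_per_1m_tokens

# The three scenarios from the cost-benefit analysis
for calls, tokens in [(10_000, 200), (1_000_000, 300), (100_000_000, 500)]:
    monthly = monthly_savings_usd(calls, tokens)
    print(f"{calls:>11,} calls/month: ${monthly:,.2f}/month, ${monthly * 12:,.2f}/year")
```

Plug in your own call volume, average payload size, and provider pricing to estimate your savings.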
At scale, token efficiency directly impacts profitability.

Summary

Average Savings

50%+ token reduction on real-world data

Cost Impact

Halve your LLM API costs for data serialization

Context Window

Fit nearly twice as much conversation history in the same window

Latency

Fewer tokens = faster generation and lower costs

Next Steps

Format Reference

Learn GLYPH syntax and type system

Streaming Validation

Save even more tokens with early cancellation

Tabular Mode

Maximize savings for array data

Benchmark Report

Full benchmark methodology and results
