What is TOON?
Token-Oriented Object Notation (TOON) is a compact, human-readable encoding of the JSON data model that minimizes tokens and makes structure easy for models to follow. It’s intended for LLM input as a drop-in, lossless representation of your existing JSON.The Problem: JSON is Token-Expensive
AI is becoming cheaper and more accessible, but LLM tokens still cost money. Standard JSON is verbose and repeats field names for every record:"id":, "name":, and "role": 100 times. This redundancy adds up quickly in larger datasets.
The TOON Solution: Declare Once, Stream Compact
TOON eliminates this redundancy by declaring structure once and streaming data as rows:Field Declaration
The
{id,name,role} header declares field names once for the entire arrayLength Validation
The
[2] declares array length, helping LLMs detect truncationCompact Rows
Each row is just comma-separated values—no repeated keys
CSV-like Efficiency
Achieves CSV-like compactness while remaining fully lossless JSON
How TOON Combines Formats
TOON strategically combines the best aspects of multiple formats:YAML-Style Indentation for Objects
YAML-Style Indentation for Objects
Nested objects use indentation instead of braces, eliminating Keys are followed by
{ and } tokens:: (colon-space), and values can be unquoted when safe.CSV-Style Tables for Uniform Arrays
CSV-Style Tables for Uniform Arrays
Arrays of objects with identical fields collapse into tabular format:This achieves CSV-like compactness while adding explicit structure that improves LLM reliability.
Inline Format for Primitive Arrays
Inline Format for Primitive Arrays
Simple arrays of strings, numbers, or booleans render inline:The
[3] length declaration provides validation, and values are comma-separated by default.Real-World Example
Here’s how TOON handles mixed data structures:55% token reduction: TOON uses 106 tokens vs JSON’s 235 tokens—a savings of 129 tokens (55%) for this example. The format automatically chooses the most efficient representation for each data structure.
Key Features
Token-Efficient & Accurate
TOON reaches 76.4% accuracy (vs JSON’s 75.0%) while using ~40% fewer tokens in mixed-structure benchmarks across 4 models
JSON Data Model
Encodes the same objects, arrays, and primitives as JSON with deterministic, lossless round-trips
LLM-Friendly Guardrails
Explicit
[N] lengths and {fields} headers give models a clear schema to follow, improving parsing reliabilityMinimal Syntax
Uses indentation instead of braces and minimizes quoting, giving YAML-like readability with CSV-style compactness
Tabular Arrays
Uniform arrays of objects collapse into tables that declare fields once and stream row values line by line
Multi-Language Ecosystem
Spec-driven implementations in TypeScript, Python, Go, Rust, .NET, and other languages
Design Philosophy
TOON’s design prioritizes specific goals:- Make uniform arrays compact — Declare structure once, stream data efficiently
- Stay fully lossless — Round-trips preserve all data:
decode(encode(x)) === x - Keep parsing simple — Explicit structure markers help both LLMs and humans
- Provide validation guardrails — Array lengths and field counts detect truncation and malformed output
How It Works
TOON automatically detects array structure and chooses the most efficient representation:- Detects that
usersis an array of objects - Verifies all objects have identical fields (
id,name,role) - Confirms all values are primitives (no nested objects/arrays)
- Chooses tabular format and generates the header
- Streams each object as a comma-separated row
Lossless round-trips: Decoding TOON back to JSON recovers the exact original structure. See Format Overview for complete syntax details.
What Makes TOON Different?
vs JSON
~40% fewer tokens for mixed-structure data. TOON eliminates redundant field names in arrays while staying lossless.
vs YAML
More compact for arrays. YAML uses
- key: value for each field. TOON uses tabular format to eliminate repetition.vs CSV
Supports nested structures. CSV only handles flat tables. TOON adds minimal overhead (~5-10%) while supporting full JSON.
vs XML
Dramatically more compact. XML’s verbose tags make it the least token-efficient format for most data structures.
Media Type & File Extension
By convention, TOON files use:- File extension:
.toon - Media type:
text/toon - Encoding: UTF-8 (always)
text/toon; charset=utf-8) but defaults to UTF-8 when omitted.
Next Steps
Format Overview
Learn the complete TOON syntax with detailed examples
When to Use TOON
Understand when TOON excels and when to use alternatives
Quick Start
Install the library and start encoding your first dataset
Benchmarks
See detailed performance comparisons across formats
