Skip to main content

Specification ID

Spec ID: glyph-loose-1.0.0
Date: 2026-01-13
Status: Stable (Frozen)
Canonical output for any input in the test corpus is frozen and will not change.

Overview

GLYPH-Loose is the schema-optional subset of GLYPH. It provides a deterministic canonical representation for JSON-like data, suitable for hashing, caching, and cross-language interoperability.

Design Goals

  1. Drop-in JSON replacement - Any valid JSON is valid GLYPH-Loose input
  2. Deterministic canonical form - Same data always produces same output
  3. Cross-language parity - Go, JS, and Python implementations produce identical output
  4. Compact - More token-efficient than JSON for LLM contexts

Canonical Rules

Scalars

TypeCanonical FormExamples
null__ (accepts , null on input)
boolt / ft, f
intDecimal, no leading zeros0, 42, -100
floatShortest roundtrip, e (not E)3.14, 1e-06, 1e+15
stringBare if safe, else quotedhello, "hello world"

Float Formatting

  • Zero: Always 0 (not -0, not 0.0)
  • Negative zero: Canonicalizes to 0
  • Exponent threshold: Use exponential when exp < -4 or exp >= 15
  • Exponent format: 2-digit minimum (1e-06, not 1e-6)
  • NaN/Infinity: Rejected with error (not JSON-compatible)
Examples:
0.000001  →  1e-06
100000000000000  →  1e+14
0.0001  →  0.0001  (no exponent)
10000000000000000  →  1e+16  (exponent)

String Bare-Safe Rule

A string is “bare-safe” (unquoted) if:
  1. Non-empty
  2. First character: Unicode letter or _
  3. Remaining characters: Unicode letter, digit, _, -, ., /
  4. Not a reserved word: t, f, true, false, null, none, nil
Otherwise, the string is quoted with minimal escapes. Examples:
hello       → hello
hello_world → hello_world
hello-2.0   → hello-2.0
file.txt    → file.txt
src/main.go → src/main.go

hello world → "hello world"  (space)
123abc      → "123abc"       (starts with digit)
true        → "true"         (reserved word)

Containers

TypeCanonical Form
list[ + space-separated elements + ]
map{ + sorted key=value pairs + }
Examples:
[]
[1 2 3]
[_ t 42 hello]
{}
{a=1}
{a=1 b=2 c=3}
{name=Alice age=30 active=t}

Key Ordering

Map keys are sorted by bytewise UTF-8 comparison of their canonical string form.
Input:  {"b":1,"a":2,"aa":3,"A":4,"_":5}
Output: {A=4 _=5 a=2 aa=3 b=1}
UTF-8 byte order: A (0x41) < _ (0x5F) < a (0x61) < …

Duplicate Keys

Last-wins policy: When a JSON object has duplicate keys, the last value is used.
Input:  {"k":1,"k":2,"k":3}
Output: {k=3}

Type Mappings

JSON Bridge

Input (JSON → GLYPH)

gv, err := glyph.FromJSONLoose(jsonBytes)
  • Accepts any valid JSON
  • Rejects NaN/Infinity (returns error)
  • Integers within ±2^53 become int, others become float

Output (GLYPH → JSON)

jsonBytes, err := glyph.ToJSONLoose(gv)
  • Produces valid JSON
  • IDs become "^prefix:value" strings
  • Times become ISO-8601 strings
  • Bytes become base64 strings

Extended Mode

With BridgeOpts{Extended: true}:
{"$glyph": "time", "value": "2025-01-13T12:00:00Z"}
{"$glyph": "id", "value": "^user:abc123"}
{"$glyph": "bytes", "base64": "SGVsbG8="}

CLI Usage

# Format JSON as canonical GLYPH-Loose
echo '{"b":1,"a":2}' | glyph fmt-loose
# Output: {a=2 b=1}

# Convert to pretty JSON
echo '{"b":1,"a":2}' | glyph to-json
# Output:
# {
#   "a": 2,
#   "b": 1
# }

# File input
glyph fmt-loose data.json

# LLM mode (ASCII-safe nulls)
echo '{"value":null}' | glyph fmt-loose --llm
# Output: {value=_}

# Compact mode with schema header
echo '{"action":"search","query":"test"}' | glyph fmt-loose --compact
# Output:
# @schema#<hash> keys=[action query]
# {#0=search #1=test}

Schema Extensions

@open Structs

The @open annotation allows a struct type to accept fields not defined in the schema. Schema Definition:
schema := NewSchemaBuilder().
    AddOpenStruct("Config", "v1",
        Field("name", PrimitiveType("str")),
        Field("port", PrimitiveType("int")),
    ).
    Build()
Canonical Schema Text:
Config:v1 @open struct{
    name: str
    port: int
}
Behavior:
Struct TypeUnknown FieldValidation Result
Closed (default)PresentError: unknown_field
@openPresentPass (warning logged)
@open + StrictPresentError: unknown_field

map<K,V> Validation

Map values are validated against the specified value type:
schema := NewSchemaBuilder().
    AddStruct("Config", "v1",
        Field("settings", MapType(PrimitiveType("str"), PrimitiveType("int"))),
    ).
    Build()

// This passes - all values are ints
value := Map(
    MapEntry{Key: "timeout", Value: Int(30)},
    MapEntry{Key: "port", Value: Int(8080)},
)

// This fails - "name" value is string, not int
value := Map(
    MapEntry{Key: "timeout", Value: Int(30)},
    MapEntry{Key: "name", Value: Str("myapp")}, // type_mismatch error
)

Auto-Tabular Mode

Auto-Tabular mode provides compact representation for homogeneous lists of objects.

Syntax

@tab _ [col1 col2 col3]
|val1|val2|val3|
|val4|val5|val6|
@end
  • Header: @tab _ followed by sorted column names in brackets
  • Rows: Pipe-delimited cells, one row per line
  • Footer: @end marker

Enabling Auto-Tabular

// Default: auto-tabular disabled
canonical := glyph.CanonicalizeLoose(value)

// With auto-tabular enabled
canonical := glyph.CanonicalizeLooseTabular(value)

// Custom options
opts := glyph.TabularLooseCanonOpts()
opts.MinRows = 5  // Only tabularize 5+ rows
canonical := glyph.CanonicalizeLooseWithOpts(value, opts)

Eligibility Criteria

A list qualifies for tabular emission when:
  1. Contains ≥ MinRows elements (default: 3)
  2. All elements are maps or structs
  3. Total column count ≤ MaxCols (default: 20)
  4. When AllowMissing=false, all rows must have identical keys

Options Reference

OptionTypeDefaultDescription
AutoTabularbooltrueEnable auto-tabular detection
MinRowsint3Minimum rows to trigger tabular
MaxColsint20Maximum columns allowed
AllowMissingbooltrueAllow rows with missing keys
NullStyleNullStyleunderscoresymbol for ∅, underscore for _
SchemaRefstring""Schema hash/id for @schema header
KeyDict[]stringnilKey dictionary for compact keys
UseCompactKeysboolfalseEmit #N instead of field names

Schema Context & Key Dictionaries

For repeated structured outputs, schema contexts enable significant token savings through compact key encoding.

Schema Directive Format

Inline schema (self-contained):
@schema#abc123 @keys=[action query confidence sources]
{#0=search #1="weather NYC" #2=0.95 #3=[web news]}
External schema ref:
@schema#abc123
{#0=search #1="weather NYC" #2=0.95 #3=[web news]}
Clear active schema:
@schema.clear
{action=search query="weather NYC"}

Usage

// Create schema context
schema := glyph.NewSchemaContext([]string{"role", "content", "tool_calls"})

// Emit with schema
opts := glyph.SchemaLooseCanonOpts(schema)
output := glyph.CanonicalizeLooseWithSchema(value, opts)

// Parse with registry
registry := glyph.NewSchemaRegistry()
parsed, ctx, err := glyph.ParseLoosePayload(input, registry)

Conformance Testing

The test corpus at testdata/loose_json/ contains 50 cases covering:
  • Deep nesting (10-20 levels)
  • Unicode (surrogates, CJK, emoji)
  • Edge numbers (boundaries, precision)
  • Key ordering (stability, unicode)
  • Duplicate keys
  • Reserved words
  • Control characters
Golden files at testdata/loose_json/golden/ anchor expected canonical output. Cross-implementation tests verify Go, JS, and Python produce byte-identical canonical forms.

Upgrade Path

GLYPH-Loose is the foundation. When you need schema features:
  1. Add a schema → enables packed encoding, FID-based parsing
  2. Use @open structs → collect unknown fields safely
  3. Use map<K,V> → validate dynamic keys
  4. Use patches → efficient incremental updates
The canonical form remains stable across modes.

Build docs developers (and LLMs) love