
Overview

GS1 (GLYPH Stream v1) is a stream framing protocol for transporting a sequence of GLYPH payloads over streaming transports.
Spec ID: gs1-1.0.0
Status: Frozen (v1.0)
Date: 2025-06-20

What GS1 Provides

Stream Multiplexing

Multiple logical streams over one connection (SID-based routing)

Sequence Tracking

Monotonic sequence numbers per stream for ordering and gap detection

Integrity Checking

CRC-32 checksums to detect transmission corruption

State Verification

SHA-256 base hashes for safe patch application

Supported Transports

  • TCP sockets
  • WebSocket (text and binary frames)
  • Server-Sent Events (SSE)
  • Unix pipes
  • Files (log replay, batch processing)

Relationship to GLYPH

GS1 does not modify GLYPH syntax or canonicalization.
A GS1 frame contains:
  • A GS1 header (stream metadata)
  • A payload that is valid GLYPH text (UTF-8 bytes)
Canonicalization, schema validation, and patch semantics are properties of the payload, not of GS1.
Key principle: A GS1 implementation MUST NOT require changes to the GLYPH parser. The GS1 reader is a separate layer that outputs payloadBytes to the normal GLYPH decoder.

Terminology

Term   Definition
Frame  One message in the stream (header + payload)
SID    Stream identifier (multiplex key)
SEQ    Per-SID sequence number (monotonic)
KIND   Semantic category of the payload
BASE   Optional state hash for patch safety
CRC    Optional CRC-32 checksum for integrity

Frame Kinds

Value  Name   Meaning
0      doc    Snapshot or general GLYPH document/value
1      patch  GLYPH patch doc (@patch ... @end)
2      row    Single row value (streaming tabular data)
3      ui     UI event value (progress/log/artifact refs)
4      ack    Acknowledgement (usually no payload)
5      err    Error event (payload describes error)
6      ping   Keepalive / liveness check
7      pong   Ping response
Implementations MUST accept unknown kinds and surface them as unknown(<byte>).
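
The rule for unknown kinds can be sketched as a small lookup. This is a minimal illustration, not an official API: the names KIND_NAMES and kind_label are hypothetical, and rendering the byte in decimal inside unknown(...) is an assumption, since the spec text does not fix the notation.

```python
# Values and names are taken from the Frame Kinds table.
KIND_NAMES = {
    0: "doc", 1: "patch", 2: "row", 3: "ui",
    4: "ack", 5: "err", 6: "ping", 7: "pong",
}


def kind_label(value: int) -> str:
    """Return the kind name, or unknown(<byte>) for an unassigned value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("kind must fit in one byte")
    # Assumption: the byte is shown in decimal; the spec does not say.
    return KIND_NAMES.get(value, f"unknown({value})")
```

A receiver built this way keeps processing a stream that carries kinds introduced after it was written, instead of failing on them.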

GS1-T (Text Framing)

GS1-T is the text-based wire format, suitable for SSE, WebSocket text frames, logs, and debugging.

Frame Structure

@frame{key=value key=value ...}\n
<exactly len bytes of payload>\n
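
As a rough illustration of this structure, the sketch below builds a GS1-T frame that carries only the required keys. The function name encode_frame is hypothetical and not part of any GLYPH library. Note that len is taken from the encoded payload bytes, not from a character count.

```python
def encode_frame(sid: int, seq: int, kind: str, payload: bytes) -> bytes:
    """Build one GS1-T frame: header line, payload bytes, trailing newline."""
    # len counts bytes, so it must be computed on the encoded payload.
    header = f"@frame{{v=1 sid={sid} seq={seq} kind={kind} len={len(payload)}}}\n"
    return header.encode("utf-8") + payload + b"\n"


frame = encode_frame(0, 0, "doc", b"{}")
```

For a payload with non-ASCII text, the byte length differs from the character count, which is why the payload is passed in as bytes.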

Header Grammar

The header line starts with @frame{ and ends with }\n. Inside {} is a space-separated or comma-separated list of key=value pairs.
Required keys:
Key   Type          Description
v     uint8         Protocol version (MUST be 1)
sid   uint64        Stream identifier
seq   uint64        Sequence number (per-SID, monotonic)
kind  string/uint8  Frame kind (name or number)
len   uint32        Payload length in bytes
Optional keys:
Key    Type    Description
crc    string  CRC-32 of payload: crc32:<8hex> or <8hex>
base   string  State hash: sha256:<64hex>
final  bool    End-of-stream marker for this SID
flags  uint8   Bitmask (hex)
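
The header grammar can be sketched as a small parser. This is an illustration under stated assumptions, not the reference implementation: parse_header is a hypothetical name, values are assumed to contain no spaces or commas, and optional keys are returned as raw strings without further validation.

```python
import re

REQUIRED_KEYS = ("v", "sid", "seq", "kind", "len")


def parse_header(line: str) -> dict:
    """Parse one GS1-T header line into a dict of fields."""
    line = line.rstrip("\n")
    if not (line.startswith("@frame{") and line.endswith("}")):
        raise ValueError("not a GS1-T header line")
    body = line[len("@frame{"):-1]
    fields = {}
    # Pairs may be separated by spaces or commas.
    for pair in re.split(r"[ ,]+", body.strip()):
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed pair: {pair!r}")
        fields[key] = value
    missing = [key for key in REQUIRED_KEYS if key not in fields]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    for key in ("v", "sid", "seq", "len"):
        fields[key] = int(fields[key])  # raises ValueError on non-numeric input
    if fields["v"] != 1:
        raise ValueError("unsupported protocol version")
    return fields


header = parse_header("@frame{v=1 sid=1 seq=0 kind=doc len=46 crc=a1b2c3d4}\n")
```

The kind field is left as a string here because the grammar allows either a name or a number.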

Payload Reading Rule (Critical)

Receiver MUST read payload as raw bytes using len.
Receiver MUST NOT parse payload boundaries using delimiters.
This is critical for correctness. The payload may contain @frame{ sequences or other patterns that look like frame boundaries. Always read exactly len bytes.
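
A minimal sketch of length-based reading follows. The function read_frame is a hypothetical helper, the len lookup is deliberately simplified, and the sample payload is illustrative only (it has not been checked against the GLYPH grammar). The point is that a payload containing @frame{ is still read correctly because only len decides where it ends.

```python
import io
import re


def read_frame(stream):
    """Read one GS1-T frame from a binary stream; return None at EOF."""
    header = stream.readline()
    if not header:
        return None
    match = re.search(rb"(?:^|[{ ,])len=(\d+)", header)
    if match is None:
        raise ValueError("header has no len key")
    length = int(match.group(1))
    payload = stream.read(length)  # exactly len bytes, never delimiter-based
    if len(payload) != length:
        raise ValueError("truncated payload")
    if stream.read(1) != b"\n":
        raise ValueError("missing newline after payload")
    return header.decode("utf-8").rstrip("\n"), payload


# The first payload embeds text that looks like a frame header.
tricky = b'Note{text="@frame{v=1 sid=9}"}'
wire = b"@frame{v=1 sid=0 seq=0 kind=doc len=%d}\n%s\n" % (len(tricky), tricky)
wire += b"@frame{v=1 sid=0 seq=1 kind=ack len=0}\n\n"

stream = io.BytesIO(wire)
first = read_frame(stream)
second = read_frame(stream)
```

A reader that scanned for the next @frame{ instead would have cut the first payload short.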

Example Frames

# Minimal frame
@frame{v=1 sid=0 seq=0 kind=doc len=2}
{}

# Document frame with CRC
@frame{v=1 sid=1 seq=0 kind=doc len=46 crc=a1b2c3d4}
Match{home=Arsenal away=Liverpool score=[2 1]}

# Patch frame with base hash
@frame{v=1 sid=1 seq=1 kind=patch len=25 base=sha256:abc123...}
@patch
= score [3 1]
@end

# UI progress event
@frame{v=1 sid=1 seq=2 kind=ui len=35}
Progress{pct=0.75 msg="processing"}

# Acknowledgement (no payload)
@frame{v=1 sid=1 seq=10 kind=ack len=0}

CRC-32 Integrity Checking

When crc is present:
  • Algorithm: CRC-32 IEEE (reflected polynomial 0xEDB88320, the variant computed by zlib)
  • Input: Payload bytes as transmitted
  • Format in GS1-T: crc=<8 lowercase hex digits> or crc=crc32:<8hex>
Receiver MUST verify CRC if present and reject frame on mismatch.

Example: Computing CRC-32

import zlib

payload = b"Match{home=Arsenal away=Liverpool score=[2 1]}"
crc = zlib.crc32(payload) & 0xffffffff
crc_hex = f"{crc:08x}"
print(f"crc={crc_hex}")  # prints crc= followed by 8 lowercase hex digits
CRC-32 is not cryptographic. It only detects accidental corruption. Use TLS for transport security.
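
On the receiving side, verification might look like the sketch below. The helper name crc_matches is hypothetical. It accepts both header forms described above (bare hex and the crc32: prefix) and compares against the CRC-32 of the payload bytes as received.

```python
import zlib


def crc_matches(payload: bytes, crc_field: str) -> bool:
    """Check a payload against a GS1-T crc value (with or without crc32: prefix)."""
    prefix = "crc32:"
    hex_part = crc_field[len(prefix):] if crc_field.startswith(prefix) else crc_field
    if len(hex_part) != 8:
        raise ValueError("crc must be 8 hex digits")
    expected = int(hex_part, 16)  # raises ValueError on non-hex input
    return (zlib.crc32(payload) & 0xFFFFFFFF) == expected


payload = b"{}"
good = f"{zlib.crc32(payload) & 0xFFFFFFFF:08x}"
```

A receiver would reject the frame whenever crc_matches returns False.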

SHA-256 Base Hash for Patch Safety

When base is present:
  • Algorithm: SHA-256
  • Input: CanonicalizeStrict(stateDoc) or CanonicalizeLoose(stateDoc)
  • Format in GS1-T: base=sha256:<64 lowercase hex digits>

State Hash Definition

base = sha256( Canonicalize(stateDoc) )
Sender and receiver MUST agree on canonicalization mode (Strict vs Loose).

Patch Application Rule

For kind=patch frames with base:
  • Receiver MUST NOT apply patch if receiverStateHash != base
  • On mismatch, receiver SHOULD:
    • Request a doc snapshot, OR
    • Emit an err frame, OR
    • Emit an ack with failure payload
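
The check itself can be sketched without any GLYPH library, assuming the caller already holds the canonical bytes of its current state. The names state_hash and can_apply_patch are hypothetical, and treating a frame without base as applicable is an assumption drawn from the rule being scoped to frames that carry base.

```python
import hashlib
from typing import Optional


def state_hash(canonical_state: bytes) -> str:
    """Format the SHA-256 of canonical state bytes as a GS1-T base value."""
    return "sha256:" + hashlib.sha256(canonical_state).hexdigest()


def can_apply_patch(canonical_state: bytes, base: Optional[str]) -> bool:
    """Return True only if the receiver's state hash equals the frame's base."""
    if base is None:
        # Assumption: the rule only constrains frames that carry base.
        return True
    return state_hash(canonical_state) == base


current = b"State{count=5}"  # illustrative canonical bytes
base = state_hash(current)
```

When can_apply_patch returns False, the receiver would take one of the recovery actions listed above instead of applying the patch.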

Example: Safe Patch Application

import hashlib
import logging

import glyph
from glyph import stream

logger = logging.getLogger(__name__)

# Assumed to be set up elsewhere: `writer` (GS1 frame writer), `handler`
# (GS1 frame handler), `patch_bytes` (encoded patch payload), and the
# application helpers `apply_patch` and `request_full_state`.

# Sender: compute base hash over the emitted form of the current state
current_state = {"count": 5, "status": "active"}
canonical = glyph.emit(glyph.from_json(current_state))
base_hash = hashlib.sha256(canonical.encode()).hexdigest()

# Send patch with base
writer.write_frame(
    sid=1,
    seq=5,
    kind="patch",
    payload=patch_bytes,
    base=f"sha256:{base_hash}",
)

# Receiver: the handler verifies base before this callback runs
@handler.on_patch
def handle_patch(sid, seq, payload, state):
    patch = glyph.parse_patch(payload)
    new_state = apply_patch(state.value, patch)
    handler.cursor.set_state(sid, new_state)
    return new_state

# Receiver: called instead of on_patch when the hashes differ
@handler.on_base_mismatch
def handle_mismatch(sid, frame):
    logger.warning("State mismatch on sid %s", sid)
    request_full_state(sid)

Ordering and Acknowledgement

SEQ Monotonicity

For each sid:
  • seq MUST increase by exactly 1 from each frame to the next
  • Receivers SHOULD detect gaps and handle appropriately
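
Gap detection can be sketched as a per-SID counter. The class SeqTracker and its return labels are hypothetical. Two behaviours here are assumptions the spec text does not settle: the first seq seen on a SID is accepted as the starting point, and after a gap the tracker resynchronises to the new position.

```python
class SeqTracker:
    """Track the next expected seq for each sid and classify incoming frames."""

    def __init__(self):
        self._next = {}  # sid -> next expected seq

    def check(self, sid: int, seq: int) -> str:
        expected = self._next.get(sid)
        # Assumption: the first frame on a sid defines the starting seq.
        if expected is None or seq == expected:
            self._next[sid] = seq + 1
            return "ok"
        if seq > expected:
            # Frames were skipped; report it and resynchronise.
            self._next[sid] = seq + 1
            return "gap"
        return "stale"  # duplicate or out-of-order frame


tracker = SeqTracker()
results = [tracker.check(1, 0), tracker.check(1, 1), tracker.check(1, 3),
           tracker.check(1, 2), tracker.check(2, 7)]
```

What to do on "gap" or "stale" is left to the application, for example requesting a fresh doc snapshot.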

ACK Frames

  • kind=ack acknowledges receipt of (sid, seq)
  • ack frames typically have len=0
  • ack with payload may carry error/status details

FINAL Flag

  • final=true indicates no more frames for this sid
  • Receiver may clean up per-SID state
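
One way to act on final is to drop the per-SID bookkeeping as soon as the flag is seen. The sketch below uses a hypothetical StreamTable class and tracks only a frame count and the last seq, which stand in for whatever state a real receiver keeps.

```python
class StreamTable:
    """Hold per-SID bookkeeping and release it when a stream ends."""

    def __init__(self):
        self.state = {}  # sid -> per-stream bookkeeping

    def on_frame(self, sid: int, seq: int, final: bool = False):
        entry = self.state.setdefault(sid, {"last_seq": None, "frames": 0})
        entry["last_seq"] = seq
        entry["frames"] += 1
        if final:
            # No more frames will arrive for this sid, so free its state.
            return self.state.pop(sid)
        return None


table = StreamTable()
table.on_frame(1, 0)
summary = table.on_frame(1, 1, final=True)
```

Returning the removed entry lets the caller log or hand off a summary before the state disappears.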

Stream Multiplexing

Use stream IDs (SID) to multiplex multiple logical streams over one connection.

Example: Multi-Agent Coordination

import glyph

# `writer` (a GS1 frame writer) and `search_results` are assumed to exist already.

# Coordinator assigns SIDs
PLANNER_SID = 1
EXECUTOR_SID = 2
CRITIC_SID = 3

# Planner sends task to executor
writer.write_frame(
    sid=EXECUTOR_SID,
    seq=0,
    kind="doc",
    payload=glyph.emit(glyph.struct("Task",
        action="search",
        query="latest AI news",
    ))
)

# Executor sends result back to planner
writer.write_frame(
    sid=PLANNER_SID,
    seq=0,
    kind="doc",
    payload=glyph.emit(glyph.struct("Result",
        task_id=1,
        status="complete",
        data=search_results,
    ))
)

# Critic sends feedback to executor
writer.write_frame(
    sid=EXECUTOR_SID,
    seq=1,
    kind="doc",
    payload=glyph.emit(glyph.struct("Feedback",
        task_id=1,
        score=0.8,
        suggestion="Include source URLs",
    ))
)
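
The receiving side of multiplexing can be sketched as a simple grouping step. The function demultiplex is a hypothetical helper that takes already-decoded (sid, seq, payload) tuples, so it does not depend on any particular GS1 reader API.

```python
from collections import defaultdict


def demultiplex(frames):
    """Group (sid, seq, payload) tuples by sid, preserving arrival order."""
    streams = defaultdict(list)
    for sid, seq, payload in frames:
        streams[sid].append((seq, payload))
    return dict(streams)


# Frames from two logical streams interleaved on one connection.
incoming = [(2, 0, b"a"), (1, 0, b"b"), (2, 1, b"c")]
by_sid = demultiplex(incoming)
```

Each per-SID list can then be handed to the component that owns that stream, with its own sequence tracking.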

Security Considerations

  • CRC-32 is not cryptographic - it only detects accidental corruption
  • base hash prevents accidental state drift but is not authentication
  • Implementations MUST enforce maximum len (recommended: 64 MiB)
  • Use TLS for transport security - GS1 does not provide encryption
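
Enforcing the maximum len before allocating a buffer might look like the sketch below. The names MAX_PAYLOAD_LEN and check_len are hypothetical; the 64 MiB default follows the recommendation above and would normally be configurable.

```python
MAX_PAYLOAD_LEN = 64 * 1024 * 1024  # 64 MiB, the recommended ceiling


def check_len(declared_len: int, limit: int = MAX_PAYLOAD_LEN) -> int:
    """Validate a header's len before reading or allocating the payload."""
    if declared_len < 0 or declared_len > limit:
        raise ValueError(f"len {declared_len} outside allowed range 0..{limit}")
    return declared_len
```

Running this check on the parsed header, before the payload read, keeps a hostile or corrupted len from triggering a very large allocation.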

Conformance Checklist

A GS1 implementation is conformant if it:
  1. ✅ Correctly reads/writes GS1-T frames per this spec
  2. ✅ Enforces len limits
  3. ✅ Verifies CRC-32 when present
  4. ✅ Parses base hash correctly
  5. ✅ Exposes (sid, seq, kind, payloadBytes, base?, crc?) to caller
  6. ✅ Does not require GLYPH parser changes
  7. ✅ Does not treat GS1 headers as part of GLYPH canonicalization

Next Steps

Patches

Learn how to create and apply incremental state updates

Fingerprinting

Understand SHA-256 hashing for state verification

Loose Mode

Explore schema-optional GLYPH for flexible data

Agent Patterns

See GS1 streaming in multi-agent systems
