
What are Lexicons?

Lexicons are AT Protocol’s schema definition language, similar to JSON Schema or OpenAPI. They define:
  • Record types - Structure of data stored in repositories
  • API endpoints - XRPC queries, procedures, and subscriptions
  • Validation rules - Type checking, constraints, and formats
Every record and API in AT Protocol is governed by a Lexicon schema.

Why Lexicons?

Interoperability

Lexicons ensure that different implementations of AT Protocol can understand each other’s data and APIs. A post created by one client can be read by any other client.

Type Safety

Schemas provide strong typing and validation, catching errors before they propagate through the network.

Versioning

Lexicons use namespaced identifiers, allowing multiple schemas to coexist and evolve independently.

Documentation

Schemas serve as machine-readable documentation for APIs and data structures.

Lexicon Structure

A Lexicon document is a JSON file with the following structure:
{
  "lexicon": 1,
  "id": "app.bsky.feed.post",
  "defs": {
    "main": {
      "type": "record",
      "description": "Record containing a Bluesky post.",
      "key": "tid",
      "record": {
        "type": "object",
        "required": ["text", "createdAt"],
        "properties": {
          "text": {
            "type": "string",
            "maxLength": 3000,
            "maxGraphemes": 300
          },
          "createdAt": {
            "type": "string",
            "format": "datetime"
          }
        }
      }
    }
  }
}
Key Components:
  • lexicon: Schema version (currently 1)
  • id: Unique identifier in reverse-DNS format
  • defs: Named definitions within the schema
  • main: The primary definition (used when referencing by ID alone)

Lexicon Identifiers

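Lexicons are identified by Namespaced Identifiers (NSIDs), which use reverse-DNS naming:
com.atproto.repo.createRecord    # AT Protocol core
app.bsky.feed.post               # Bluesky application
com.example.custom.widget        # Your custom schema
Format: {authority}.{name}
  • Authority: A domain you control, with its segments reversed (feed.bsky.app becomes app.bsky.feed). Subdomains give you hierarchical categorization.
  • Name: The final segment, which names the schema within that authority (post, createRecord, widget)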
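
To make the identifier structure concrete, here is a minimal sketch that splits an NSID into its domain authority and name. The `parseNsid` helper is hypothetical and only does basic string handling; it does not enforce the full NSID syntax rules.

```typescript
// Hypothetical helper: split an NSID into its domain authority and name.
// Sketch only; real NSID validation has more rules (length, allowed characters).
interface ParsedNsid {
  authority: string // domain in normal (non-reversed) order
  name: string // final segment of the NSID
}

function parseNsid(nsid: string): ParsedNsid {
  const segments = nsid.split('.')
  // Require at least a two-part domain plus a name, with no empty segments
  if (segments.length < 3 || segments.some((s) => s.length === 0)) {
    throw new Error(`Not a valid NSID: ${nsid}`)
  }
  const name = segments[segments.length - 1]
  const authority = segments.slice(0, -1).reverse().join('.')
  return { authority, name }
}

const parsedPost = parseNsid('app.bsky.feed.post')
console.log(parsedPost.authority) // feed.bsky.app
console.log(parsedPost.name) // post
```

Reversing the authority segments recovers the domain whose owner is responsible for the schema.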

Definition Types

Record Definitions

Define data that can be stored in repositories:
{
  "type": "record",
  "description": "A user profile",
  "key": "literal:self",
  "record": {
    "type": "object",
    "required": ["displayName"],
    "properties": {
      "displayName": {
        "type": "string",
        "maxGraphemes": 64
      },
      "description": {
        "type": "string",
        "maxGraphemes": 256
      },
      "avatar": {
        "type": "blob",
        "accept": ["image/png", "image/jpeg"],
        "maxSize": 1000000
      }
    }
  }
}
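
The `key` field controls how records of this type are addressed. As a rough sketch, the code below shows how a `literal:self` key leads to a fixed record address, while other key types need a key supplied by the caller. The helper names, the DID, and the `com.example.profile` collection are placeholders for illustration.

```typescript
// Sketch: relate a record definition's "key" to the record's AT-URI.
// The DID and collection NSID used below are placeholders.
function recordUri(did: string, collection: string, rkey: string): string {
  return `at://${did}/${collection}/${rkey}`
}

// Pick the record key for a given "key" value.
// Only "literal:<value>" is resolved here; other key types (such as "tid")
// need an explicit, caller-supplied key.
function rkeyFor(keyType: string, supplied?: string): string {
  if (keyType.startsWith('literal:')) {
    return keyType.slice('literal:'.length)
  }
  if (supplied === undefined) {
    throw new Error(`Key type "${keyType}" needs an explicit record key`)
  }
  return supplied
}

const profileUri = recordUri(
  'did:example:alice',
  'com.example.profile',
  rkeyFor('literal:self'),
)
console.log(profileUri) // at://did:example:alice/com.example.profile/self
```

Because the key is a literal, each repository can hold at most one record of this type.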

Query Definitions

Define read-only API endpoints:
{
  "type": "query",
  "description": "Get a user's profile",
  "parameters": {
    "type": "params",
    "required": ["actor"],
    "properties": {
      "actor": {
        "type": "string",
        "description": "Handle or DID of account"
      }
    }
  },
  "output": {
    "encoding": "application/json",
    "schema": {
      "type": "ref",
      "ref": "#profileView"
    }
  }
}
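
Queries are served over HTTP GET at `/xrpc/<NSID>`, with the declared parameters sent in the query string. The sketch below builds such a URL; the service host is a placeholder, and `app.bsky.actor.getProfile` is used as an example of a query that takes an `actor` parameter.

```typescript
// Sketch: build the HTTP GET URL for an XRPC query.
// The host is a placeholder; no request is actually sent.
function xrpcQueryUrl(
  service: string,
  nsid: string,
  params: Record<string, string | number | boolean>,
): string {
  const target = new URL(`/xrpc/${nsid}`, service)
  for (const [key, value] of Object.entries(params)) {
    target.searchParams.set(key, String(value))
  }
  return target.toString()
}

const profileQueryUrl = xrpcQueryUrl(
  'https://pds.example.com',
  'app.bsky.actor.getProfile',
  { actor: 'alice.example.com' },
)
console.log(profileQueryUrl)
// https://pds.example.com/xrpc/app.bsky.actor.getProfile?actor=alice.example.com
```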

Procedure Definitions

Define API endpoints that modify state:
{
  "type": "procedure",
  "description": "Create a new record",
  "input": {
    "encoding": "application/json",
    "schema": {
      "type": "object",
      "required": ["collection", "record"],
      "properties": {
        "collection": { "type": "string" },
        "rkey": { "type": "string" },
        "record": { "type": "unknown" }
      }
    }
  },
  "output": {
    "encoding": "application/json",
    "schema": {
      "type": "object",
      "required": ["uri", "cid"],
      "properties": {
        "uri": { "type": "string" },
        "cid": { "type": "string" }
      }
    }
  }
}
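
Procedures are sent as HTTP POST to `/xrpc/<NSID>`, with the input `encoding` used as the request's Content-Type. The following sketch only describes the request rather than sending it; the host, the `com.example.*` identifiers, and the `xrpcProcedureRequest` helper are assumptions made for illustration, and real calls usually also need an authorization header.

```typescript
// Sketch: describe the HTTP request for an XRPC procedure call.
// Nothing is sent; host and NSIDs are placeholders.
interface XrpcRequest {
  method: 'POST'
  url: string
  headers: Record<string, string>
  body: string
}

function xrpcProcedureRequest(
  service: string,
  nsid: string,
  encoding: string,
  input: unknown,
): XrpcRequest {
  return {
    method: 'POST',
    url: new URL(`/xrpc/${nsid}`, service).toString(),
    headers: { 'Content-Type': encoding }, // taken from input.encoding
    body: JSON.stringify(input),
  }
}

const createReq = xrpcProcedureRequest(
  'https://pds.example.com',
  'com.example.createRecord',
  'application/json',
  { collection: 'com.example.post', record: { text: 'Hello!' } },
)
console.log(createReq.method, createReq.url)
// POST https://pds.example.com/xrpc/com.example.createRecord
```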

Subscription Definitions

Define WebSocket event streams:
{
  "type": "subscription",
  "description": "Subscribe to repository updates",
  "parameters": {
    "type": "params",
    "properties": {
      "cursor": { "type": "integer" }
    }
  },
  "message": {
    "schema": {
      "type": "union",
      "refs": ["#commit", "#handle", "#tombstone"]
    }
  }
}

Using Lexicons in Code

The @atproto/lexicon package provides validation:
import { Lexicons } from '@atproto/lexicon'

// Create a lexicon collection
const lexicons = new Lexicons()

// Add lexicon documents
lexicons.add({
  lexicon: 1,
  id: 'com.example.post',
  defs: {
    main: {
      type: 'record',
      record: {
        type: 'object',
        required: ['text', 'createdAt'],
        properties: {
          text: { 
            type: 'string',
            maxLength: 300 
          },
          createdAt: { 
            type: 'string',
            format: 'datetime' 
          }
        }
      }
    }
  }
})

Validating Records

// Validate a record
try {
  lexicons.assertValidRecord(
    'com.example.post',
    {
      $type: 'com.example.post',
      text: 'Hello world!',
      createdAt: new Date().toISOString()
    }
  )
  console.log('✓ Record is valid')
} catch (error) {
  console.error('Validation failed:', error.message)
}

Validating XRPC Calls

// These calls assume the referenced lexicons have already been added to `lexicons`.

// Validate query parameters
lexicons.assertValidXrpcParams(
  'app.bsky.feed.getTimeline',
  { 
    limit: 50,
    cursor: 'abc123' 
  }
)

// Validate procedure input
lexicons.assertValidXrpcInput(
  'com.atproto.repo.createRecord',
  {
    repo: 'did:example:alice', // placeholder; the published schema also requires `repo`
    collection: 'app.bsky.feed.post',
    record: {
      $type: 'app.bsky.feed.post',
      text: 'Hello!',
      createdAt: new Date().toISOString()
    }
  }
)

// Validate query output
lexicons.assertValidXrpcOutput(
  'app.bsky.feed.getTimeline',
  {
    feed: [/* ... */] // feed items elided
  }
)

Data Types

Lexicons support rich data types:

Primitive Types

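{ "type": "string" }    // Unicode text
{ "type": "integer" }   // whole number
{ "type": "boolean" }   // true or false
{ "type": "unknown" }   // any object; its contents are not validated
Lexicon has no floating-point type, so "number" is not a valid type value.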

String Formats

{ "type": "string", "format": "datetime" }
Supported format values include:
  • datetime - ISO 8601 datetime
  • uri - URI
  • at-uri - AT Protocol URI
  • did - DID identifier
  • handle - Handle
  • at-identifier - Handle or DID
  • nsid - Namespaced ID
  • cid - Content ID
  • language - BCP 47 language tag

String Constraints

{
  "type": "string",
  "minLength": 1,
  "maxLength": 300,       // UTF-8 bytes
  "maxGraphemes": 64,     // Unicode graphemes
  "enum": ["public", "private", "unlisted"]
}
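
Because `maxLength` counts UTF-8 bytes and `maxGraphemes` counts user-perceived characters, the same string can pass one limit and fail the other. Here is a minimal sketch of both measurements using standard platform APIs; the helper names are illustrative and are not part of `@atproto/lexicon`.

```typescript
// Sketch: measure a string the two ways Lexicon string constraints do.
function utf8Length(text: string): number {
  return new TextEncoder().encode(text).length
}

function graphemeLength(text: string): number {
  const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' })
  return [...segmenter.segment(text)].length
}

function fitsLimits(text: string, maxLength: number, maxGraphemes: number): boolean {
  return utf8Length(text) <= maxLength && graphemeLength(text) <= maxGraphemes
}

// "h", e-acute, "llo", a space, and a waving-hand emoji
const sampleText = 'h\u00e9llo \u{1F44B}'
console.log(utf8Length(sampleText)) // 11
console.log(graphemeLength(sampleText)) // 7
console.log(sampleText.length) // 8 (UTF-16 code units, which Lexicon does not use)
console.log(fitsLimits(sampleText, 10, 64)) // false: 11 bytes exceeds 10
```

This is why the post schema pairs a generous byte limit with a tighter grapheme limit: multi-byte characters consume the byte budget faster than the visible character count suggests.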

Numeric Constraints

{
  "type": "integer",
  "minimum": 1,
  "maximum": 100,
  "enum": [1, 2, 5, 10]
}

Arrays

{
  "type": "array",
  "items": { 
    "type": "string" 
  },
  "minLength": 1,
  "maxLength": 10
}

Objects

{
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" }
  }
}

Blobs

For binary data like images:
{
  "type": "blob",
  "accept": ["image/png", "image/jpeg"],
  "maxSize": 1000000  // bytes
}
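
As an illustration of how `accept` and `maxSize` might be enforced before an upload, the sketch below checks a MIME type and byte size against a blob definition. It assumes `accept` entries are MIME types that may end in a `*` wildcard (such as `image/*`); the `BlobDef` type and helper names are hypothetical.

```typescript
// Sketch: check a file's MIME type and size against a blob definition.
interface BlobDef {
  accept?: string[]
  maxSize?: number // bytes
}

function mimeMatches(pattern: string, mimeType: string): boolean {
  if (pattern === '*/*') return true
  if (pattern.endsWith('/*')) {
    // "image/*" matches anything starting with "image/"
    return mimeType.startsWith(pattern.slice(0, -1))
  }
  return pattern === mimeType
}

function blobAllowed(def: BlobDef, mimeType: string, size: number): boolean {
  const typeOk =
    def.accept === undefined || def.accept.some((p) => mimeMatches(p, mimeType))
  const sizeOk = def.maxSize === undefined || size <= def.maxSize
  return typeOk && sizeOk
}

const avatarDef: BlobDef = { accept: ['image/png', 'image/jpeg'], maxSize: 1000000 }
console.log(blobAllowed(avatarDef, 'image/png', 250000)) // true
console.log(blobAllowed(avatarDef, 'image/gif', 250000)) // false
console.log(blobAllowed(avatarDef, 'image/png', 2000000)) // false
```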

References

Reference other definitions:
{
  "type": "ref",
  "ref": "app.bsky.actor.defs#profileView"
}
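
References appear in three shapes on this page: local (`#replyRef`), fully qualified (`app.bsky.actor.defs#profileView`), and a bare ID that points at the `main` definition. The sketch below normalizes all three to a single `<id>#<name>` form; `resolveRef` is a hypothetical helper that only does string handling.

```typescript
// Sketch: normalize a "ref" string to a fully qualified "<nsid>#<defName>" form.
function resolveRef(ref: string, currentLexiconId: string): string {
  if (ref.startsWith('#')) {
    return `${currentLexiconId}${ref}` // local reference within the same document
  }
  if (ref.includes('#')) {
    return ref // already fully qualified
  }
  return `${ref}#main` // bare ID refers to the main definition
}

console.log(resolveRef('#replyRef', 'app.bsky.feed.post'))
// app.bsky.feed.post#replyRef
console.log(resolveRef('app.bsky.actor.defs#profileView', 'app.bsky.feed.post'))
// app.bsky.actor.defs#profileView
console.log(resolveRef('com.atproto.repo.strongRef', 'app.bsky.feed.post'))
// com.atproto.repo.strongRef#main
```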

Unions

One of several types:
{
  "type": "union",
  "refs": [
    "app.bsky.embed.images",
    "app.bsky.embed.video",
    "app.bsky.embed.external"
  ]
}
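
Union members are told apart by the `$type` field on the value. A rough sketch of that lookup follows; `matchUnion` and `TypedObject` are hypothetical names, and the example treats a missing match as "unknown type" rather than an error, since Lexicon unions are open unless the schema marks them closed.

```typescript
// Sketch: find which union member a value claims to be, based on its $type.
interface TypedObject {
  $type: string
  [key: string]: unknown
}

// Treat "<id>#main" and "<id>" as the same reference
function stripMain(ref: string): string {
  const suffix = '#main'
  return ref.endsWith(suffix) ? ref.slice(0, ref.length - suffix.length) : ref
}

function matchUnion(value: TypedObject, refs: string[]): string | undefined {
  const wanted = stripMain(value.$type)
  return refs.find((ref) => stripMain(ref) === wanted)
}

const embedRefs = [
  'app.bsky.embed.images',
  'app.bsky.embed.video',
  'app.bsky.embed.external',
]

console.log(matchUnion({ $type: 'app.bsky.embed.images', images: [] }, embedRefs))
// app.bsky.embed.images
console.log(matchUnion({ $type: 'com.example.embed.poll' }, embedRefs))
// undefined
```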

Real-World Example: Post Record

Here’s a simplified version of the Lexicon for a Bluesky post (the published schema defines additional fields):
{
  "lexicon": 1,
  "id": "app.bsky.feed.post",
  "defs": {
    "main": {
      "type": "record",
      "description": "Record containing a Bluesky post.",
      "key": "tid",
      "record": {
        "type": "object",
        "required": ["text", "createdAt"],
        "properties": {
          "text": {
            "type": "string",
            "maxLength": 3000,
            "maxGraphemes": 300,
            "description": "The primary post content."
          },
          "facets": {
            "type": "array",
            "description": "Annotations of text (mentions, URLs, hashtags)",
            "items": { 
              "type": "ref", 
              "ref": "app.bsky.richtext.facet" 
            }
          },
          "reply": { 
            "type": "ref", 
            "ref": "#replyRef" 
          },
          "embed": {
            "type": "union",
            "refs": [
              "app.bsky.embed.images",
              "app.bsky.embed.video",
              "app.bsky.embed.external",
              "app.bsky.embed.record"
            ]
          },
          "langs": {
            "type": "array",
            "maxLength": 3,
            "items": { 
              "type": "string", 
              "format": "language" 
            }
          },
          "createdAt": {
            "type": "string",
            "format": "datetime"
          }
        }
      }
    },
    "replyRef": {
      "type": "object",
      "required": ["root", "parent"],
      "properties": {
        "root": { 
          "type": "ref", 
          "ref": "com.atproto.repo.strongRef" 
        },
        "parent": { 
          "type": "ref", 
          "ref": "com.atproto.repo.strongRef" 
        }
      }
    }
  }
}
Usage:
import { lexicons } from '@atproto/api'

// Validate post before creating
lexicons.assertValidRecord(
  'app.bsky.feed.post',
  {
    $type: 'app.bsky.feed.post',
    text: 'Check out this amazing AT Protocol feature!',
    langs: ['en'],
    createdAt: new Date().toISOString()
  }
)

Managing Lexicons

Adding Lexicons

import { Lexicons } from '@atproto/lexicon'

const lex = new Lexicons()

// Add individual lexicon
lex.add(lexiconDoc)

// Add multiple lexicons
for (const doc of lexiconDocs) {
  lex.add(doc)
}

Retrieving Lexicons

// Get entire lexicon document
const doc = lex.get('app.bsky.feed.post')

// Get specific definition
const def = lex.getDef('app.bsky.feed.post#replyRef')

// Get with type checking
const recordDef = lex.getDefOrThrow(
  'app.bsky.feed.post',
  ['record']  // Ensure it's a record type
)

Removing Lexicons

lex.remove('com.example.old.schema')

Validation Results

For non-throwing validation:
const result = lex.validate(
  'app.bsky.feed.post',
  recordData
)

if (!result.success) {
  console.error('Validation errors:', result.error)
}
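
A result object with a `success` flag works well as a discriminated union in TypeScript. The sketch below models that shape locally so it is self-contained; the `Result` type and `partitionResults` helper are assumptions for illustration, not exports of `@atproto/lexicon`.

```typescript
// Local model of a non-throwing validation result: a success flag plus
// either the validated value or an error. Declared here for the sketch only.
type Result<T> =
  | { success: true; value: T }
  | { success: false; error: Error }

// Split a batch of results into accepted values and error messages.
function partitionResults<T>(results: Result<T>[]): { values: T[]; errors: string[] } {
  const values: T[] = []
  const errors: string[] = []
  for (const result of results) {
    if (result.success) {
      values.push(result.value) // narrowed to the success branch
    } else {
      errors.push(result.error.message) // narrowed to the failure branch
    }
  }
  return { values, errors }
}

const batch: Result<{ text: string }>[] = [
  { success: true, value: { text: 'Hello!' } },
  { success: false, error: new Error('Record must have the property "text"') },
]
const sorted = partitionResults(batch)
console.log(sorted.values.length, sorted.errors.length) // 1 1
```

Checking `success` first lets the compiler narrow the type, so `error` is only reachable on the failure branch.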

Best Practices

  • Always use reverse-DNS naming to avoid collisions. Use a domain you control.
  • Validate data as soon as it enters your system to catch errors early.
  • Use constraints like maxLength, maxGraphemes, and required to enforce data quality.
  • Define common types once and reference them to maintain consistency.
  • Include clear description fields - they serve as API documentation.

Error Handling

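import { 
  ValidationError,
  LexiconDefNotFoundError,
  InvalidLexiconError 
} from '@atproto/lexicon'

// Assumes `lexicons` is a populated Lexicons instance and `data` is the record to check
try {
  lexicons.assertValidRecord('app.bsky.feed.post', data)
} catch (error) {
  if (error instanceof ValidationError) {
    console.error('Invalid data:', error.message)
  } else if (error instanceof LexiconDefNotFoundError) {
    console.error('Schema not found:', error.message)
  } else if (error instanceof InvalidLexiconError) {
    console.error('Schema is invalid or of the wrong type:', error.message)
  } else {
    console.error('Unexpected error:', error)
  }
}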

Additional Resources

@atproto/lexicon Package

NPM package documentation

Lexicon Specification

Official Lexicon language specification

Lexicon Repository

Browse official AT Protocol and Bluesky lexicons

Schema Validator

Online Lexicon validator tool
