Types and Schemas

BAML’s type system enables you to extract structured, validated data from LLMs. Every BAML function has a return type that defines the schema of the output.

Why Types Matter

BAML transforms prompt engineering into schema engineering. Instead of wrestling with string outputs, you define the structure you want and BAML handles:

Generating schema instructions for the LLM
Parsing and validating responses
Type-safe code generation in your target language
Flexible parsing that works even with imperfect LLM outputs

Primitive Types

BAML supports standard primitive types:

bool     // true or false
int      // 42, -10, 0
float    // 3.14, -0.5, 2.0
string   // "hello", "world"
null     // null value

Example

function GetTemperature(city: string) -> float {
  client GPT4o
  prompt #"
    What's the current temperature in {{ city }}?
    {{ ctx.output_format }}
  "#
}

Literal Types

Constrain primitives to specific values:

function ClassifyIssue(description: string) -> "bug" | "enhancement" {
  client GPT4o
  prompt #"
    Classify this issue:
    {{ description }}
    
    {{ ctx.output_format }}
  "#
}

The parsed output must be exactly "bug" or "enhancement"; BAML validates this when it parses the response.
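On the Python side, a literal union like this corresponds most naturally to typing.Literal. The following is a minimal sketch of that idea, not BAML's generated code: the names IssueKind and check_issue_kind are hypothetical and exist only for illustration.

```python
from typing import Literal, get_args

# Hypothetical Python-side mirror of the BAML return type "bug" | "enhancement".
IssueKind = Literal["bug", "enhancement"]


def check_issue_kind(value: str) -> IssueKind:
    """Return value unchanged if it is one of the allowed literals, else raise."""
    allowed = get_args(IssueKind)  # the tuple of permitted strings
    if value not in allowed:
        raise ValueError(f"expected one of {allowed}, got {value!r}")
    return value  # type: ignore[return-value]
```

A type checker will flag comparisons against strings outside the literal set, which is the main practical benefit over a plain string return type.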

Enums

For a fixed set of named values, use enums:

enum Category {
    Refund
    CancelOrder
    TechnicalSupport
    AccountIssue
    Question
}

function ClassifyMessage(input: string) -> Category {
  client GPT4o
  prompt #"
    Classify the following message into ONE of the categories:
    
    {{ ctx.output_format }}
    
    Message: {{ input }}
  "#
}

Enum with Descriptions

Add descriptions to help the LLM choose correctly:

enum Sentiment {
  Positive @description("Customer is happy or satisfied")
  Negative @description("Customer is unhappy or frustrated")
  Neutral @description("No clear emotional tone")
}

Enum Aliases

Map enum values to different string representations:

enum Status {
  Active @alias("active")
  Inactive @alias("inactive")
  Pending @alias("pending_review")
}

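The alias is the name the LLM sees in the prompt. When parsing, BAML maps the alias back to the enum value, so your code works with Pending rather than "pending_review".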
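One way to picture alias resolution is a lookup from alias strings to enum members. This is an illustrative sketch only, not BAML's parser: the ALIASES table and parse_status helper are hypothetical, and accepting the member's own name as a fallback is an assumption.

```python
from enum import Enum


class Status(Enum):
    Active = "Active"
    Inactive = "Inactive"
    Pending = "Pending"


# Alias strings copied from the BAML enum above.
ALIASES = {
    "active": Status.Active,
    "inactive": Status.Inactive,
    "pending_review": Status.Pending,
}


def parse_status(raw: str) -> Status:
    """Resolve an alias (or the member's own name) to a Status member."""
    text = raw.strip()
    if text in ALIASES:
        return ALIASES[text]
    try:
        return Status[text]  # lookup by member name, e.g. "Pending"
    except KeyError:
        raise ValueError(f"unrecognized status: {raw!r}") from None
```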

Classes

Classes define structured objects with named fields:

class Resume {
  name string
  email string?
  skills string[]
  education Education[]
}

class Education {
  school string
  degree string
  year int
}

Field Modifiers

Optional fields use ?:

class User {
  name string
  email string?  // May be null
  age int?
}

Array fields use []:

class Recipe {
  ingredients string[]
  steps string[]
  tags string[]?
}
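
To show how these modifiers read on the consuming side, here is a rough Python equivalent of the Recipe class. It is a sketch under assumptions: it uses a standard-library dataclass to stay dependency-free, and defaulting the optional field to None is a choice made for this example rather than documented BAML behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recipe:
    ingredients: List[str]             # BAML: string[]
    steps: List[str]                   # BAML: string[]
    tags: Optional[List[str]] = None   # BAML: string[]? (the whole list may be absent)


def tag_count(recipe: Recipe) -> int:
    """Count tags, treating a missing tag list as empty."""
    return len(recipe.tags or [])
```

Note that string[]? makes the list itself optional; the elements inside it are still required to be strings.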

Field Attributes

@description: Guide the LLM on what to extract

class Resume {
  skills string[] @description("Only include programming languages")
  education Education[] @description("Extract in the same order listed")
}

@alias: Accept alternative field names

class User {
  full_name string @alias("fullName") @alias("name")
  email_address string @alias("email")
}

Multimodal Types

BAML supports rich media inputs:

Image

function DescribeImage(img: image) -> string {
  client GPT4o
  prompt #"
    {{ _.role("user") }}
    Describe this image in detail:
    {{ img }}
  "#
}

Usage (Python):

from baml_py import Image
from baml_client import b

# From URL
result = b.DescribeImage(
  img=Image.from_url("https://example.com/photo.jpg")
)

# From base64 (base64_data is the base64-encoded image content as a string)
result = b.DescribeImage(
  img=Image.from_base64("image/png", base64_data)
)

Audio

function TranscribeAudio(audio: audio) -> string {
  client GPT4o
  prompt #"
    Transcribe this audio:
    {{ audio }}
  "#
}

Video

function DescribeVideo(clip: video) -> string {
  client GPT4o
  prompt #"
    Describe what happens in this video:
    {{ clip }}
  "#
}

PDF

function SummarizePDF(document: pdf) -> string {
  client GPT4o
  prompt #"
    Summarize this PDF document:
    {{ document }}
  "#
}

Arrays

Arrays work with any type:

function ExtractEmails(text: string) -> string[] {
  client GPT4o
  prompt #"
    Extract all email addresses from:
    {{ text }}
    {{ ctx.output_format }}
  "#
}

Nested arrays:

class Matrix {
  values float[][]
}

Unions

Return one of multiple types:

class Success {
  result string
}

class Error {
  error_message string
}

function ProcessRequest(input: string) -> Success | Error {
  client GPT4o
  prompt #"
    Process this request: {{ input }}
    {{ ctx.output_format }}
  "#
}

Use unions for:

Tool calling / function selection
Success/error responses
Multiple output formats
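
Calling code usually needs to branch on which member of the union came back. The sketch below shows one way to do that in Python with isinstance. The dataclasses stand in for the generated Success and Error types, and the describe helper is hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Success:
    result: str


@dataclass
class Error:
    error_message: str


def describe(outcome: Union[Success, Error]) -> str:
    """Turn either union member into a one-line summary."""
    if isinstance(outcome, Success):
        return f"ok: {outcome.result}"
    return f"failed: {outcome.error_message}"
```

The same pattern extends to tool calling: give each tool its own class and dispatch on the class of the returned value.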

Maps/Dictionaries

For key-value pairs:

function ExtractMetadata(text: string) -> map<string, string> {
  client GPT4o
  prompt #"
    Extract metadata as key-value pairs:
    {{ text }}
    {{ ctx.output_format }}
  "#
}

Nested Structures

BAML handles deeply nested schemas:

class Company {
  name string
  departments Department[]
}

class Department {
  name string
  employees Employee[]
  budget float
}

class Employee {
  name string
  role string
  skills string[]
}

function ExtractOrgChart(doc: string) -> Company {
  client GPT4o
  prompt #"
    Extract the organizational structure:
    {{ doc }}
    {{ ctx.output_format }}
  "#
}

BAML’s Schema-Aligned Parsing (SAP) handles complex nested outputs reliably.
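
Once parsed, a nested result is an ordinary object graph that you can walk with plain attribute access. The following sketch mirrors the three classes above with standard-library dataclasses (an assumption made to keep the example dependency-free), and skills_by_department is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Employee:
    name: str
    role: str
    skills: List[str]


@dataclass
class Department:
    name: str
    employees: List[Employee]
    budget: float


@dataclass
class Company:
    name: str
    departments: List[Department]


def skills_by_department(company: Company) -> Dict[str, List[str]]:
    """Map each department name to the sorted, de-duplicated skills of its employees."""
    return {
        dept.name: sorted({skill for emp in dept.employees for skill in emp.skills})
        for dept in company.departments
    }
```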

Dynamic Types

For types that need to be modified at runtime:

enum Category {
  Technology
  Business
  Science
  @@dynamic
}

class Product {
  name string
  category Category
}

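Marking a type with @@dynamic lets you extend it at runtime: new values for an enum, new fields for a class. In this example only Category is dynamic, so Product can take on new category values without any change to the BAML file. See the Dynamic Types guide.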

Type Validation

BAML validates outputs to ensure they match your schema:

Primitive types: Checks type correctness (int vs string, etc.)
Enums: Validates against allowed values
Classes: Verifies all required fields are present
Arrays: Ensures all elements match the expected type
Optionals: Allows null or missing values
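
The rules above can be sketched as a small recursive checker. This is an illustration of the idea, not BAML's implementation: the schema notation (tuples for optional, list, and enum; dicts for classes) is invented for this example, and accepting an int where a float is expected is an assumption.

```python
from typing import Any


def matches(value: Any, schema: Any) -> bool:
    """Check value against a schema written in a hypothetical mini-notation.

    int / float / str / bool  -> primitive
    ("optional", inner)       -> inner type or None
    ("list", inner)           -> list whose elements all match inner
    ("enum", {"A", "B"})      -> one of the allowed values
    {"field": schema, ...}    -> class with the given fields
    """
    if isinstance(schema, tuple):
        kind, inner = schema
        if kind == "optional":
            return value is None or matches(value, inner)
        if kind == "list":
            return isinstance(value, list) and all(matches(v, inner) for v in value)
        if kind == "enum":
            return value in inner
        raise ValueError(f"unknown schema kind: {kind}")
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return False
        # A missing key reads as None, so only optional fields may be absent.
        return all(matches(value.get(name), field) for name, field in schema.items())
    if schema is float:
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if schema is int:
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, schema)
```

For the User class shown earlier, the schema would be {"name": str, "email": ("optional", str), "age": ("optional", int)}: a dict with only a name passes, while one that omits name fails.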

Flexible Parsing

BAML’s Schema-Aligned Parsing (SAP) is more forgiving than strict JSON validation:

Handles markdown code blocks around JSON
Accepts chain-of-thought reasoning before the JSON
Tolerates minor formatting issues
Works with models that don’t support native tool calling

Because parsing does not depend on a model's native structured-output or tool-calling support, the same schemas can be used with newly released models right away.
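
To make the first two points concrete, here is a deliberately simplified sketch of lenient extraction: it scans a reply for the first position where a JSON object or array parses, skipping any reasoning text or code-fence markers around it. It is far less capable than SAP (it does not repair malformed JSON or coerce values to a schema), and extract_json is a hypothetical helper, not part of BAML.

```python
import json
from typing import Any


def extract_json(text: str) -> Any:
    """Return the first JSON object or array embedded anywhere in text."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from here; try the next candidate
        return value
    raise ValueError("no JSON object or array found")
```

A known limitation of this approach: bracketed text in the reasoning, such as a citation marker, can itself parse as a JSON array and be returned first. Aligning candidates against the expected schema is what avoids that kind of false match.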

Generated Types

BAML generates idiomatic types in your target language:

Python

# Generated as Pydantic models
from baml_client import b
from baml_client.types import Resume, Education

resume: Resume = b.ExtractResume(text)
print(resume.name)  # Type-safe attribute access
print(resume.education[0].school)

TypeScript

// Generated as TypeScript interfaces
import { b } from 'baml_client'
import { Resume, Education } from 'baml_client/types'

const resume: Resume = await b.ExtractResume(text)
console.log(resume.name)  // Type-safe property access
console.log(resume.education[0].school)

Ruby

# Generated as Ruby classes
resume = b.ExtractResume(resume_text: text)
puts resume.name
puts resume.education[0].school

Go

// Generated as Go structs
resume, err := b.ExtractResume(ctx, text, nil)
if err != nil {
  log.Fatal(err)  // Handle the error before reading the result
}
fmt.Println(resume.Name)
fmt.Println(resume.Education[0].School)

Best Practices

Use descriptions liberally: They guide the LLM and serve as documentation
Make optional what might be missing: Use ? for fields that may not exist
Prefer enums over string literals: When you have a known set of values
Keep schemas focused: Break complex extractions into multiple functions
Test with real data: Use the BAML playground to validate your schemas

Next Steps

Functions: Learn how functions use types
Prompts: Use types in your prompts
Testing: Test your type schemas
Type Reference: Complete type system reference
