The Schema class manages table definitions, column types, and data validation rules for your data pipeline. It handles schema evolution and normalization, and ensures data integrity.
Creating a Schema
Schemas are typically created automatically by dlt, but can also be created explicitly.
Properties
name
The name of the schema.
Type: str - Schema name
Source: ~/workspace/source/dlt/common/schema/schema.py:654
tables
Dictionary of all tables in the schema.
Type: TSchemaTables - Dictionary of table schemas
Source: ~/workspace/source/dlt/common/schema/schema.py:658
version
Current version of the schema content, incremented when the content is modified.
Type: int - Current schema version
Source: ~/workspace/source/dlt/common/schema/schema.py:606
stored_version
Version of the schema at the time it was loaded or created.
Type: int - Stored schema version
Source: ~/workspace/source/dlt/common/schema/schema.py:617
version_hash
Current version hash, computed from the schema content.
Type: str - Schema version hash
Source: ~/workspace/source/dlt/common/schema/schema.py:626
is_modified
Whether the schema has been modified since it was last saved.
Type: bool - True if schema is modified
Source: ~/workspace/source/dlt/common/schema/schema.py:640
is_new
Whether the schema has never been saved.
Type: bool - True if schema is new
Source: ~/workspace/source/dlt/common/schema/schema.py:648
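The relationship between version, stored_version, version_hash, is_modified, and is_new can be illustrated with a small stand-in model. This is a sketch under stated assumptions, not dlt's implementation: the class name VersionedContent, the save() method, and the choice of SHA-256 over sorted JSON are all hypothetical, and dlt's actual hashing and version-bump timing may differ.

```python
import hashlib
import json
from typing import Any, Dict, Optional


class VersionedContent:
    """Toy stand-in for a schema's version bookkeeping (not dlt's implementation)."""

    def __init__(self, content: Dict[str, Any]) -> None:
        self.content = content
        self.version = 1
        # Nothing has been saved yet, so there is no stored version or hash.
        self.stored_version: Optional[int] = None
        self._stored_hash: Optional[str] = None

    @property
    def version_hash(self) -> str:
        # Hash a canonical JSON rendering so key order does not affect the result.
        payload = json.dumps(self.content, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    @property
    def is_new(self) -> bool:
        return self.stored_version is None

    @property
    def is_modified(self) -> bool:
        return self.version_hash != self._stored_hash

    def save(self) -> None:
        # Bump the version only if the content changed since the previous save.
        if self._stored_hash is not None and self.is_modified:
            self.version += 1
        self.stored_version = self.version
        self._stored_hash = self.version_hash
```

In this model a freshly created object is both new and modified; after save() it is neither, and any later change to the content flips is_modified back to True until the next save.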
settings
Schema settings, including default hints and preferred types.
Type: TSchemaSettings - Schema settings dictionary
Source: ~/workspace/source/dlt/common/schema/schema.py:715
naming
Naming convention used by the schema to normalize identifiers.
Type: NamingConvention - Naming convention instance
Methods
get_table()
Gets a table schema by name.
Parameter: Name of the table to retrieve.
Returns: TTableSchema - Table schema dictionary
Raises: TableNotFound - If the table doesn’t exist
Source: ~/workspace/source/dlt/common/schema/schema.py:537
get_table_columns()
Gets the columns of a table, optionally including incomplete columns.
Parameter: Name of the table.
Parameter: Whether to include columns without a data type.
Returns: TTableSchemaColumns - Dictionary of column schemas
Source: ~/workspace/source/dlt/common/schema/schema.py:543
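The distinction between complete and incomplete columns can be sketched with plain dictionaries. This is a minimal illustration, assuming a column counts as incomplete when it has no data_type entry and that incomplete columns are excluded by default; the helper name select_columns is hypothetical and is not part of dlt.

```python
from typing import Any, Dict

Columns = Dict[str, Dict[str, Any]]


def select_columns(columns: Columns, include_incomplete: bool = False) -> Columns:
    """Return all columns, or only those that declare a data type."""
    if include_incomplete:
        return dict(columns)
    return {name: col for name, col in columns.items() if col.get("data_type")}


columns: Columns = {
    "id": {"name": "id", "data_type": "bigint"},
    "note": {"name": "note"},  # no data type yet, so treated as incomplete
}
complete_only = select_columns(columns)
everything = select_columns(columns, include_incomplete=True)
```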
update_table()
Adds or merges a partial table schema into the schema.
Parameter: Table schema to add or merge.
Parameter: Whether to normalize identifiers using the naming convention.
Parameter: If True, partial_table is treated as a diff to apply directly.
Parameter: If False, compound properties replace rather than merge.
Returns: TPartialTableSchema - The applied partial table
Source: ~/workspace/source/dlt/common/schema/schema.py:350
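The "add or merge" behavior can be pictured with a toy model over plain dictionaries. This is only a sketch: merge_partial_table is a hypothetical helper, it ignores identifier normalization and the diff and compound-property options, and the real merge rules in dlt are more involved.

```python
import copy
from typing import Any, Dict

Table = Dict[str, Any]


def merge_partial_table(tables: Dict[str, Table], partial: Table) -> Table:
    """Add `partial` as a new table, or merge its columns into an existing one."""
    name = partial["name"]
    if name not in tables:
        tables[name] = copy.deepcopy(partial)
        return tables[name]
    existing = tables[name]
    for col_name, col in partial.get("columns", {}).items():
        # Hints from the partial table override hints already stored for the column.
        existing.setdefault("columns", {}).setdefault(col_name, {}).update(col)
    return existing


tables: Dict[str, Table] = {}
merge_partial_table(tables, {"name": "users", "columns": {"id": {"data_type": "bigint"}}})
merge_partial_table(
    tables,
    {"name": "users", "columns": {"id": {"nullable": False}, "email": {"data_type": "text"}}},
)
```

After the second call the users table holds both columns, and the id column carries the hints from both partial tables.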
update_schema()
Updates this schema from an incoming schema.
Parameter: Schema to merge from.
drop_tables()
Drops tables from the schema.
Parameter: List of table names to drop. Must include all nested tables.
Returns: List[TTableSchema] - List of dropped table schemas
Source: ~/workspace/source/dlt/common/schema/schema.py:420
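Because the list of names must include all nested tables, it can help to collect a table's descendants before dropping it. The following is a minimal sketch over plain dictionaries, assuming each nested table records its parent under a "parent" key and that the parent links form a tree; with_nested_tables is a hypothetical helper, not a dlt function.

```python
from typing import Any, Dict, List

Tables = Dict[str, Dict[str, Any]]


def with_nested_tables(tables: Tables, root: str) -> List[str]:
    """Return `root` followed by every table that descends from it via `parent`."""
    result = [root]
    # The list grows while we iterate over it, which gives a breadth-first walk.
    for name in result:
        children = [
            child
            for child, table in tables.items()
            if table.get("parent") == name and child not in result
        ]
        result.extend(children)
    return result


tables: Tables = {
    "orders": {},
    "orders__items": {"parent": "orders"},
    "orders__items__tags": {"parent": "orders__items"},
    "users": {},
}
to_drop = with_nested_tables(tables, "orders")
```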
data_tables()
Gets a list of all data tables (excludes dlt internal tables).
Parameter: Only include tables that have seen data.
Parameter: Include tables without columns.
Returns: List[TTableSchema] - List of data table schemas
Source: ~/workspace/source/dlt/common/schema/schema.py:553
data_table_names()
Returns a list of data table names.
Parameter: Only include tables that have seen data.
Parameter: Include incomplete tables.
Returns: List[str] - List of table names
Source: ~/workspace/source/dlt/common/schema/schema.py:570
dlt_tables()
Gets dlt internal tables.
Returns: List[TTableSchema] - List of dlt table schemas
Source: ~/workspace/source/dlt/common/schema/schema.py:581
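The split between data tables and dlt internal tables can be sketched as a simple partition by name. This illustration assumes internal tables are recognized by a "_dlt" name prefix; split_tables and DLT_PREFIX are hypothetical names, and the seen-data and incomplete-table filters are not modeled.

```python
from typing import Any, Dict, List

Tables = Dict[str, Dict[str, Any]]
DLT_PREFIX = "_dlt"  # assumed prefix for internal tables


def split_tables(tables: Tables) -> Dict[str, List[str]]:
    """Partition table names into data tables and internal tables."""
    groups: Dict[str, List[str]] = {"data": [], "dlt": []}
    for name in tables:
        groups["dlt" if name.startswith(DLT_PREFIX) else "data"].append(name)
    return groups


tables: Tables = {"users": {}, "_dlt_loads": {}, "orders": {}, "_dlt_version": {}}
groups = split_tables(tables)
```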
clone()
Creates a deep copy of the schema.
Parameter: New name for the cloned schema.
Parameter: Remove processing markers (x-normalizer, x-loader hints).
Parameter: Update normalizers and identifiers in the cloned schema.
Returns: Schema - Cloned schema
Source: ~/workspace/source/dlt/common/schema/schema.py:898
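What a deep copy with optional renaming and marker removal means can be shown on a dictionary stand-in. This is a sketch, not dlt's code: clone_schema and its parameter names are hypothetical, the schema is modeled as a plain dict, and normalizer updates are left out.

```python
import copy
from typing import Any, Dict, Optional

PROCESSING_HINTS = ("x-normalizer", "x-loader")


def clone_schema(
    schema: Dict[str, Any],
    with_name: Optional[str] = None,
    remove_processing_hints: bool = False,
) -> Dict[str, Any]:
    """Deep-copy a schema dict, optionally renaming it and stripping processing markers."""
    cloned = copy.deepcopy(schema)
    if with_name is not None:
        cloned["name"] = with_name
    if remove_processing_hints:
        for table in cloned["tables"].values():
            for hint in PROCESSING_HINTS:
                table.pop(hint, None)
    return cloned


original = {
    "name": "events",
    "tables": {"clicks": {"columns": {}, "x-normalizer": {"seen-data": True}}},
}
cloned = clone_schema(original, with_name="events_copy", remove_processing_hints=True)
```

Because the copy is deep, changes to the clone's tables leave the original untouched.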
set_schema_contract()
Sets schema contract settings.
Parameter: Contract settings, or None to remove them.
to_dict()
Converts the schema to a dictionary representation.
Parameter: Remove default values from output.
Parameter: Remove processing hints.
Parameter: Increment version if modified.
Returns: TStoredSchema - Schema as dictionary
Source: ~/workspace/source/dlt/common/schema/schema.py:739
to_pretty_json()
Converts the schema to a formatted JSON string.
Parameter: Remove default values.
Parameter: Remove processing hints.
Returns: str - Pretty-printed JSON
Source: ~/workspace/source/dlt/common/schema/schema.py:774
to_pretty_yaml()
Converts the schema to a formatted YAML string.
Parameter: Remove default values.
Parameter: Remove processing hints.
Returns: str - Pretty-printed YAML
Source: ~/workspace/source/dlt/common/schema/schema.py:782
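The effect of removing default values before pretty-printing can be sketched with the standard json module. This is an illustration only: columns_to_pretty_json is a hypothetical helper, and the COLUMN_DEFAULTS table is an assumption made for the example rather than dlt's actual set of defaults.

```python
import json
from typing import Any, Dict

COLUMN_DEFAULTS = {"nullable": True}  # hypothetical defaults, for illustration only


def columns_to_pretty_json(
    columns: Dict[str, Dict[str, Any]], remove_defaults: bool = True
) -> str:
    """Render columns as indented JSON, optionally dropping hints equal to a default."""
    out: Dict[str, Dict[str, Any]] = {}
    for name, col in columns.items():
        out[name] = {
            key: value
            for key, value in col.items()
            if not (
                remove_defaults
                and key in COLUMN_DEFAULTS
                and COLUMN_DEFAULTS[key] == value
            )
        }
    return json.dumps(out, indent=2, sort_keys=True)


columns = {
    "id": {"data_type": "bigint", "nullable": True},
    "note": {"data_type": "text", "nullable": False},
}
compact = columns_to_pretty_json(columns)
full = columns_to_pretty_json(columns, remove_defaults=False)
```

Dropping defaults keeps the serialized form short while preserving every hint that deviates from the default.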