Overview
The @dlt.transformer decorator is a specialized form of @dlt.resource that takes input from other resources via the data_from argument in order to enrich or transform the data. Transformers enable you to build multi-step data pipelines in which one resource processes the output of another.
Signature
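The exact signature depends on the installed dlt version; the stub below is a hedged sketch of the decorator's keyword surface reconstructed from the parameter descriptions, not dlt's actual source. Consult the dlt API reference for the authoritative declaration.

```python
from typing import Any, Callable, Optional

# Hedged sketch of @dlt.transformer's parameters, NOT dlt's actual source.
def transformer(
    f: Optional[Callable[..., Any]] = None,
    data_from: Any = None,
    name: Optional[str] = None,
    table_name: Any = None,
    max_table_nesting: Optional[int] = None,
    write_disposition: Any = None,
    columns: Any = None,
    primary_key: Any = None,
    merge_key: Any = None,
    schema_contract: Any = None,
    table_format: Optional[str] = None,
    file_format: Optional[str] = None,
    references: Any = None,
    nested_hints: Any = None,
    selected: bool = True,
    spec: Any = None,
    parallelized: bool = False,
    section: Optional[str] = None,
) -> Any:
    ...
```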
Parameters
- f: A function taking at minimum one argument of TDataItems type, which receives the data yielded from the data_from resource. The function can also accept a meta argument to receive metadata associated with the data item.
- data_from: A resource that will send data to the decorated function. It can be specified when the transformer is created or bound later using the | operator.
- name: A name of the resource that by default also becomes the name of the table to which the data is loaded. If not present, the name of the decorated function is used.
- table_name: A table name, if different from name. This argument also accepts a callable that is used to dynamically create tables for stream-like resources yielding many data types.
- max_table_nesting: A schema hint that sets the maximum depth of nested tables, above which the remaining nodes are loaded as structs or JSON.
- write_disposition: Controls how to write data to a table. Options: append, replace, skip, or merge. This argument also accepts a callable that is used to dynamically create tables for stream-like resources yielding many data types.
- columns: A list, dict, or Pydantic model of column schemas: a typed dictionary describing column names, data types, write disposition, and performance hints that gives you full control over the created table schema. This argument also accepts a callable that is used to dynamically create tables for stream-like resources yielding many data types.
- primary_key: A column name or a list of column names that comprise a primary key. Typically used with the "merge" write disposition to deduplicate loaded data. This argument also accepts a callable that is used to dynamically create tables for stream-like resources yielding many data types.
- merge_key: A column name or a list of column names that define a merge key. Typically used with the "merge" write disposition to remove overlapping data ranges. This argument also accepts a callable that is used to dynamically create tables for stream-like resources yielding many data types.
- schema_contract: Schema contract settings that will be applied to this resource.
- table_format: Defines the storage format of the table. Currently only "iceberg" is supported on Athena, and "delta" on the filesystem. Other destinations ignore this hint.
- file_format: Format of the file in which resource data is stored. Useful when importing external files. Use "preferred" to force a file format that is preferred by the destination used.
- references: A list of references to other tables' columns.
- nested_hints: Hints for nested tables created by this resource.
- selected: When True, the dlt pipeline will extract and load this resource. If False, the resource is ignored.
- spec: A specification of configuration and secret values required by the transformer.
- parallelized: When True, the resource will be loaded in parallel with other resources.
- section: Configuration section that comes right after "sources" in the default layout. If not present, the current Python module name is used. The default layout is sources.<section>.<name>.<key_name>.
- standalone: Deprecated. Its past functionality was merged into the regular resource.
Returns
A DltResource instance, which may be loaded, iterated, or combined with other resources into a pipeline.
Usage Examples
Basic Transformer with Pipe Operator
Early Binding with data_from
Transformer with Metadata
Transformer with Configuration
Chaining Multiple Transformers
Transformer that Filters Data
Transformer with Merge Disposition
Transformer that Expands Data
Key Differences from @dlt.resource
- Data Input: Transformers receive data from other resources via the first argument, while regular resources generate their own data
- Function Signature: Transformer functions must accept at least one argument to receive input data items
- Binding: Transformers can be bound to resources either early (via the data_from parameter) or late (via the | operator)
- Use Case: Transformers are designed for data enrichment and transformation workflows