Flyte is a data-aware orchestration platform where types are central to everything. Every task input and output is typed. Flytekit automatically translates Python type hints into Flyte’s internal type system, enabling:
  • Data lineage — track exactly which data produced which outputs
  • Memoization — cache task results and skip re-computation when inputs match
  • Auto parallelization — Flyte can reason about data dependencies and schedule tasks optimally
  • Simplified data access — automatic upload and download between local execution and cloud storage
  • Auto-generated CLI and launch UI — the Flyte console and pyflyte run use type information to generate input forms

Python to Flyte type mapping

Flytekit automatically translates most Python types into Flyte types at compile time. The following table summarizes the mappings:
| Python type | Flyte type | Conversion | Notes |
| --- | --- | --- | --- |
| int | Integer | Automatic | Use Python 3 type hints |
| float | Float | Automatic | Use Python 3 type hints |
| str | String | Automatic | Use Python 3 type hints |
| bool | Boolean | Automatic | Use Python 3 type hints |
| bytes / bytearray | Binary | Not supported | Use a custom type transformer |
| complex | N/A | Not supported | Use a custom type transformer |
| datetime.timedelta | Duration | Automatic | Use Python 3 type hints |
| datetime.datetime | Datetime | Automatic | Use Python 3 type hints |
| datetime.date | Datetime | Automatic | Use Python 3 type hints |
| typing.List[T] / list[T] | Collection[T] | Automatic | T can be any supported type |
| typing.Iterator[T] | Collection[T] | Automatic | T can be any supported type |
| typing.Dict[str, V] / dict[str, V] | Map[str, V] | Automatic | V can be any supported type, including nested dicts |
| dict | Binary | Automatic | Uses MessagePack serialization since flytekit 1.14 |
| File / os.PathLike | FlyteFile | Automatic | Defaults to binary protocol; use FlyteFile["jpg"] to specify format |
| Directory | FlyteDirectory | Automatic | Use FlyteDirectory["protocol"] to specify file format |
| @dataclass | Binary | Automatic | All fields must be annotated; uses MessagePack since flytekit 1.14 |
| np.ndarray | File | Automatic | Use np.ndarray as the type hint |
| pandas.DataFrame | StructuredDataset | Automatic | Column types are not preserved in the simple form |
| polars.DataFrame | StructuredDataset | Automatic | Column types are not preserved in the simple form |
| polars.LazyFrame | StructuredDataset | Automatic | Column types are not preserved in the simple form |
| pyspark.DataFrame | StructuredDataset | Requires flytekitplugins-spark | Use pyspark.DataFrame as the type hint |
| pydantic.BaseModel | Binary | Automatic | Pydantic v2 supported in flytekit ≥ 1.14 |
| torch.Tensor / torch.nn.Module | File | Requires torch | Derived types also supported |
| tf.keras.Model | File | Requires tensorflow | Derived types also supported |
| sklearn.base.BaseEstimator | File | Requires scikit-learn | Derived types also supported |
| User-defined types | Any | Custom transformers | FlytePickle is the default fallback |

Primitive types

The simplest types map directly. Flytekit reads the Python function signature and generates a typed interface automatically:
from flytekit import task, workflow
import datetime

@task
def compute_stats(
    count: int,
    threshold: float,
    label: str,
    active: bool,
    recorded_at: datetime.datetime,
) -> str:
    return f"{label}: {count} samples above {threshold} at {recorded_at}"

@workflow
def stats_wf() -> str:
    return compute_stats(
        count=42,
        threshold=0.95,
        label="experiment-1",
        active=True,
        recorded_at=datetime.datetime(2024, 1, 15, 9, 0, 0),
    )

Collection types

Use typing.List (or list) and typing.Dict (or dict) to pass collections between tasks. Flyte maps these to Collection[T] and Map[str, V] in the IDL:
import typing
from flytekit import task, workflow

@task
def process_batch(items: typing.List[str], config: typing.Dict[str, float]) -> typing.List[float]:
    threshold = config.get("threshold", 0.5)
    return [float(len(item)) * threshold for item in items]

@workflow
def batch_wf() -> typing.List[float]:
    return process_batch(
        items=["apple", "banana", "cherry"],
        config={"threshold": 1.5},
    )

Dataclasses

When you need to pass multiple values together as a structured unit, use Python’s @dataclass decorator. Since flytekit 1.14, dataclasses are serialized with MessagePack instead of Protobuf struct, which preserves integer types without conversion to float:
from dataclasses import dataclass
from flytekit import task, workflow

@dataclass
class ModelConfig:
    learning_rate: float
    num_epochs: int
    model_name: str

@dataclass
class TrainingResult:
    accuracy: float
    loss: float
    config: ModelConfig

@task
def train_model(config: ModelConfig) -> TrainingResult:
    # Simulate training
    return TrainingResult(
        accuracy=0.92,
        loss=0.08,
        config=config,
    )

@workflow
def training_wf() -> TrainingResult:
    config = ModelConfig(learning_rate=0.001, num_epochs=10, model_name="resnet50")
    return train_model(config=config)
All fields in a dataclass must be annotated with their type. Missing type annotations will cause a runtime error.
If you are using flytekit < 1.11.1, add from dataclasses_json import dataclass_json to your imports and decorate your dataclass with @dataclass_json.

Enum types

Limit acceptable values to a predefined set using Python’s enum.Enum. Flytekit constrains task inputs and outputs to the declared values:
import enum
from flytekit import task, workflow

class CoffeeOrder(enum.Enum):
    ESPRESSO = "espresso"
    LATTE = "latte"
    CAPPUCCINO = "cappuccino"

@task
def brew(order: CoffeeOrder) -> str:
    return f"Brewing {order.value}"

@workflow
def coffee_wf(order: CoffeeOrder = CoffeeOrder.LATTE) -> str:
    return brew(order=order)
Only string values are supported as valid enum values. Flyte treats the first value in the list as the default. Enum types cannot be optional.
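These constraints follow plain enum.Enum semantics, which you can verify without Flyte at all:

```python
import enum

class CoffeeOrder(enum.Enum):
    ESPRESSO = "espresso"
    LATTE = "latte"
    CAPPUCCINO = "cappuccino"

# Flyte treats the first declared member as the default
default_order = list(CoffeeOrder)[0]
assert default_order is CoffeeOrder.ESPRESSO

# The console and pyflyte run look members up by their string value
assert CoffeeOrder("latte") is CoffeeOrder.LATTE
```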

How Flyte represents types internally

Flyte’s type system is defined in FlyteIDL, the Protobuf-based interface definition language shared by all Flyte components. The core LiteralType message covers the full type hierarchy:
  • Primitive types (SimpleType): INTEGER, FLOAT, STRING, BOOLEAN, DATETIME, DURATION, BINARY
  • Blob types (BlobType): single-part (SINGLE) for FlyteFile, multi-part (MULTIPART) for FlyteDirectory
  • StructuredDataset (StructuredDatasetType): columns with names and types, plus storage format
  • Collection types: homogeneous lists via collection_type
  • Map types: string-keyed maps via map_value_type
  • Enum types (EnumType): predefined string values
  • Union types (UnionType): tagged unions across multiple LiteralType variants
At runtime, literals carry a hash field used by DataCatalog for memoization. If the hash of inputs matches a previously cached execution, Flyte skips re-running the task entirely.
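The memoization step can be pictured as deriving a stable key from the task identity, its cache version, and its input literals. This is only a conceptual sketch in stdlib Python, not Flyte's actual hashing scheme:

```python
import hashlib
import json

def cache_key(task_name: str, cache_version: str, inputs: dict) -> str:
    # Serialize inputs deterministically so equal inputs hash equally
    payload = json.dumps(inputs, sort_keys=True)
    raw = f"{task_name}:{cache_version}:{payload}".encode()
    return hashlib.sha256(raw).hexdigest()

k1 = cache_key("compute_stats", "1.0", {"count": 42, "threshold": 0.95})
k2 = cache_key("compute_stats", "1.0", {"threshold": 0.95, "count": 42})
assert k1 == k2   # same inputs in any order -> cache hit

k3 = cache_key("compute_stats", "1.0", {"count": 43, "threshold": 0.95})
assert k1 != k3   # different inputs -> re-execute
```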

Explore the type system in depth

FlyteFile

Manage individual remote files as typed task inputs and outputs. Flyte handles upload, download, and blob storage automatically.

FlyteDirectory

Work with entire directories of files — useful for model checkpoints, dataset splits, and multi-file artifacts.

StructuredDataset

Abstract tabular type that converts between pandas, Spark, Arrow, and other dataframe libraries with optional column-level type checking.

Dataclass

Combine multiple values into a single typed structure for passing complex configurations between tasks.
