Flyte uses object storage for all task inputs, outputs, intermediate data, and workflow metadata. Storage is configured via the stow library, which provides a unified interface over multiple cloud backends.

How Flyte uses storage

| Data type | Storage path | Who writes it |
|---|---|---|
| Workflow metadata (launch plans, executions) | metadataContainer | FlyteAdmin |
| Task input/output literals | userDataContainer | FlyteCopilot sidecar |
| Large datasets (offloaded literals) | userDataContainer | FlytePropeller |
| Cached task outputs | userDataContainer | DataCatalog |
The metadataContainer and userDataContainer can point to the same bucket. Using separate buckets allows independent lifecycle policies.
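For example, a single bucket can serve both roles (a sketch; the bucket name here is a placeholder):

```yaml
configuration:
  storage:
    # Both containers point at one bucket; metadata and user data
    # then share that bucket's lifecycle policy.
    metadataContainer: my-flyte-bucket
    userDataContainer: my-flyte-bucket
```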

Configuring storage backends

S3 with an IAM role (IRSA)

configuration:
  storage:
    metadataContainer: my-flyte-metadata
    userDataContainer: my-flyte-userdata
    provider: s3
    providerConfig:
      s3:
        region: "us-east-1"
        authType: "iam"    # Uses pod IAM role / IRSA — no static keys

S3 with static access keys

configuration:
  storage:
    metadataContainer: my-flyte-metadata
    userDataContainer: my-flyte-userdata
    provider: s3
    providerConfig:
      s3:
        region: "us-east-1"
        authType: "accesskey"
        accessKey: "<ACCESS_KEY_ID>"
        secretKey: "<SECRET_ACCESS_KEY>"

Inline stow config (for advanced options)

configuration:
  inline:
    storage:
      type: stow
      stow:
        kind: s3
        config:
          region: us-east-1
          auth_type: iam
      container: my-flyte-bucket
      limits:
        maxDownloadMBs: 1000

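The same inline stow form covers other backends. As an illustration, here is a sketch for GCS, assuming stow's google kind with a service-account JSON key; the config key names follow the stow google adapter and should be checked against your Flyte version:

```yaml
configuration:
  inline:
    storage:
      type: stow
      stow:
        kind: google            # stow's GCS adapter (assumed kind name)
        config:
          json: |               # service-account key JSON, inline
            { "type": "service_account", ... }
          project_id: my-gcp-project
      container: my-flyte-bucket
```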
Signed URLs

Flyte generates pre-signed URLs for the FlyteConsole to let users download task output files directly from the object store. The remoteData config controls how these URLs are created:
configuration:
  inline:
    remoteData:
      region: us-east-1
      scheme: aws      # aws, gcs, or azure
      signedUrls:
        durationMinutes: 3
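Conceptually, a pre-signed URL is the object URL plus an expiry timestamp and a signature the store can verify without consulting the issuer. A toy Python sketch of that idea follows; it is illustrative only, not Flyte's or any cloud provider's actual signing scheme:

```python
import hashlib
import hmac
import time

def presign(url: str, secret: bytes, duration_minutes: int = 3) -> str:
    """Append an expiry and an HMAC signature to a URL (illustrative only)."""
    expires = int(time.time()) + duration_minutes * 60
    payload = f"{url}?expires={expires}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}&signature={sig}"

def verify(signed_url: str, secret: bytes) -> bool:
    """Check the signature and that the URL has not yet expired."""
    payload, _, sig = signed_url.rpartition("&signature=")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    expires = int(payload.rsplit("expires=", 1)[1])
    return hmac.compare_digest(sig, expected) and time.time() < expires
```

Anyone holding the URL can use it until the expiry passes, which is why short durations like 3 minutes are a sensible default for console downloads.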

Cache configuration

DataCatalog uses the object store to cache task output metadata. The storage client also keeps an in-memory cache in front of the object store: max_size_mbs caps its size, and target_gc_percent tunes the Go garbage collector's target percentage:
configuration:
  inline:
    storage:
      cache:
        max_size_mbs: 10
        target_gc_percent: 100

Download limits

To protect against unexpectedly large task outputs being pulled into FlytePropeller memory, set a download limit:
configuration:
  inline:
    storage:
      limits:
        maxDownloadMBs: 1000

Offloaded literal data

Large task inputs and outputs can be offloaded to the object store rather than stored inline in the workflow CRD. This is recommended for workflows that pass large datasets between tasks:
configuration:
  inline:
    propeller:
      literal-offloading-config:
        enabled: true
With offloading enabled, FlytePropeller writes large literals to userDataContainer/data/ and stores a reference in the workflow CRD instead of the raw bytes.
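The decision can be pictured as a simple size check: small literals stay inline in the CRD, large ones become a URI reference. A toy Python sketch follows; the threshold value, bucket name, and reference shape here are invented for illustration and are not FlytePropeller's actual code or formats:

```python
def store_literal(value: bytes, threshold_mb: int = 10) -> dict:
    """Keep small literals inline; 'offload' large ones as a URI reference."""
    if len(value) <= threshold_mb * 1024 * 1024:
        # Small enough: the raw bytes live inline in the workflow CRD.
        return {"inline": value}
    # Too large: in Flyte, the bytes would be written to
    # userDataContainer/data/ and only this reference stored in the CRD.
    uri = f"s3://my-flyte-userdata/data/{hash(value) & 0xFFFFFFFF:08x}"
    return {"offloaded": {"uri": uri, "size_bytes": len(value)}}
```

Keeping only a reference in the CRD matters because Kubernetes objects have a hard size limit, so inlining large literals can make a workflow unschedulable.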
