How Flyte uses storage
| Data type | Storage path | Who writes it |
|---|---|---|
| Workflow metadata (launch plans, executions) | metadataContainer | FlyteAdmin |
| Task input/output literals | userDataContainer | FlyteCopilot sidecar |
| Large datasets (offloaded literals) | userDataContainer | FlytePropeller |
| Cached task outputs | userDataContainer | DataCatalog |
metadataContainer and userDataContainer can point to the same bucket. Using separate buckets allows independent lifecycle policies.
Configuring storage backends
- S3
- GCS
- Azure Blob
- MinIO
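As an illustration, here is a minimal storage block for a MinIO (S3-compatible) backend. The bucket name, endpoint, and credentials are placeholders, and the key names follow the flytestdlib storage config as commonly seen in sandbox deployments, so verify them against your Flyte release:

```yaml
storage:
  type: minio                  # "s3" for AWS S3; stow-based config covers GCS and Azure Blob
  container: my-flyte-bucket   # placeholder bucket name
  connection:
    endpoint: http://localhost:9000   # placeholder MinIO endpoint
    auth-type: accesskey
    access-key: minio                 # placeholder credentials
    secret-key: miniostorage
    region: us-east-1
    disable-ssl: true                 # typical for local MinIO only
```

The same block is shared by FlyteAdmin, FlytePropeller, and DataCatalog, which is why a single backend definition covers all the writers in the table above.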
Signed URLs
Flyte generates pre-signed URLs for FlyteConsole so that users can download task output files directly from the object store. The remoteData config controls how these URLs are created.
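A sketch of a remoteData block that signs URLs valid for a few minutes. The field names here reflect commonly seen flyteadmin configurations and should be treated as assumptions to check against your version:

```yaml
remoteData:
  region: us-east-1    # region used when signing requests
  scheme: aws          # signing scheme for the object store
  signedUrls:
    durationMinutes: 3 # how long a generated download URL stays valid
```

Shorter durations reduce the window in which a leaked URL is usable, at the cost of users occasionally needing to refresh the console to get a fresh link.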
Cache configuration
DataCatalog uses the object store to cache task output metadata. Configure the in-memory cache size to bound its memory footprint.
Download limits
To protect against unexpectedly large task outputs being pulled into FlytePropeller memory, set a download limit.
Offline / offloaded literal data
Large task inputs and outputs can be offloaded to the object store rather than stored inline in the workflow CRD. This is recommended for workflows that pass large datasets between tasks. When a literal is offloaded, FlytePropeller writes it under userDataContainer/data/ and stores a reference in the workflow CRD instead of the raw bytes.
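The cache size and download limit described above both live under the shared storage block. A sketch with flytestdlib's key names; confirm the exact keys and defaults for your release:

```yaml
storage:
  cache:
    max_size_mbs: 10       # in-memory cache for frequently read objects; 0 disables it
    target_gc_percent: 100 # Go garbage-collection target while the cache is active
  limits:
    maxDownloadMBs: 10     # refuse to pull objects larger than this into memory
```

Hitting maxDownloadMBs fails the read rather than the whole process, so size it above your largest expected inline literal but well below the memory available to FlytePropeller.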