Requests and limits
Tasks support requests and limits that mirror Kubernetes resource semantics:
- Requests are scheduling hints. The Kubernetes scheduler places the pod on a node that has at least the requested resources available.
- Limits are hard constraints. A task can be allocated more than it requests (up to the limit), but never more than the limit.
Use the flytekit `Resources` class to specify resource requirements:
Available resource attributes
The `Resources` class accepts the following attributes. Values use Kubernetes resource units.
| Attribute | Unit examples | Description |
|---|---|---|
| cpu | "1", "500m", "2" | CPU cores (m = millicores) |
| mem | "256Mi", "4Gi" | Memory (binary suffixes: Mi, Gi, Ti) |
| gpu | "1", "2" | Number of GPUs |
| storage | "10Gi" | Ephemeral storage |
Requesting GPU
To run a task on a GPU node, set the `gpu` attribute in `limits`:
To prevent regular (non-GPU) tasks from being scheduled on GPU nodes, configure taints and tolerations on your GPU node group. Set up the corresponding tolerations in FlytePropeller’s configuration so that GPU tasks automatically receive the correct toleration.
Full example
Overriding resources at runtime
Use the `with_overrides` method to override resource allocations when calling a task inside a workflow. This is useful when you want the same task function to run with different resources depending on context.
Platform-level resource quotas
Flyte administrators can configure maximum resource limits at two levels:
Admin config (flyteadmin)
The flyteadmin configuration defines the maximum CPU and memory a single task can request. Tasks that declare resources exceeding these limits will be rejected at registration time.
Namespace-level quota (Kubernetes ResourceQuota)
Each Flyte project-domain pair maps to a Kubernetes namespace. A `ResourceQuota` on the namespace limits the aggregate CPU and memory that all pods in that namespace can consume simultaneously.

To modify the default platform resource limits, update the admin config and namespace-level quota in your Helm values.
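A sketch of what such Helm values might look like, assuming the flyte-core chart; the exact key paths vary between chart versions, so treat these names as illustrative and check your chart's values reference:

```yaml
# Illustrative flyte-core Helm values fragment; key paths may differ.
configmap:
  task_resource_defaults:
    task_resources:
      defaults:        # applied when a task declares no requests
        cpu: 500m
        memory: 1Gi
      limits:          # per-task maximum enforced at registration
        cpu: "2"
        memory: 8Gi
```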
Resource configuration patterns
Memory-intensive data processing
For tasks that load large datasets into memory, set generous memory requests and limits while keeping CPU modest:
CPU-bound parallel computation
For tasks that parallelize work across CPU cores, match the CPU request to the number of workers:
GPU model training
For GPU tasks, request and limit GPU count explicitly. Request ample CPU and memory to feed the GPU: