One advantage of running workloads on Flyte is the ability to request compute resources far beyond what is available locally. Flytekit lets you declare resource requirements close to the task definition, making them explicit and version-controlled.

Requests and limits

Tasks support requests and limits that mirror Kubernetes resource semantics:
  • Requests are scheduling hints. The Kubernetes scheduler places the pod on a node that has at least the requested resources available.
  • Limits are hard ceilings. A task may consume more than it requests (up to its limit), but never more than the limit.
Use the flytekit.Resources class to specify resource requirements:
from typing import List
from flytekit import task, Resources

@task(
    requests=Resources(cpu="2", mem="1Gi"),
    limits=Resources(cpu="4", mem="2Gi"),
)
def count_unique_numbers(data: List[int]) -> int:
    return len(set(data))

Available resource attributes

The Resources class accepts the following attributes. Values use Kubernetes resource units.
  • cpu — CPU cores; "m" denotes millicores. Examples: "1", "500m", "2"
  • mem — Memory in binary units (Ki, Mi, Gi, Ti). Examples: "256Mi", "4Gi"
  • gpu — Number of GPUs. Examples: "1", "2"
  • storage — Ephemeral storage. Example: "10Gi"
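To make the units concrete, here is a small standalone parser that converts these Kubernetes quantity strings into base units. This is purely illustrative; flytekit passes the strings through to Kubernetes unchanged.

```python
# Illustrative parser for Kubernetes quantity strings; not part of flytekit.

def parse_cpu(value: str) -> float:
    """Convert a CPU quantity ("2", "500m") to a number of cores."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000.0  # "m" means millicores
    return float(value)

_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def parse_mem(value: str) -> int:
    """Convert a memory quantity ("256Mi", "4Gi") to bytes."""
    for suffix, factor in _BINARY.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)  # no suffix: plain bytes

print(parse_cpu("500m"))  # 0.5
print(parse_mem("1Gi"))   # 1073741824
```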

Requesting GPU

To run a task on a GPU node, set the gpu attribute. Kubernetes treats GPUs as a non-overcommittable extended resource, so the requested and limited GPU count must match:
from flytekit import task, Resources

@task(
    requests=Resources(cpu="4", mem="8Gi", gpu="1"),
    limits=Resources(cpu="8", mem="16Gi", gpu="1"),
)
def train_model(dataset_path: str) -> str:
    # GPU-accelerated training logic here
    return "/path/to/model"
To keep regular (non-GPU) tasks off GPU nodes, taint your GPU node group and configure the matching tolerations in FlytePropeller so that GPU-requesting tasks automatically receive them.
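As a sketch of this setup (the exact keys depend on your cluster and chart version, so treat the names below as assumptions to verify against your deployment): taint the GPU nodes, then map the toleration to the GPU resource in FlytePropeller's k8s plugin configuration.

```yaml
# Taint applied to GPU nodes, e.g.:
#   kubectl taint nodes <gpu-node> nvidia.com/gpu=present:NoSchedule

# Illustrative FlytePropeller k8s plugin excerpt: pods that request the
# nvidia.com/gpu resource automatically receive the matching toleration.
plugins:
  k8s:
    resource-tolerations:
      nvidia.com/gpu:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```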

Full example

from typing import List
from flytekit import task, workflow, Resources

@task(
    requests=Resources(cpu="2", mem="512Mi"),
    limits=Resources(cpu="4", mem="1Gi"),
)
def count_unique_numbers(data: List[int]) -> int:
    return len(set(data))

@task(
    requests=Resources(cpu="1", mem="256Mi"),
    limits=Resources(cpu="2", mem="512Mi"),
)
def square(n: int) -> int:
    return n * n

@workflow
def my_workflow(data: List[int]) -> int:
    unique_count = count_unique_numbers(data=data)
    return square(n=unique_count)

if __name__ == "__main__":
    print(my_workflow(data=[1, 2, 2, 3, 3, 3]))

Overriding resources at runtime

Use the with_overrides method to override resource allocations when calling a task inside a workflow. This is useful when you want the same task function to run with different resources depending on context.
from typing import List
from flytekit import task, workflow, Resources

@task(
    requests=Resources(cpu="1", mem="200Mi"),
    limits=Resources(cpu="2", mem="350Mi"),
)
def count_unique_numbers_1(data: List[int]) -> int:
    return len(set(data))

@task
def square_1(n: int) -> int:
    return n * n

@workflow
def my_pipeline(data: List[int]) -> int:
    unique_count = count_unique_numbers_1(data=data)
    # Override the resource limits (cpu and memory) for this invocation only
    return square_1(n=unique_count).with_overrides(
        limits=Resources(cpu="6", mem="500Mi")
    )

if __name__ == "__main__":
    print(my_pipeline(data=[1, 2, 2, 3, 3, 3]))
The with_overrides values are bounded by the platform-level CPU and memory quotas. In the example above, if the platform quota caps CPU at 4 cores, a limit of 6 specified via with_overrides will be silently capped at 4.
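The capping behavior can be pictured with a toy helper. This is an illustration of the rule, not Flyte's actual implementation; assume CPU values are plain core counts rather than quantity strings.

```python
# Toy illustration: the effective limit is the smaller of the task's
# declared limit and the platform-level quota. Not Flyte's actual code.

def effective_limit(declared_cpu: float, platform_max_cpu: float) -> float:
    return min(declared_cpu, platform_max_cpu)

# A with_overrides limit of 6 cores under a platform cap of 4 cores:
print(effective_limit(6, 4))  # 4
```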

Platform-level resource quotas

Flyte administrators can configure maximum resource limits at two levels:
The flyteadmin configuration defines the maximum CPU and memory a single task can request. Tasks that declare resources exceeding these limits will be rejected at registration time.
# Example flyteadmin config excerpt
task_resource_defaults:
  defaults:
    cpu: "2"
    memory: "500Mi"
    storage: "500Mi"
  limits:
    cpu: "4"
    memory: "1Gi"
    gpu: "1"
    storage: "20Gi"
Each Flyte project-domain pair maps to a Kubernetes namespace. A ResourceQuota on the namespace limits the aggregate CPU and memory that all pods in that namespace can consume simultaneously.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: flytesnacks-development
  namespace: flytesnacks-development
spec:
  hard:
    requests.cpu: "64"
    requests.memory: "128Gi"
    limits.cpu: "128"
    limits.memory: "256Gi"
To modify the default platform resource limits, update the admin config and namespace-level quota in your Helm values.
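For example, with the flyte-core Helm chart the relevant values look roughly like the excerpt below. The key paths vary by chart version, so treat this as a hypothetical sketch and check it against your chart's values reference.

```yaml
# Hypothetical Helm values excerpt (flyte-core-style chart; verify key paths)
configmap:
  task_resource_defaults:
    task_resources:
      defaults:
        cpu: "2"
        memory: "500Mi"
      limits:
        cpu: "4"
        memory: "1Gi"
cluster_resources:
  customData:
    - development:
        - projectQuotaCpu:
            value: "64"
        - projectQuotaMemory:
            value: "128Gi"
```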

Resource configuration patterns

For tasks that load large datasets into memory, set generous memory requests and limits while keeping CPU modest:
@task(
    requests=Resources(cpu="2", mem="16Gi"),
    limits=Resources(cpu="4", mem="32Gi"),
)
def process_large_dataset(path: str) -> int:
    import pandas as pd
    df = pd.read_parquet(path)
    return len(df)
For tasks that parallelize work across CPU cores, match the CPU request to the number of workers:
@task(
    requests=Resources(cpu="8", mem="4Gi"),
    limits=Resources(cpu="16", mem="8Gi"),
)
def parallel_computation(data: List[int]) -> List[int]:
    from multiprocessing import Pool
    # process_item must be defined at module level so it can be pickled
    with Pool(processes=8) as pool:
        return pool.map(process_item, data)
For GPU tasks, request and limit GPU count explicitly. Request ample CPU and memory to feed the GPU:
@task(
    requests=Resources(cpu="8", mem="32Gi", gpu="1"),
    limits=Resources(cpu="16", mem="64Gi", gpu="1"),
)
def fine_tune_model(base_model: str, dataset: str) -> str:
    # Training loop here
    return "/output/model"
