One advantage of running workloads on Flyte is the ability to request compute resources far beyond what is available locally. Flytekit lets you declare resource requirements close to the task definition, making them explicit and version-controlled.

Requests and limits

Tasks support requests and limits that mirror Kubernetes resource semantics:
  • Requests are scheduling hints. The Kubernetes scheduler places the pod on a node that has at least the requested resources available.
  • Limits are hard ceilings. A task may consume more than it requests (up to its limit), but never more than the limit.
Use the flytekit.Resources class to specify resource requirements:
from typing import List
from flytekit import task, Resources

@task(
    requests=Resources(cpu="2", mem="1Gi"),
    limits=Resources(cpu="4", mem="2Gi"),
)
def count_unique_numbers(data: List[int]) -> int:
    return len(set(data))

Available resource attributes

The Resources class accepts the following attributes. Values use Kubernetes resource units.
  • cpu — CPU cores; "m" denotes millicores. Examples: "1", "500m", "2"
  • mem — Memory in binary units (Ki, Mi, Gi, Ti). Examples: "256Mi", "4Gi"
  • gpu — Number of GPUs. Examples: "1", "2"
  • storage — Ephemeral storage. Example: "10Gi"
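To make the units concrete, here is a small standalone parser that converts these Kubernetes quantity strings into base units. This is purely illustrative; flytekit passes the strings through to Kubernetes unchanged.

```python
# Illustrative parser for Kubernetes quantity strings; not part of flytekit.

def parse_cpu(value: str) -> float:
    """Convert a CPU quantity ("2", "500m") to a number of cores."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000.0  # "m" means millicores
    return float(value)

_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def parse_mem(value: str) -> int:
    """Convert a memory quantity ("256Mi", "4Gi") to bytes."""
    for suffix, factor in _BINARY.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)  # no suffix: plain bytes

print(parse_cpu("500m"))  # 0.5
print(parse_mem("1Gi"))   # 1073741824
```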

Requesting GPU

To run a task on a GPU node, set the gpu attribute. Kubernetes treats GPUs as a non-overcommittable extended resource, so the requested and limited GPU count must match:
from flytekit import task, Resources

@task(
    requests=Resources(cpu="4", mem="8Gi", gpu="1"),
    limits=Resources(cpu="8", mem="16Gi", gpu="1"),
)
def train_model(dataset_path: str) -> str:
    # GPU-accelerated training logic here
    return "/path/to/model"
To keep regular (non-GPU) tasks off GPU nodes, taint your GPU node group and configure the matching tolerations in FlytePropeller so that GPU-requesting tasks automatically receive them.
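As a sketch of this setup (the exact keys depend on your cluster and chart version, so treat the names below as assumptions to verify against your deployment): taint the GPU nodes, then map the toleration to the GPU resource in FlytePropeller's k8s plugin configuration.

```yaml
# Taint applied to GPU nodes, e.g.:
#   kubectl taint nodes <gpu-node> nvidia.com/gpu=present:NoSchedule

# Illustrative FlytePropeller k8s plugin excerpt: pods that request the
# nvidia.com/gpu resource automatically receive the matching toleration.
plugins:
  k8s:
    resource-tolerations:
      nvidia.com/gpu:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```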

Full example

from typing import List
from flytekit import task, workflow, Resources

@task(
    requests=Resources(cpu="2", mem="512Mi"),
    limits=Resources(cpu="4", mem="1Gi"),
)
def count_unique_numbers(data: List[int]) -> int:
    return len(set(data))

@task(
    requests=Resources(cpu="1", mem="256Mi"),
    limits=Resources(cpu="2", mem="512Mi"),
)
def square(n: int) -> int:
    return n * n

@workflow
def my_workflow(data: List[int]) -> int:
    unique_count = count_unique_numbers(data=data)
    return square(n=unique_count)

if __name__ == "__main__":
    print(my_workflow(data=[1, 2, 2, 3, 3, 3]))

Overriding resources at runtime

Use the with_overrides method to override resource allocations when calling a task inside a workflow. This is useful when you want the same task function to run with different resources depending on context.
from typing import List
from flytekit import task, workflow, Resources

@task(
    requests=Resources(cpu="1", mem="200Mi"),
    limits=Resources(cpu="2", mem="350Mi"),
)
def count_unique_numbers_1(data: List[int]) -> int:
    return len(set(data))

@task
def square_1(n: int) -> int:
    return n * n

@workflow
def my_pipeline(data: List[int]) -> int:
    unique_count = count_unique_numbers_1(data=data)
    # Override the resource limits (cpu and memory) for this invocation only
    return square_1(n=unique_count).with_overrides(
        limits=Resources(cpu="6", mem="500Mi")
    )

if __name__ == "__main__":
    print(my_pipeline(data=[1, 2, 2, 3, 3, 3]))
The with_overrides values are bounded by the platform-level CPU and memory quotas. In the example above, if the platform quota caps CPU at 4 cores, a limit of 6 specified via with_overrides will be silently capped at 4.
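The capping behavior can be pictured with a toy helper. This is an illustration of the rule, not Flyte's actual implementation; assume CPU values are plain core counts rather than quantity strings.

```python
# Toy illustration: the effective limit is the smaller of the task's
# declared limit and the platform-level quota. Not Flyte's actual code.

def effective_limit(declared_cpu: float, platform_max_cpu: float) -> float:
    return min(declared_cpu, platform_max_cpu)

# A with_overrides limit of 6 cores under a platform cap of 4 cores:
print(effective_limit(6, 4))  # 4
```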

Platform-level resource quotas

Flyte administrators can configure maximum resource limits at two levels:
The flyteadmin configuration defines the maximum CPU and memory a single task can request. Tasks that declare resources exceeding these limits will be rejected at registration time.
# Example flyteadmin config excerpt
task_resource_defaults:
  defaults:
    cpu: "2"
    memory: "500Mi"
    storage: "500Mi"
  limits:
    cpu: "4"
    memory: "1Gi"
    gpu: "1"
    storage: "20Gi"
Each Flyte project-domain pair maps to a Kubernetes namespace. A ResourceQuota on the namespace limits the aggregate CPU and memory that all pods in that namespace can consume simultaneously.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: flytesnacks-development
  namespace: flytesnacks-development
spec:
  hard:
    requests.cpu: "64"
    requests.memory: "128Gi"
    limits.cpu: "128"
    limits.memory: "256Gi"
To modify the default platform resource limits, update the admin config and namespace-level quota in your Helm values.
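For example, with the flyte-core Helm chart the relevant values look roughly like the excerpt below. The key paths vary by chart version, so treat this as a hypothetical sketch and check it against your chart's values reference.

```yaml
# Hypothetical Helm values excerpt (flyte-core-style chart; verify key paths)
configmap:
  task_resource_defaults:
    task_resources:
      defaults:
        cpu: "2"
        memory: "500Mi"
      limits:
        cpu: "4"
        memory: "1Gi"
cluster_resources:
  customData:
    - development:
        - projectQuotaCpu:
            value: "64"
        - projectQuotaMemory:
            value: "128Gi"
```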

Resource configuration patterns

For tasks that load large datasets into memory, set generous memory requests and limits while keeping CPU modest:
@task(
    requests=Resources(cpu="2", mem="16Gi"),
    limits=Resources(cpu="4", mem="32Gi"),
)
def process_large_dataset(path: str) -> int:
    import pandas as pd
    df = pd.read_parquet(path)
    return len(df)
For tasks that parallelize work across CPU cores, match the CPU request to the number of workers:
@task(
    requests=Resources(cpu="8", mem="4Gi"),
    limits=Resources(cpu="16", mem="8Gi"),
)
def parallel_computation(data: List[int]) -> List[int]:
    from multiprocessing import Pool
    # process_item must be defined at module level so it can be pickled
    with Pool(processes=8) as pool:
        return pool.map(process_item, data)
For GPU tasks, request and limit GPU count explicitly. Request ample CPU and memory to feed the GPU:
@task(
    requests=Resources(cpu="8", mem="32Gi", gpu="1"),
    limits=Resources(cpu="16", mem="64Gi", gpu="1"),
)
def fine_tune_model(base_model: str, dataset: str) -> str:
    # Training loop here
    return "/output/model"
