Flyte agents overview

Flyte agents are long-running, stateless services that receive execution requests via gRPC and initiate jobs with external or internal services. They solve a fundamental problem: integrating with hosted platforms (such as Databricks or Snowflake) without consuming a Kubernetes pod for the entire duration of the job.

What are agents?

Without agents, a Flyte task that submits a Databricks job and waits for it to finish would occupy a Kubernetes pod for hours, idle but still consuming resources and cluster quota. Agents decouple job submission from job monitoring: the agent service submits the job and returns the job ID, and FlytePropeller then polls the agent periodically to check status, so no task pod is held while the job runs. There are two types of agents:

Async agents

For long-running jobs on external platforms. The agent exposes create, get, and delete operations. FlytePropeller calls create to submit the job, then polls get until the job completes or fails.

Examples: BigQuery, Databricks, Snowflake, SageMaker, Airflow
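The create/get/delete lifecycle can be sketched in plain Python. This is a toy simulation, not the flytekit API: `FakePlatform`, `ToyAsyncAgent`, and `run_to_completion` are hypothetical names, and the loop only stands in for the polling that FlytePropeller performs.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class Phase(Enum):
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


@dataclass
class FakePlatform:
    """Hypothetical external service: a job succeeds after `polls_needed` status checks."""

    polls_needed: int = 3
    jobs: dict = field(default_factory=dict)
    ids: count = field(default_factory=lambda: count(1))

    def submit(self, query: str) -> str:
        job_id = f"job-{next(self.ids)}"
        self.jobs[job_id] = 0  # number of status checks seen so far
        return job_id

    def status(self, job_id: str) -> Phase:
        self.jobs[job_id] += 1
        done = self.jobs[job_id] >= self.polls_needed
        return Phase.SUCCEEDED if done else Phase.RUNNING

    def cancel(self, job_id: str) -> None:
        self.jobs.pop(job_id, None)


class ToyAsyncAgent:
    """Keeps no per-job state: the job ID returned by create is all the caller needs."""

    def __init__(self, platform: FakePlatform):
        self.platform = platform

    def create(self, query: str) -> str:
        return self.platform.submit(query)

    def get(self, job_id: str) -> Phase:
        return self.platform.status(job_id)

    def delete(self, job_id: str) -> None:
        self.platform.cancel(job_id)


def run_to_completion(agent: ToyAsyncAgent, query: str, max_polls: int = 10):
    """Stand-in for the caller: create once, then poll get until a terminal phase."""
    job_id = agent.create(query)
    for polls in range(1, max_polls + 1):
        phase = agent.get(job_id)
        if phase is not Phase.RUNNING:
            return phase, polls
    agent.delete(job_id)  # give up and cancel the external job
    return Phase.FAILED, max_polls


phase, polls = run_to_completion(ToyAsyncAgent(FakePlatform(polls_needed=3)), "SELECT 1")
print(phase.value, polls)
```

Because the agent holds nothing between calls, any replica of the agent service could answer the next get request for the same job ID.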

Sync agents

For request/response services that return results immediately. The agent exposes a single do operation and blocks until the result is available.

Examples: OpenAI ChatGPT, internal API calls, data retrieval services
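By contrast, a sync agent has a single blocking call. The sketch below is again plain Python rather than the flytekit API: `ToySyncAgent`, `Result`, and `echo_backend` are hypothetical stand-ins for an agent, its response type, and the remote service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    succeeded: bool
    outputs: dict


class ToySyncAgent:
    """Wraps a request/response backend behind a single blocking `do` call."""

    def __init__(self, backend: Callable[[str], str]):
        self.backend = backend

    def do(self, inputs: dict) -> Result:
        try:
            # Blocks until the backend answers; there is no job ID and no polling.
            reply = self.backend(inputs["message"])
        except Exception as exc:
            return Result(succeeded=False, outputs={"error": str(exc)})
        return Result(succeeded=True, outputs={"reply": reply})


def echo_backend(message: str) -> str:
    """Hypothetical service that answers immediately."""
    return message.upper()


result = ToySyncAgent(echo_backend).do({"message": "hello"})
print(result)
```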

Architecture

Each agent service is a Kubernetes deployment. When a user triggers a task that is handled by an agent, FlytePropeller routes the gRPC request to the appropriate agent service. The agent service then communicates with the external platform.

┌─────────────────────────────────────────────────────┐
│                   Flyte Cluster                      │
│                                                      │
│  ┌─────────────────┐        ┌──────────────────────┐│
│  │  FlytePropeller  │──gRPC─▶│  Agent Service       ││
│  │                 │        │  (flyteagent pod)     ││
│  │  Polls agent    │◀──gRPC─│                      ││
│  │  for task status│        │  - BigQuery agent     ││
│  └─────────────────┘        │  - Databricks agent   ││
│                             │  - Custom agents      ││
│                             └──────────┬───────────┘│
└────────────────────────────────────────┼────────────┘
                                         │ API calls
                              ┌──────────▼────────────┐
                              │  External Services     │
                              │  BigQuery / Databricks │
                              │  Snowflake / OpenAI    │
                              └───────────────────────┘

You can deploy multiple agent services for different purposes — for example, a production agent service and a development agent service. FlytePropeller can be configured to route specific task types to specific agent services.
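The routing idea can be illustrated as a lookup with a default. This is a sketch of the concept only, not FlytePropeller's configuration format: the service names and endpoints are made up, and only the task type strings come from the tables on this page.

```python
from typing import Dict

# Hypothetical agent services and endpoints; real values depend on your deployment.
AGENT_ENDPOINTS: Dict[str, str] = {
    "production": "flyteagent.flyte.svc.cluster.local:8000",
    "development": "flyteagent-dev.flyte.svc.cluster.local:8000",
}
DEFAULT_AGENT = "production"

# Task types that should bypass the default agent service.
AGENT_FOR_TASK_TYPE: Dict[str, str] = {
    "snowflake_query_job": "development",
}


def resolve_endpoint(task_type: str) -> str:
    """Return the endpoint of the agent service that handles `task_type`."""
    agent_name = AGENT_FOR_TASK_TYPE.get(task_type, DEFAULT_AGENT)
    return AGENT_ENDPOINTS[agent_name]


print(resolve_endpoint("snowflake_query_job"))
print(resolve_endpoint("bigquery_query_job_task"))
```

Task types without an explicit entry fall through to the default service, so adding a development agent does not require listing every production task type.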

Available agents

The Flyte community maintains a growing set of agents for popular platforms:

Data warehouses and analytics

Agent	Task type	Description
BigQuery	`bigquery_query_job_task`	Run BigQuery jobs on Google Cloud
Snowflake	`snowflake_query_job`	Execute Snowflake SQL queries

Compute and ML platforms

Agent	Task type	Description
Databricks	Spark task config	Run Spark jobs on Databricks
AWS SageMaker	`sagemaker_training_job_task`	Train and deploy ML models on SageMaker
AWS Batch	`aws_batch_task`	Run batch jobs on AWS Batch

Workflow orchestration

Agent	Task type	Description
Airflow	Airflow operator tasks	Invoke Airflow operators from Flyte

AI and LLM services

Agent	Task type	Description
ChatGPT / OpenAI	`openai_chat_completion_task`	Call OpenAI chat completion API

For the full list of supported agents, see the Integrations documentation.

Key benefits

No pod overhead

Jobs running on external platforms don’t need a Kubernetes pod sitting idle for hours. Agents submit and poll without holding cluster resources.

Language-agnostic

Agents communicate via protobuf over gRPC, so they can be implemented in any language. The Python SDK provides AsyncAgentBase and SyncAgentBase for easy development.

Local testability

Agents can run in-process in a local Python environment. You can test agent tasks with pyflyte run without a full Flyte cluster.

Canary deployments

Agent services can be deployed independently. You can run a custom agent in a separate service without affecting the main production agent service.

Sensors

The agent framework also supports building custom sensors — tasks that poll an external condition and block until it is satisfied. A sensor extends BaseSensor and implements a poke method:

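import s3fs
from flytekit.sensor.base_sensor import BaseSensor


class FileSensor(BaseSensor):
    """Sensor that is satisfied once the given S3 path exists."""

    def __init__(self, name: str, **kwargs):
        # Sensors are Flyte tasks, so the base class needs a task name.
        super().__init__(name=name, task_type="file_sensor", **kwargs)

    async def poke(self, path: str) -> bool:
        # Called on each poll; returning True marks the condition as met.
        # BaseSensor declares poke as a coroutine in recent flytekit releases.
        fs = s3fs.S3FileSystem()
        return fs.exists(path)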

This is useful for waiting on upstream data availability, external job completion, or any condition that can be polled via an API.
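The poll-until-true behaviour behind a sensor can be shown without a cluster. In this minimal sketch, `wait_until` is a hypothetical helper standing in for the repeated polling that the agent framework performs, and a local temporary file stands in for the S3 object.

```python
import os
import tempfile
import time
from typing import Callable


def wait_until(poke: Callable[[], bool], interval_s: float = 0.01, max_pokes: int = 100) -> int:
    """Call `poke` until it returns True; return how many calls that took."""
    for attempt in range(1, max_pokes + 1):
        if poke():
            return attempt
        time.sleep(interval_s)
    raise TimeoutError(f"condition not met after {max_pokes} pokes")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    calls = {"n": 0}

    def file_exists() -> bool:
        calls["n"] += 1
        if calls["n"] == 3:
            # Simulate an upstream job writing the file just before the third check.
            open(path, "w").close()
        return os.path.exists(path)

    attempts = wait_until(file_exists)

print(attempts)
```

Bounding the number of pokes keeps a sensor from waiting forever on a condition that never becomes true.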

Next steps

Testing agents locally

Test existing or custom agents in a local Python environment without a Flyte cluster.

Developing agents

Build a custom agent to integrate Flyte with any external service.

Deploying agents

Deploy your agent to a Flyte cluster or sandbox environment.
