Flyte agents are long-running, stateless services that receive execution requests via gRPC and initiate jobs with external or internal services. They solve a fundamental problem: integrating with hosted platforms (such as Databricks or Snowflake) without consuming a Kubernetes pod for the entire duration of the job.

What are agents?

Without agents, a Flyte task that submits a Databricks job and waits for it to finish would occupy a running Kubernetes pod for hours: idle, but still consuming resources and cluster quota. Agents decouple job submission from job monitoring. The agent service submits the job and records the job ID, and FlytePropeller then polls the agent periodically to check status, so no task pod has to stay alive while the external job runs. There are two types of agents:

Async agents

For long-running jobs on external platforms. The agent exposes create, get, and delete operations. FlytePropeller calls create to submit the job, then polls get until the job completes or fails.
Examples: BigQuery, Databricks, Snowflake, SageMaker, Airflow
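The create / get / delete lifecycle can be sketched with a toy in-memory agent. This is only an illustration under stated assumptions: ToyAsyncAgent, run_to_completion, and the phase strings are hypothetical names, not the flytekit API, and a real poller would wait between calls to get.

```python
import itertools


class ToyAsyncAgent:
    """Hypothetical in-memory stand-in for an async agent (not the flytekit API)."""

    def __init__(self, polls_until_done: int = 3):
        self._polls_until_done = polls_until_done
        self._poll_counts = {}  # job_id -> number of get() calls so far
        self._ids = itertools.count()

    def create(self, query: str) -> str:
        # Submit the job and hand back an ID; nothing stays running on our side.
        job_id = f"job-{next(self._ids)}"
        self._poll_counts[job_id] = 0
        return job_id

    def get(self, job_id: str) -> str:
        # Report the current phase; this toy job succeeds after a fixed number of polls.
        self._poll_counts[job_id] += 1
        done = self._poll_counts[job_id] >= self._polls_until_done
        return "SUCCEEDED" if done else "RUNNING"

    def delete(self, job_id: str) -> None:
        # Cancel the job and forget about it.
        self._poll_counts.pop(job_id, None)


def run_to_completion(agent, query, max_polls=10):
    """Plays the poller's role: create once, then call get until a terminal phase."""
    job_id = agent.create(query)
    phases = []
    for _ in range(max_polls):
        phase = agent.get(job_id)
        phases.append(phase)
        if phase in ("SUCCEEDED", "FAILED"):
            return phases
    # Out of polling budget: cancel the job through delete.
    agent.delete(job_id)
    phases.append("ABORTED")
    return phases
```

The only state that must survive between polls is the job ID, which is why the agent service itself can stay stateless.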

Sync agents

For request/response services that return results immediately. The agent exposes a single do operation and blocks until the result is available.
Examples: OpenAI ChatGPT, internal API calls, data retrieval services
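By contrast, a sync agent has no job ID and no polling. The sketch below is a hypothetical illustration (ToySyncAgent and its backend callable are assumptions, not the flytekit API) of a single blocking do call whose result goes straight back to the caller.

```python
class ToySyncAgent:
    """Hypothetical sync agent: one blocking call, no job ID, no polling."""

    def __init__(self, backend):
        # `backend` is any callable standing in for a remote request/response API.
        self._backend = backend

    def do(self, inputs: dict) -> dict:
        # Blocks until the backend answers, then returns the outputs directly.
        return {"result": self._backend(inputs["prompt"])}


# Usage: str.upper plays the role of the remote service.
agent = ToySyncAgent(backend=str.upper)
outputs = agent.do({"prompt": "hello"})
```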

Architecture

Each agent service is a Kubernetes deployment. When a user triggers a task that is handled by an agent, FlytePropeller routes the gRPC request to the appropriate agent service. The agent service then communicates with the external platform.
┌─────────────────────────────────────────────────────┐
│                   Flyte Cluster                      │
│                                                      │
│  ┌─────────────────┐        ┌──────────────────────┐│
│  │  FlytePropeller  │──gRPC─▶│  Agent Service       ││
│  │                 │        │  (flyteagent pod)     ││
│  │  Polls agent    │◀──gRPC─│                      ││
│  │  for task status│        │  - BigQuery agent     ││
│  └─────────────────┘        │  - Databricks agent   ││
│                             │  - Custom agents      ││
│                             └──────────┬───────────┘│
└────────────────────────────────────────┼────────────┘
                                         │ API calls
                              ┌──────────▼────────────┐
                              │  External Services     │
                              │  BigQuery / Databricks │
                              │  Snowflake / OpenAI    │
                              └───────────────────────┘
You can deploy multiple agent services for different purposes — for example, a production agent service and a development agent service. FlytePropeller can be configured to route specific task types to specific agent services.
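The routing idea can be sketched as a lookup from task type to agent endpoint with a default fallback. This is a minimal illustration only: the function name and the endpoints are hypothetical, and the actual routing is set in FlytePropeller's configuration rather than in Python.

```python
from typing import Dict


def resolve_agent_endpoint(
    task_type: str,
    routes: Dict[str, str],
    default_endpoint: str,
) -> str:
    """Return the endpoint for a task type, falling back to the default agent service."""
    return routes.get(task_type, default_endpoint)


# Hypothetical endpoints: a production service plus a development service
# that only receives one task type.
DEFAULT_ENDPOINT = "prod-agent:8000"
ROUTES = {"openai_chat_completion_task": "dev-agent:8000"}

dev_target = resolve_agent_endpoint("openai_chat_completion_task", ROUTES, DEFAULT_ENDPOINT)
prod_target = resolve_agent_endpoint("bigquery_query_job_task", ROUTES, DEFAULT_ENDPOINT)
```

Keeping a default means that only the task types being trialled need an explicit route; everything else continues to reach the main agent service.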

Available agents

The Flyte community maintains a growing set of agents for popular platforms:
Agent             Task type                     Description
BigQuery          bigquery_query_job_task       Run BigQuery jobs on Google Cloud
Snowflake         snowflake_query_job           Execute Snowflake SQL queries
Databricks        Spark task config             Run Spark jobs on Databricks
AWS SageMaker     sagemaker_training_job_task   Train and deploy ML models on SageMaker
AWS Batch         aws_batch_task                Run batch jobs on AWS Batch
Airflow           Airflow operator tasks        Invoke Airflow operators from Flyte
ChatGPT / OpenAI  openai_chat_completion_task   Call OpenAI chat completion API
For the full list of supported agents, see the Integrations documentation.

Key benefits

No pod overhead

Jobs running on external platforms don’t need a Kubernetes pod sitting idle for hours. Agents submit and poll without holding cluster resources.

Language-agnostic

Agents communicate via protobuf over gRPC, so they can be implemented in any language. The Python SDK provides AsyncAgentBase and SyncAgentBase for easy development.

Local testability

Agents can run in-process in a local Python environment. You can test agent tasks with pyflyte run without a full Flyte cluster.

Canary deployments

Agent services can be deployed independently. You can run a custom agent in a separate service without affecting the main production agent service.

Sensors

The agent framework also supports building custom sensors — tasks that poll an external condition and block until it is satisfied. A sensor extends BaseSensor and implements a poke method:
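import s3fs
from flytekit.sensor.base_sensor import BaseSensor


class FileSensor(BaseSensor):
    """Sensor that is satisfied once the given S3 path exists."""

    def __init__(self, name: str, **kwargs):
        # A sensor is a task, so it needs a task name (required by BaseSensor
        # in recent flytekit releases); task_type identifies it to the agent.
        super().__init__(name=name, task_type="file_sensor", **kwargs)

    async def poke(self, path: str) -> bool:
        # Called repeatedly; returning True marks the condition as satisfied.
        fs = s3fs.S3FileSystem()
        return fs.exists(path)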
This is useful for waiting on upstream data availability, external job completion, or any condition that can be polled via an API.
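The polling behaviour behind a sensor can be sketched as a loop that calls poke until it returns True. This is a simplified illustration, not how the agent framework schedules sensors: wait_until, make_fake_poke, the interval, the attempt limit, and the example path are all hypothetical, and poke is assumed to be a coroutine.

```python
import asyncio


async def wait_until(poke, *, interval_s=0.01, max_attempts=100, **kwargs):
    """Call `poke(**kwargs)` until it returns True; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if await poke(**kwargs):
            return attempt
        # Condition not met yet: wait before checking again.
        await asyncio.sleep(interval_s)
    raise TimeoutError(f"condition not satisfied after {max_attempts} attempts")


def make_fake_poke(ready_after: int):
    """Build a fake poke that becomes True on the `ready_after`-th call."""
    calls = {"n": 0}

    async def poke(path: str) -> bool:
        calls["n"] += 1
        return calls["n"] >= ready_after

    return poke


# Usage: the fake condition is satisfied on the third check.
attempts = asyncio.run(wait_until(make_fake_poke(3), path="s3://bucket/key"))
```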

Next steps

Testing agents locally

Test existing or custom agents in a local Python environment without a Flyte cluster.

Developing agents

Build a custom agent to integrate Flyte with any external service.

Deploying agents

Deploy your agent to a Flyte cluster or sandbox environment.
