What are agents?
Without agents, a Flyte task that submits a Databricks job and waits for it to finish would occupy a running Kubernetes pod for hours: idle, but consuming resources and cluster quota. Agents decouple job submission from job monitoring: the agent service submits the job and records the job ID, and FlytePropeller then polls the agent periodically to check status, so no pod sits idle between polls. There are two types of agents:
Async agents
For long-running jobs on external platforms. The agent exposes create, get, and delete operations. FlytePropeller calls create to submit the job, then polls get until the job completes or fails.
Examples: BigQuery, Databricks, Snowflake, SageMaker, Airflow
Sync agents
For request/response services that return results immediately. The agent exposes a single do operation and blocks until the result is available.
Examples: OpenAI ChatGPT, internal API calls, data retrieval services
Architecture
Each agent service is a Kubernetes deployment. When a user triggers a task that is handled by an agent, FlytePropeller routes the gRPC request to the appropriate agent service. The agent service then communicates with the external platform.
Available agents
The Flyte community maintains a growing set of agents for popular platforms:
Data warehouses and analytics
| Agent | Task type | Description |
|---|---|---|
| BigQuery | bigquery_query_job_task | Run BigQuery jobs on Google Cloud |
| Snowflake | snowflake_query_job | Execute Snowflake SQL queries |
Compute and ML platforms
| Agent | Task type | Description |
|---|---|---|
| Databricks | Spark task config | Run Spark jobs on Databricks |
| AWS SageMaker | sagemaker_training_job_task | Train and deploy ML models on SageMaker |
| AWS Batch | aws_batch_task | Run batch jobs on AWS Batch |
Workflow orchestration
| Agent | Task type | Description |
|---|---|---|
| Airflow | Airflow operator tasks | Invoke Airflow operators from Flyte |
AI and LLM services
| Agent | Task type | Description |
|---|---|---|
| ChatGPT / OpenAI | openai_chat_completion_task | Call OpenAI chat completion API |
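One way to picture how a request reaches "the appropriate agent service" is a lookup keyed on the task type. The sketch below is only an illustration: the task type strings are copied from the tables above, while `AGENT_BY_TASK_TYPE` and `resolve_agent` are hypothetical names, and keying the lookup on the task type string is an assumption rather than a description of FlytePropeller's actual routing code.

```python
from typing import Dict

# Task types copied from the tables above. The mapping itself is a
# hypothetical illustration, not FlytePropeller's real routing table.
AGENT_BY_TASK_TYPE: Dict[str, str] = {
    "bigquery_query_job_task": "BigQuery",
    "snowflake_query_job": "Snowflake",
    "sagemaker_training_job_task": "AWS SageMaker",
    "aws_batch_task": "AWS Batch",
    "openai_chat_completion_task": "ChatGPT / OpenAI",
}


def resolve_agent(task_type: str) -> str:
    """Return the name of the agent assumed to handle `task_type`."""
    try:
        return AGENT_BY_TASK_TYPE[task_type]
    except KeyError:
        # No agent is registered for this task type.
        raise LookupError(f"no agent registered for task type {task_type!r}") from None
```

For example, `resolve_agent("snowflake_query_job")` returns `"Snowflake"`, and an unknown task type raises `LookupError`.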
Key benefits
No pod overhead
Jobs running on external platforms don’t need a Kubernetes pod sitting idle for hours. Agents submit and poll without holding cluster resources.
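The submit-and-poll pattern can be sketched with the standard library alone. This is a minimal illustration of the create, get, and delete contract, not flytekit code: `FakeExternalPlatform`, `SketchAsyncAgent`, `run_to_completion`, and the phase strings are all hypothetical, and the polling loop stands in for what FlytePropeller does.

```python
import itertools
import time
from typing import Dict, Tuple


class FakeExternalPlatform:
    """Hypothetical in-memory stand-in for a platform such as Databricks."""

    def __init__(self, polls_until_done: int = 3) -> None:
        self._polls_until_done = polls_until_done
        self._remaining: Dict[str, int] = {}
        self._ids = itertools.count()

    def submit(self) -> str:
        job_id = f"job-{next(self._ids)}"
        self._remaining[job_id] = self._polls_until_done
        return job_id

    def status(self, job_id: str) -> str:
        # Report RUNNING a fixed number of times, then SUCCEEDED.
        if self._remaining[job_id] == 0:
            return "SUCCEEDED"
        self._remaining[job_id] -= 1
        return "RUNNING"

    def cancel(self, job_id: str) -> None:
        self._remaining.pop(job_id, None)


class SketchAsyncAgent:
    """Hypothetical agent exposing the three async operations."""

    def __init__(self, platform: FakeExternalPlatform) -> None:
        self.platform = platform

    def create(self) -> str:
        return self.platform.submit()  # submit the job, keep only its ID

    def get(self, job_id: str) -> str:
        return self.platform.status(job_id)  # cheap status check

    def delete(self, job_id: str) -> None:
        self.platform.cancel(job_id)  # clean up the external job


def run_to_completion(
    agent: SketchAsyncAgent, poll_interval_s: float = 0.0, max_polls: int = 100
) -> Tuple[str, int]:
    """Stand-in for the poller: create once, then poll get until done."""
    job_id = agent.create()
    for polls in range(1, max_polls + 1):
        phase = agent.get(job_id)
        if phase in ("SUCCEEDED", "FAILED"):
            return phase, polls
        time.sleep(poll_interval_s)  # nothing is held between polls
    agent.delete(job_id)  # give up and cancel the external job
    return "ABORTED", max_polls
```

With the defaults, `run_to_completion(SketchAsyncAgent(FakeExternalPlatform()))` returns `("SUCCEEDED", 4)`: three polls that report `RUNNING`, then one that reports completion. The only state carried between polls is the job ID.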
Language-agnostic
Agents communicate via protobuf over gRPC, so they can be implemented in any language. The Python SDK provides the AsyncAgentBase and SyncAgentBase base classes for developing agents.
Local testability
Agents can run in-process in a local Python environment. You can test agent tasks with pyflyte run without a full Flyte cluster.
Canary deployments
Agent services can be deployed independently. You can run a custom agent in a separate service without affecting the main production agent service.
Sensors
The agent framework also supports building custom sensors: tasks that poll an external condition and block until it is satisfied. A sensor extends BaseSensor and implements a poke method.
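The shape of a sensor can be illustrated without a Flyte installation. In the sketch below, `SketchSensorBase` is a hypothetical stand-in for BaseSensor, not the flytekit class, and its `wait` loop, the `poll_interval_s` parameter, and the choice to make `poke` a coroutine are assumptions made for illustration. Only the idea of a subclass that implements `poke` comes from the text above.

```python
import asyncio
import os


class SketchSensorBase:
    """Hypothetical stand-in for BaseSensor; not the flytekit class."""

    def __init__(self, name: str, poll_interval_s: float = 0.01) -> None:
        self.name = name
        self.poll_interval_s = poll_interval_s

    async def poke(self, **kwargs) -> bool:
        """Return True once the external condition is satisfied."""
        raise NotImplementedError

    async def wait(self, max_pokes: int = 100, **kwargs) -> int:
        """Call poke repeatedly; return how many pokes it took."""
        for attempt in range(1, max_pokes + 1):
            if await self.poke(**kwargs):
                return attempt
            await asyncio.sleep(self.poll_interval_s)
        raise TimeoutError(f"{self.name}: condition not met after {max_pokes} pokes")


class FileSensor(SketchSensorBase):
    """Example sensor: the condition is that a file exists."""

    async def poke(self, path: str) -> bool:
        return os.path.exists(path)
```

A caller would run something like `asyncio.run(FileSensor("wait-for-file").wait(path="/tmp/ready.flag"))`, where the sensor name and path are placeholders. The call returns once the file appears and raises `TimeoutError` if it never does.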
Next steps
Testing agents locally
Test existing or custom agents in a local Python environment without a Flyte cluster.
Developing agents
Build a custom agent to integrate Flyte with any external service.
Deploying agents
Deploy your agent to a Flyte cluster or sandbox environment.