Ops, Jobs & Graphs
While assets are the recommended way to build data pipelines in Dagster, ops, jobs, and graphs provide lower-level primitives for task-based workflows. These are useful when you need fine-grained control over execution or are working with imperative, task-oriented logic.
Overview
- Op: A single unit of computation (formerly called “solid”)
- Job: An executable set of ops with defined execution semantics
- Graph: A reusable composition of ops that can be converted to jobs
For most data pipelines, we recommend using assets instead of ops and jobs. Assets provide better observability, automatic lineage tracking, and a more intuitive mental model centered on data artifacts.
Ops
An op is a function that performs a discrete unit of work. Define ops using the @op decorator:
Op Configuration
Ops can declare inputs and outputs, accept configuration, and use resources:
- Inputs & Outputs
- Configuration
- Resources
Op Context
The context exposes a logger, run metadata, configuration, and resources at runtime.
Jobs
A job defines an executable graph of ops. Create jobs using the @job decorator:
When you apply @job, Dagster:
- Analyzes the function body to extract op invocations
- Builds a dependency graph based on data flow
- Creates an executable job definition
Job Configuration
Jobs accept configuration for resources, execution, and more.
Executing Jobs
Jobs can be launched in several ways:
- Python API
- CLI
- UI
Graphs
A graph is a reusable composition of ops that can be converted into multiple jobs with different configurations. Graphs let you:
- Reuse the same computation logic across environments (dev, staging, prod)
- Test the same logic with different resource implementations
- Create multiple variants of a pipeline with different configurations
Nested Graphs
Graphs can contain other graphs, allowing you to build modular compositions.
Op Dependencies
Dagster infers dependencies from the data flow between ops: passing one op's output to another creates an edge in the execution graph.
Fan-out and Fan-in
Ops can have multiple outputs consumed by different downstream ops (fan-out), or combine outputs from multiple upstream ops (fan-in).
Dynamic Execution
Dagster supports dynamic op generation at runtime using DynamicOut:
Testing
Ops and jobs are easy to test: ops can be invoked directly, and jobs can be executed in-process.
When to Use Ops vs Assets
Use Assets when...
- You’re building data pipelines with persistent outputs (tables, files, models)
- You want automatic lineage tracking and observability
- You need cross-job dependencies
- You’re focused on “what data exists” rather than “what tasks run”
Use Ops when...
- You’re building imperative workflows without persistent outputs
- You need fine-grained control over execution order
- You’re migrating from task-based orchestrators like Airflow
- Your computation is purely procedural (e.g., sending notifications, running commands)
Best Practices
Keep ops focused: Each op should do one thing well. Break complex operations into multiple ops that can be tested and reused independently.
Use type annotations: Add type hints to op inputs and outputs for better validation and documentation.
Leverage resources: Use resources for external services (databases, APIs) to enable testing with mocks.
Related Documentation
- Assets - Higher-level, data-centric approach
- Resources - Manage external dependencies
- IO Managers - Control data persistence between ops
- Schedules & Sensors - Trigger job execution
API Reference
- @op - Op decorator
- @job - Job decorator
- @graph - Graph decorator
- OpExecutionContext - Runtime context
