Installation
Quick Start
A complete word count pipeline makes a good first example.
Core Concepts
Pipeline
A pipeline is created first and then executed by a runner.
PCollection
PCollections represent distributed datasets.
Scopes
Scopes organize and name pipeline components.
Transforms
Transforms are applied to PCollections to process data.
DoFns and Functions
Simple Functions
Regular Go functions can serve as transformations.
Structural DoFns
Struct-based DoFns suit more complex logic.
Side Inputs
Side inputs give a DoFn access to additional data during processing.
Go-Specific Features
Type Safety with Generics
The Go SDK leverages Go's type system: DoFn signatures are checked against PCollection element types when the pipeline is constructed.
Registration System
Register functions so that they serialize properly.
Combiners
Combiners implement efficient aggregations.
I/O Connectors
Text Files
Avro Files
Cross-Language I/O
I/O transforms from other SDKs can be used through cross-language transforms.
Running Pipelines
Direct Runner (Local)
Dataflow Runner
Flink Runner
Prism (Portable Local Runner)
Best Practices
Always Register Functions and Types
Use the init() function to register all custom types and functions:
Use Named Scopes for Clarity
Organize your pipeline with descriptive scope names.
Leverage Built-in Combiners
Use the stats package for common aggregations:
Handle Errors Properly
Check errors from pipeline execution:
Composite Transforms
Create reusable pipeline components as composite transforms.
Testing
Test your pipeline components.
Building Container Images
For portable runners, build SDK harness containers.
Resources
- Go SDK Reference: complete Go package documentation
- Code Examples: sample pipelines and patterns
- Prism Runner: local portable runner for testing
- Build Guide: building and testing the Go SDK
Next Steps
- Explore runners for executing pipelines
- Check out windowing for stream processing
- Learn about transforms for data processing
- Browse examples for common patterns