Architecture Diagram
Overview
BuildBuddy implements the Build Event Protocol (BEP) to receive streaming build events from Bazel. These events contain information about build progress, targets, actions, test results, and more. The system is designed to handle high-throughput event streams while maintaining data consistency.
Components Involved
Bazel Build Tool
The source of build events:
- Generates build events during build execution
- Streams events via gRPC to BuildBuddy
- Includes build metadata, target info, and artifacts
- Sends final BuildFinished event on completion
Build Event Protocol Service
Handles incoming event streams:
- Receives gRPC stream of build events
- Validates event format and authentication
- Maintains stream state and ordering
- Returns acknowledgments to Bazel
Event Handler
Processes individual events:
- Parses event proto messages
- Extracts metadata and references
- Routes to appropriate storage services
- Handles different event types (target, test, artifact, etc.)
Invocation Service
Manages invocation lifecycle:
- Creates new invocation records
- Updates invocation status as events arrive
- Tracks progress and completion
- Computes aggregate metrics (duration, cache hit rate)
Database Writer
Persists structured data:
- Writes invocation metadata
- Stores target and action records
- Indexes test results
- Maintains relational integrity
Artifact Storage
Handles large binary data:
- Receives uploaded build logs
- Stores test outputs and artifacts
- Implements deduplication via content addressing
- Supports compression and encryption
Webhook Service
Notifies external systems:
- Triggers on invocation completion
- Sends notifications to configured endpoints
- Includes build status and metrics
- Supports retry and failure handling
Data Flow
Step 1: Stream Initialization
- Bazel initiates gRPC connection to BuildBuddy BEP service
- Sends initial metadata (API key, project ID)
- BuildBuddy authenticates and creates invocation record
- Returns stream acknowledgment to Bazel
- Invocation status set to “IN_PROGRESS”
Step 2: Event Streaming
- Bazel sends build events as they occur
- Events arrive in partial ordering (not strictly sequential)
- BuildBuddy buffers events and maintains state
- Each event is acknowledged after processing
- Events are processed in parallel where possible
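The buffering and acknowledgment behavior in this step can be sketched as a small reorder buffer. This is an illustration only, not BuildBuddy's implementation: the `ReorderBuffer` class and its method names are hypothetical, and it assumes each event carries a sequence number that starts at 1 and increases by one.

```python
from typing import Any


class ReorderBuffer:
    """Hypothetical buffer that releases events in sequence order."""

    def __init__(self, first_sequence: int = 1) -> None:
        self._next = first_sequence          # next sequence number to release
        self._pending: dict[int, Any] = {}   # events that arrived early

    def push(self, sequence: int, event: Any) -> list[tuple[int, Any]]:
        """Store an event and return every event that is now ready, in order."""
        if sequence < self._next or sequence in self._pending:
            return []  # duplicate: already released or already buffered
        self._pending[sequence] = event
        ready = []
        while self._next in self._pending:
            ready.append((self._next, self._pending.pop(self._next)))
            self._next += 1
        return ready

    @property
    def last_acked(self) -> int:
        """Highest sequence number released so far (safe to acknowledge)."""
        return self._next - 1


buf = ReorderBuffer()
released = []
arrivals = [(2, "Progress"), (1, "BuildStarted"), (2, "Progress"), (3, "BuildFinished")]
for seq, name in arrivals:
    released.extend(buf.push(seq, name))
```

In this sketch the early arrival of event 2 is held until event 1 shows up, and the repeated event 2 is dropped, so only events that have been released in order are ever acknowledged.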
Step 3: Event Processing
For each event:
- Validation: Check event format and required fields
- Parsing: Extract metadata and references
- Type Handling: Route to specialized handlers:
  - BuildStarted: Initialize invocation
  - TargetConfigured/TargetComplete: Store target info
  - ActionExecuted: Record action details
  - TestResult: Store test outcomes
  - BuildFinished: Finalize invocation
- Storage: Persist data to database and blob storage
- Acknowledgment: Send response to Bazel
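One common way to structure the validate, route, and store sequence above is a dispatch table keyed by event type. The sketch below is an assumption for illustration: events are modeled as plain dictionaries rather than proto messages, and the handler functions, the `invocation` record, and the payload field names (`label`, `passed`, `exit_code`) are all hypothetical.

```python
from typing import Callable

# Hypothetical in-memory stand-in for the stored invocation data.
invocation = {"status": None, "targets": {}, "actions": [], "tests": []}


def on_build_started(payload: dict) -> None:
    invocation["status"] = "IN_PROGRESS"


def on_target_configured(payload: dict) -> None:
    invocation["targets"][payload["label"]] = "configured"


def on_target_complete(payload: dict) -> None:
    invocation["targets"][payload["label"]] = "complete"


def on_action_executed(payload: dict) -> None:
    invocation["actions"].append(payload)


def on_test_result(payload: dict) -> None:
    invocation["tests"].append((payload["label"], payload["passed"]))


def on_build_finished(payload: dict) -> None:
    invocation["status"] = "COMPLETE"


HANDLERS: dict[str, Callable[[dict], None]] = {
    "BuildStarted": on_build_started,
    "TargetConfigured": on_target_configured,
    "TargetComplete": on_target_complete,
    "ActionExecuted": on_action_executed,
    "TestResult": on_test_result,
    "BuildFinished": on_build_finished,
}


def process_event(event: dict) -> bool:
    """Validate and route one event; return True if a handler ran."""
    event_type = event.get("type")
    payload = event.get("payload")
    if event_type is None or not isinstance(payload, dict):
        return False  # malformed event: skip it
    handler = HANDLERS.get(event_type)
    if handler is None:
        return False  # no specialized handler for this type
    handler(payload)
    return True


events = [
    {"type": "BuildStarted", "payload": {}},
    {"type": "TargetConfigured", "payload": {"label": "//app:server"}},
    {"type": "TargetComplete", "payload": {"label": "//app:server"}},
    {"type": "TestResult", "payload": {"label": "//app:server_test", "passed": True}},
    {"type": "Garbled"},  # missing payload, so it is skipped
    {"type": "BuildFinished", "payload": {"exit_code": 0}},
]
handled = [process_event(e) for e in events]
```

A table like this keeps each handler small and makes adding a new event type a one-line change.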
Step 4: Artifact Upload
For events with file references:
- Event contains URI or inline data
- Large files uploaded separately via ByteStream API
- Content addressing ensures deduplication
- File metadata stored with action/test records
- Compression applied for storage optimization
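Content addressing and compression can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: the `ContentAddressedStore` class is hypothetical, SHA-256 is assumed as the digest function, and zlib stands in for whatever compression the real storage layer uses.

```python
import hashlib
import zlib


class ContentAddressedStore:
    """Hypothetical in-memory blob store keyed by the SHA-256 of the content."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data once per unique content and return its digest."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            # Identical content hashes to the same key, so it is stored once.
            self._blobs[digest] = zlib.compress(data)
        return digest

    def get(self, digest: str) -> bytes:
        return zlib.decompress(self._blobs[digest])

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
log = b"INFO: Build completed successfully\n" * 100
first = store.put(log)
second = store.put(log)  # same bytes, same digest, no second copy
```

Because the key is derived from the bytes themselves, uploading the same log or test output twice results in a single stored blob, and records only need to keep the digest.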
Step 5: Invocation Finalization
- Bazel sends BuildFinished event
- BuildBuddy computes final metrics:
  - Total duration
  - Cache hit rate
  - Success/failure status
  - Action count and timing
- Updates invocation status to “COMPLETE”
- Triggers webhooks and notifications
- Closes gRPC stream
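The final metrics listed in this step could be computed along these lines. The sketch makes several assumptions that are not taken from BuildBuddy: the `ActionRecord` type and `finalize` function are hypothetical, cache hit rate is defined here as cached actions divided by total actions, and success is taken to mean an exit code of 0.

```python
from dataclasses import dataclass


@dataclass
class ActionRecord:
    """Hypothetical per-action record collected during the build."""
    duration_ms: int
    cache_hit: bool


def finalize(start_ms: int, end_ms: int, exit_code: int,
             actions: list[ActionRecord]) -> dict:
    """Compute summary metrics once the final build event has arrived."""
    hits = sum(1 for a in actions if a.cache_hit)
    return {
        "status": "COMPLETE",
        "success": exit_code == 0,
        "duration_ms": end_ms - start_ms,
        "action_count": len(actions),
        # Assumed definition: fraction of actions served from cache.
        "cache_hit_rate": hits / len(actions) if actions else 0.0,
        "total_action_ms": sum(a.duration_ms for a in actions),
    }


actions = [
    ActionRecord(120, True),
    ActionRecord(300, False),
    ActionRecord(80, True),
    ActionRecord(50, True),
]
summary = finalize(start_ms=1000, end_ms=6000, exit_code=0, actions=actions)
```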
Step 6: Post-Processing
- Generate summary statistics
- Update organizational metrics
- Trigger analysis pipelines (flaky test detection, etc.)
- Send completion notifications
Event Types
BuildBuddy handles various event types:
Core Events
- BuildStarted: Initialization and configuration
- BuildFinished: Final status and metrics
- Progress: Periodic status updates
Target Events
- TargetConfigured: Target configuration details
- TargetComplete: Target build completion
- TargetSummary: Aggregate target information
Action Events
- ActionExecuted: Individual action execution details
- CommandLine: Build command information
- OptionsParsed: Build options and flags
Test Events
- TestResult: Test execution outcome
- TestSummary: Aggregate test results
Artifact Events
- NamedSetOfFiles: File groups and references
- BuildToolLogs: Build tool output
- OutputFile: Individual output file references
Reliability Features
Stream Resilience
- Reconnection: Support for stream reconnection on network issues
- Event Deduplication: Handle duplicate events gracefully
- Out-of-Order Events: Buffer and reorder events when needed
- Partial Streams: Mark invocations as incomplete on stream failure
Data Consistency
- Transactional Writes: Use database transactions for related data
- Idempotent Processing: Safe to reprocess events
- Referential Integrity: Maintain relationships between records
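Transactional writes and idempotent processing often go together: if each event has a natural key, replaying it inside a transaction becomes a no-op. The sketch below uses SQLite purely for illustration; the table layout, the `(invocation_id, sequence)` key, and the `record_events` helper are assumptions, not BuildBuddy's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " invocation_id TEXT NOT NULL,"
    " sequence INTEGER NOT NULL,"
    " event_type TEXT NOT NULL,"
    " PRIMARY KEY (invocation_id, sequence))"
)


def record_events(conn: sqlite3.Connection, invocation_id: str,
                  events: list[tuple[int, str]]) -> int:
    """Insert events in one transaction; return how many rows were new."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR IGNORE INTO events (invocation_id, sequence, event_type)"
            " VALUES (?, ?, ?)",
            [(invocation_id, seq, etype) for seq, etype in events],
        )
    return conn.total_changes - before


batch = [(1, "BuildStarted"), (2, "Progress")]
first_insert = record_events(conn, "inv-1", batch)
replayed = record_events(conn, "inv-1", batch)  # same events again
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Replaying the batch leaves the table unchanged, which is what makes reprocessing after a reconnect safe in this sketch.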
Error Handling
- Validation Errors: Log and skip malformed events
- Storage Failures: Retry with exponential backoff
- Stream Errors: Return appropriate gRPC status codes
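Retrying storage failures with exponential backoff might look like the following. This is a generic sketch rather than BuildBuddy's retry policy: `TransientStorageError`, `retry_with_backoff`, and the default delays are hypothetical, and random jitter is added on top of the plain exponential schedule so that many failing writers do not retry in lockstep.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientStorageError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def retry_with_backoff(op: Callable[[], T], max_attempts: int = 5,
                       base_delay: float = 0.1, max_delay: float = 5.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op until it succeeds, doubling the delay cap after each failure."""
    for attempt in range(max_attempts):
        try:
            return op()
        except TransientStorageError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # jitter: wait a random time up to cap
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_write() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientStorageError("storage unavailable")
    return "stored"


delays: list[float] = []
result = retry_with_backoff(flaky_write, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example fast to run and makes the delay schedule easy to inspect.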
Performance Optimizations
Batching
- Group database writes for efficiency
- Batch artifact uploads
- Aggregate metrics computation
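Grouping writes can be sketched with a small accumulator that flushes when it reaches a size threshold. The `BatchWriter` class, its `flush_fn` callback, and the threshold value are hypothetical; a real writer would likely also flush on a timer so that rows do not wait indefinitely.

```python
from typing import Any, Callable


class BatchWriter:
    """Hypothetical writer that groups rows before handing them to storage."""

    def __init__(self, flush_fn: Callable[[list[Any]], None],
                 max_batch: int = 100) -> None:
        self._flush_fn = flush_fn
        self._max_batch = max_batch
        self._rows: list[Any] = []

    def add(self, row: Any) -> None:
        self._rows.append(row)
        if len(self._rows) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Write out whatever is buffered, if anything."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._flush_fn(batch)


written: list[list[int]] = []
writer = BatchWriter(written.append, max_batch=3)
for row in range(7):
    writer.add(row)
writer.flush()  # flush the remainder at end of stream
```

Seven rows result in three storage calls instead of seven in this example.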
Parallelization
- Process independent events concurrently
- Parallel database inserts for different tables
- Asynchronous artifact storage
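One way to process independent events concurrently while keeping related events in order is to partition them by a key and run each partition on a worker. The sketch below is an assumption about how that could be structured: the `key` field (for example a target label), `process_group`, and `process_concurrently` are hypothetical names.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def process_group(events: list[dict]) -> list[str]:
    """Handle the events of one key sequentially, in arrival order."""
    return [event["type"] for event in events]


def process_concurrently(events: list[dict], max_workers: int = 4) -> dict:
    """Partition events by key and process the partitions in parallel."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for event in events:
        groups[event["key"]].append(event)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as its input.
        results = pool.map(process_group, groups.values())
        return dict(zip(groups.keys(), results))


events = [
    {"type": "TargetConfigured", "key": "//app:server"},
    {"type": "TargetConfigured", "key": "//lib:util"},
    {"type": "TargetComplete", "key": "//app:server"},
]
results = process_concurrently(events)
```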
Buffering
- Event buffers reduce database write frequency
- Stream buffers handle network latency
- Write-through caches for frequently accessed data
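A write-through cache updates the backing store on every write and keeps a copy in memory for later reads. The sketch below is a minimal illustration with no eviction or size limit; the `WriteThroughCache` class is hypothetical and a plain dictionary stands in for the database.

```python
from typing import Any


class WriteThroughCache:
    """Hypothetical cache that writes to the backing store on every put."""

    def __init__(self, backing: dict) -> None:
        self._backing = backing          # stand-in for the database
        self._cache: dict[str, Any] = {}
        self.backing_reads = 0           # counts reads that missed the cache

    def put(self, key: str, value: Any) -> None:
        self._backing[key] = value       # write through to storage first
        self._cache[key] = value         # then keep a copy for fast reads

    def get(self, key: str) -> Any:
        if key in self._cache:
            return self._cache[key]
        self.backing_reads += 1
        value = self._backing[key]
        self._cache[key] = value
        return value


database = {"inv-0": "COMPLETE"}
cache = WriteThroughCache(database)
cache.put("inv-1", "IN_PROGRESS")
status = cache.get("inv-1")   # served from the cache
older = cache.get("inv-0")    # first read falls back to the backing store
older_again = cache.get("inv-0")
```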
Configuration
Key settings for build event ingestion:
Monitoring
Key metrics to monitor:
- Event ingestion rate (events/second)
- Stream duration (time to process stream)
- Database write latency
- Artifact upload duration
- Stream failure rate
- Event processing errors