Overview
The Candidate Pipeline is a composable, trait-based framework for building recommendation systems in Rust. It provides a structured approach to fetching, enriching, filtering, scoring, and selecting candidates through a series of well-defined stages.
Core Trait
The CandidatePipeline trait is the main abstraction that defines the entire pipeline execution flow:
candidate-pipeline/candidate_pipeline.rs
Pipeline Stages
The pipeline executes in the following order:
Query Hydration
All query hydrators run in parallel to enrich the query with additional data (user features, experiment flags, etc.)
Candidate Fetching
All sources run in parallel to retrieve candidate posts from different systems (Thunder, Phoenix, etc.)
Candidate Hydration
All hydrators run in parallel to enrich candidates with additional features (author info, video duration, etc.)
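The parallel fan-out described for these stages can be illustrated without an async runtime. The sketch below is a simplified, standard-library-only analogue that uses scoped threads rather than the framework's own concurrency mechanism. The `Candidate` type, the `Source` signature, and the stub sources are all hypothetical and are not the framework's API.

```rust
use std::thread;

/// Hypothetical candidate type for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    post_id: u64,
}

/// Hypothetical source signature: takes the query, returns candidates or an error.
type Source = fn(&str) -> Result<Vec<Candidate>, String>;

/// Stand-in sources; real sources would call out to retrieval systems.
fn thunder_stub(_query: &str) -> Result<Vec<Candidate>, String> {
    Ok(vec![Candidate { post_id: 1 }, Candidate { post_id: 2 }])
}

fn phoenix_stub(_query: &str) -> Result<Vec<Candidate>, String> {
    Ok(vec![Candidate { post_id: 3 }])
}

fn failing_stub(_query: &str) -> Result<Vec<Candidate>, String> {
    Err("upstream timeout".to_string())
}

/// Run every source concurrently, then merge the successful results in
/// source order. A failed source is logged and skipped.
fn fetch_all(query: &str, sources: &[Source]) -> Vec<Candidate> {
    let results: Vec<Result<Vec<Candidate>, String>> = thread::scope(|scope| {
        // Spawn one scoped thread per source.
        let handles: Vec<_> = sources
            .iter()
            .map(|source| scope.spawn(move || source(query)))
            .collect();
        // Join in spawn order; treat a panic like any other source error.
        handles
            .into_iter()
            .map(|h| h.join().unwrap_or_else(|_| Err("source panicked".to_string())))
            .collect()
    });

    let mut merged = Vec::new();
    for result in results {
        match result {
            Ok(candidates) => merged.extend(candidates),
            Err(err) => eprintln!("source failed: {err}"),
        }
    }
    merged
}
```

Calling `fetch_all` with the three stubs returns the candidates from the two working sources, while the failing one only produces a log line.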
Pipeline Stages Enum
candidate-pipeline/candidate_pipeline.rs
Error Handling
Each stage logs component failures and continues rather than aborting the request:
Query Hydrators
If a query hydrator fails, the error is logged and the pipeline continues with the original query data.
Sources
If a source fails, the error is logged and the pipeline continues with candidates from other sources.
Hydrators
If a hydrator fails or returns a result whose length does not match the candidate list, the error is logged and candidates remain unmodified.
Filters
If a filter fails, the error is logged and the original candidate list is restored (no candidates are filtered).
Scorers
If a scorer fails or returns a result whose length does not match the candidate list, the error is logged and candidates remain unscored.
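The fallback rules above can be made concrete with a small synchronous sketch. It shows the length check applied to hydrator output and the restore-on-failure behaviour for filters. All types and function names here are hypothetical. It also assumes that "original candidate list" means the list as it stood before the failing filter ran.

```rust
/// Hypothetical candidate with one optional hydrated feature.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    post_id: u64,
    author: Option<String>,
}

/// Accept hydrator output only if it succeeded and lines up one-to-one with
/// the input; otherwise log and return the candidates unmodified.
fn apply_hydration(
    candidates: Vec<Candidate>,
    hydrated: Result<Vec<Candidate>, String>,
) -> Vec<Candidate> {
    match hydrated {
        Ok(h) if h.len() == candidates.len() => h,
        Ok(h) => {
            eprintln!(
                "hydrator returned {} items, expected {}",
                h.len(),
                candidates.len()
            );
            candidates
        }
        Err(err) => {
            eprintln!("hydrator failed: {err}");
            candidates
        }
    }
}

/// Hypothetical filter signature: returns the candidates to keep.
type Filter = fn(&[Candidate]) -> Result<Vec<Candidate>, String>;

/// Stand-in filter that keeps only candidates with a known author.
fn keep_with_author(candidates: &[Candidate]) -> Result<Vec<Candidate>, String> {
    Ok(candidates.iter().filter(|c| c.author.is_some()).cloned().collect())
}

/// Stand-in filter that always fails.
fn failing_filter(_candidates: &[Candidate]) -> Result<Vec<Candidate>, String> {
    Err("filter backend unavailable".to_string())
}

/// Run filters one at a time. A failed filter is logged and leaves the list
/// exactly as it was before that filter ran.
fn run_filters(mut candidates: Vec<Candidate>, filters: &[Filter]) -> Vec<Candidate> {
    for filter in filters {
        match filter(&candidates) {
            Ok(kept) => candidates = kept,
            Err(err) => eprintln!("filter failed: {err}"),
        }
    }
    candidates
}
```

A scorer could be guarded with the same length check as `apply_hydration`, leaving candidates unscored on a mismatch.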
Execution Flow
The main execution method orchestrates the entire pipeline:
candidate-pipeline/candidate_pipeline.rs
Pipeline Result
The pipeline returns a PipelineResult containing:
candidate-pipeline/candidate_pipeline.rs
All candidates after hydration, before filtering
Candidates that were filtered out during the pipeline
Final ranked candidates returned to the user
The hydrated query object
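For orientation, the four items above could map onto a generic struct along these lines. This is only a sketch: the field names, the use of `Arc` for the query, and the `counts` helper are assumptions made for illustration and are not taken from the framework's listing.

```rust
use std::sync::Arc;

/// Sketch of the result shape. Field names are assumptions.
#[derive(Debug)]
struct PipelineResult<Q, C> {
    /// All candidates after hydration, before filtering.
    retrieved_candidates: Vec<C>,
    /// Candidates that were filtered out during the pipeline.
    filtered_candidates: Vec<C>,
    /// Final ranked candidates returned to the user.
    selected_candidates: Vec<C>,
    /// The hydrated query object (shared ownership is an assumption).
    query: Arc<Q>,
}

impl<Q, C> PipelineResult<Q, C> {
    /// Hypothetical helper: sizes of each bucket as
    /// (retrieved, filtered out, selected), e.g. for a summary log line.
    fn counts(&self) -> (usize, usize, usize) {
        (
            self.retrieved_candidates.len(),
            self.filtered_candidates.len(),
            self.selected_candidates.len(),
        )
    }
}
```

Keeping the filtered-out candidates alongside the selected ones makes it possible to inspect after the fact why a given post did not reach the user.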
HasRequestId Trait
All query types must implement the HasRequestId trait for logging and tracing:
candidate-pipeline/candidate_pipeline.rs
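As a rough illustration of how such a trait supports logging, the sketch below assumes a single accessor method returning the ID as a string slice. The method name `request_id`, the `FeedQuery` type, and the `log_line` helper are hypothetical and may differ from the framework's definition.

```rust
/// Sketch of the trait. The method name and return type are assumptions.
trait HasRequestId {
    fn request_id(&self) -> &str;
}

/// Hypothetical query type.
struct FeedQuery {
    request_id: String,
}

impl HasRequestId for FeedQuery {
    fn request_id(&self) -> &str {
        &self.request_id
    }
}

/// Hypothetical helper: builds a log line tagged with the request ID, usable
/// with any query type that implements the trait.
fn log_line<Q: HasRequestId>(query: &Q, stage: &str, message: &str) -> String {
    format!("request_id={} stage={} {}", query.request_id(), stage, message)
}
```

Because the helper is generic over `HasRequestId`, every stage can tag its log output the same way regardless of the concrete query type.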
Parallel vs Sequential Execution
Parallel stages (query hydrators, sources, hydrators) use join_all to run all components concurrently for maximum performance.
Sequential stages (filters, scorers) run components one at a time, allowing each component to see the results of the previous one.
Related Components
Sources
Learn about candidate sources (Thunder, Phoenix)
Hydrators
Enrich candidates and queries with features
Filters
Remove unwanted candidates from the pipeline
Scorers
Compute prediction scores for ranking