
Overview

The Scorer trait defines how candidates are assigned scores by machine learning models or ranking functions. Scorers run sequentially and populate score fields that are used for candidate selection and ranking.
Scorers MUST maintain the same candidate order and count. They must not drop or reorder candidates; use Filter stages for removal.

Trait Definition

#[async_trait]
pub trait Scorer<Q, C>: Send + Sync
where
    Q: Clone + Send + Sync + 'static,
    C: Clone + Send + Sync + 'static,
{
    fn enable(&self, query: &Q) -> bool;
    async fn score(&self, query: &Q, candidates: &[C]) -> Result<Vec<C>, String>;
    fn update(&self, candidate: &mut C, scored: C);
    fn update_all(&self, candidates: &mut [C], scored: Vec<C>);
    fn name(&self) -> &'static str;
}

Type Parameters

Q
generic
The query type that contains request context and parameters.
Constraints: Clone + Send + Sync + 'static
C
generic
The candidate type that will be scored.
Constraints: Clone + Send + Sync + 'static

Methods

enable

fn
Determines whether this scorer should run for the given query
fn enable(&self, query: &Q) -> bool
query
&Q
Reference to the query object
return
bool
Returns true if this scorer should run, false to skip. Default implementation returns true.
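A scorer can override the default to gate itself on query state. A minimal sync sketch of that pattern; the `enable_phoenix_scoring` field and the stripped-down `ScoredPostsQuery` here are assumptions for illustration, not the real types:

```rust
// Hypothetical, simplified query type used only for this sketch.
struct ScoredPostsQuery {
    enable_phoenix_scoring: bool,
}

struct PhoenixScorer;

impl PhoenixScorer {
    // Mirrors Scorer::enable: return false to skip this scorer entirely.
    fn enable(&self, query: &ScoredPostsQuery) -> bool {
        query.enable_phoenix_scoring
    }
}

fn main() {
    let scorer = PhoenixScorer;
    assert!(scorer.enable(&ScoredPostsQuery { enable_phoenix_scoring: true }));
    assert!(!scorer.enable(&ScoredPostsQuery { enable_phoenix_scoring: false }));
    println!("enable gates the scorer per query");
}
```

When `enable` returns false, the pipeline skips `score` and `update_all` for that scorer, so the candidates pass through unchanged.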

score

async fn
Performs scoring operations and returns candidates with populated score fields
async fn score(&self, query: &Q, candidates: &[C]) -> Result<Vec<C>, String>
query
&Q
Reference to the query object
candidates
&[C]
Slice of candidates to score
return
Result<Vec<C>, String>
Returns a vector with the same length and order as input, with only this scorer’s fields populated. Returns an error message on failure.
The returned vector MUST have the same candidates in the same order as the input. Dropping or reordering candidates is not allowed.

update

fn
Copies score fields from a scored candidate into the target candidate
fn update(&self, candidate: &mut C, scored: C)
candidate
&mut C
Mutable reference to the candidate to update
scored
C
The scored candidate containing new score values
Only copy the fields that this scorer is responsible for. Leave other fields unchanged.
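Field isolation can be sketched with a cut-down candidate type (the real PostCandidate has many more fields; `phoenix_score` and `recency_score` here are placeholders): each scorer's `update` copies only the field it owns, so scores written by other scorers survive.

```rust
// Simplified stand-in for the real candidate type.
#[derive(Clone, Default)]
struct PostCandidate {
    phoenix_score: f64,
    recency_score: f64,
}

struct PhoenixScorer;

impl PhoenixScorer {
    fn update(&self, candidate: &mut PostCandidate, scored: PostCandidate) {
        // Copy only the field this scorer owns; recency_score is untouched.
        candidate.phoenix_score = scored.phoenix_score;
    }
}

fn main() {
    let mut candidate = PostCandidate { phoenix_score: 0.0, recency_score: 0.9 };
    let scored = PostCandidate { phoenix_score: 0.42, ..Default::default() };
    PhoenixScorer.update(&mut candidate, scored);
    assert_eq!(candidate.phoenix_score, 0.42);
    assert_eq!(candidate.recency_score, 0.9); // another scorer's field, preserved
    println!("only phoenix_score changed");
}
```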

update_all

fn
Updates all candidates with their corresponding scored fields
fn update_all(&self, candidates: &mut [C], scored: Vec<C>)
candidates
&mut [C]
Mutable slice of candidates to update
scored
Vec<C>
Vector of scored candidates from the score method
The default implementation iterates and calls update for each pair. Override for custom batching logic.
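The described default behavior can be sketched as a positional zip over the two sequences; this relies on `score` preserving length and order. The simplified `Candidate` type is an assumption for illustration:

```rust
#[derive(Clone, Default)]
struct Candidate {
    score: f64,
    other_field: u32,
}

// Mirrors the described default: pair each candidate with its scored
// counterpart by position and copy the scorer-owned fields.
fn update_all(candidates: &mut [Candidate], scored: Vec<Candidate>) {
    for (candidate, s) in candidates.iter_mut().zip(scored) {
        candidate.score = s.score; // the per-pair `update` step
    }
}

fn main() {
    let mut candidates = vec![
        Candidate { score: 0.0, other_field: 7 },
        Candidate { score: 0.0, other_field: 8 },
    ];
    let scored = vec![
        Candidate { score: 0.5, ..Default::default() },
        Candidate { score: 0.9, ..Default::default() },
    ];
    update_all(&mut candidates, scored);
    assert_eq!(candidates[0].score, 0.5);
    assert_eq!(candidates[1].score, 0.9);
    assert_eq!(candidates[1].other_field, 8); // non-score fields untouched
    println!("update_all merged scores in order");
}
```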

name

fn
Returns a stable name for logging and metrics
fn name(&self) -> &'static str
return
&'static str
A short type name derived from the implementing struct

Example Implementation

Here’s a real example that scores tweets using the Phoenix ML prediction service:
use std::sync::Arc;

use xai_candidate_pipeline::scorer::Scorer;
use tonic::async_trait;

pub struct PhoenixScorer {
    pub phoenix_client: Arc<dyn PhoenixPredictionClient + Send + Sync>,
}

#[async_trait]
impl Scorer<ScoredPostsQuery, PostCandidate> for PhoenixScorer {
    async fn score(
        &self,
        query: &ScoredPostsQuery,
        candidates: &[PostCandidate],
    ) -> Result<Vec<PostCandidate>, String> {
        let user_id = query.user_id as u64;
        let prediction_request_id = generate_request_id();

        let sequence = query.user_action_sequence.as_ref();
        
        // Build prediction request from candidates
        let tweet_infos: Vec<TweetInfo> = candidates
            .iter()
            .map(|c| TweetInfo {
                tweet_id: c.tweet_id as u64,
                author_id: c.author_id,
                ..Default::default()
            })
            .collect();

        // Get predictions from Phoenix ML model
        let response = self
            .phoenix_client
            .predict(user_id, sequence.clone(), tweet_infos)
            .await
            .map_err(|e| e.to_string())?;

        // Build predictions map
        let predictions_map = self.build_predictions_map(&response);

        // Create scored candidates with only the score fields
        let scored_candidates = candidates
            .iter()
            .map(|c| {
                let phoenix_scores = predictions_map
                    .get(&(c.tweet_id as u64))
                    .map(|preds| self.extract_scores(preds))
                    .unwrap_or_default();

                PostCandidate {
                    phoenix_scores,
                    prediction_request_id: Some(prediction_request_id),
                    ..Default::default()
                }
            })
            .collect();

        Ok(scored_candidates)
    }

    fn update(&self, candidate: &mut PostCandidate, scored: PostCandidate) {
        // Only update the fields this scorer is responsible for
        candidate.phoenix_scores = scored.phoenix_scores;
        candidate.prediction_request_id = scored.prediction_request_id;
    }
}

Usage Notes

  • Scorers run sequentially in the order they are configured
  • Each scorer can update different score fields on the candidate
  • Batch scoring operations for efficiency (send all candidates in one request)
  • Only populate score fields relevant to this scorer in the returned candidates
  • Use Default::default() for other fields to avoid overwriting data
  • The pipeline merges results by calling update for each scorer
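The merge loop above can be sketched synchronously (the real pipeline awaits each scorer's async `score`; every type and trait here is a simplified stand-in, not the library's):

```rust
struct Query;

#[derive(Clone, Default)]
struct Candidate {
    a_score: f64,
}

// Cut-down sync version of the Scorer trait for this sketch.
trait Scorer {
    fn enable(&self, _query: &Query) -> bool { true }
    fn score(&self, query: &Query, candidates: &[Candidate]) -> Result<Vec<Candidate>, String>;
    fn update_all(&self, candidates: &mut [Candidate], scored: Vec<Candidate>);
    fn name(&self) -> &'static str;
}

struct ScorerA;
impl Scorer for ScorerA {
    fn score(&self, _q: &Query, candidates: &[Candidate]) -> Result<Vec<Candidate>, String> {
        // Return one scored candidate per input, in the same order.
        Ok(candidates.iter().map(|_| Candidate { a_score: 1.0 }).collect())
    }
    fn update_all(&self, candidates: &mut [Candidate], scored: Vec<Candidate>) {
        for (c, s) in candidates.iter_mut().zip(scored) { c.a_score = s.a_score; }
    }
    fn name(&self) -> &'static str { "ScorerA" }
}

// Run scorers in configured order, merging each result via update_all.
fn run_scorers(
    scorers: &[Box<dyn Scorer>],
    query: &Query,
    candidates: &mut Vec<Candidate>,
) -> Result<(), String> {
    for scorer in scorers {
        if !scorer.enable(query) {
            continue; // skipped scorers leave candidates untouched
        }
        let scored = scorer.score(query, candidates)?;
        if scored.len() != candidates.len() {
            return Err(format!("{} changed candidate count", scorer.name()));
        }
        scorer.update_all(candidates, scored);
    }
    Ok(())
}

fn main() {
    let mut candidates = vec![Candidate::default(), Candidate::default()];
    let scorers: Vec<Box<dyn Scorer>> = vec![Box::new(ScorerA)];
    run_scorers(&scorers, &Query, &mut candidates).unwrap();
    assert_eq!(candidates[0].a_score, 1.0);
    println!("sequential merge ok");
}
```

The length check enforces the order-and-count invariant at the pipeline boundary rather than trusting each scorer implementation.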

Common Scorer Types

Machine Learning Scorers

  • Neural network model predictions
  • Engagement probability scores (likes, retweets, etc.)
  • Multi-task prediction outputs

Rule-Based Scorers

  • Recency scoring
  • Diversity penalties
  • Social graph affinity scores

Composite Scorers

  • Weighted combinations of multiple signals
  • A/B test score variants
  • Context-dependent scoring functions

Best Practices

  1. Maintain Order: Always return candidates in the same order as received
  2. Batch Operations: Send all candidates to the model in a single request
  3. Error Handling: Handle model failures gracefully, consider fallback scores
  4. Field Isolation: Only update fields owned by this scorer in the update method
  5. Logging: Include request IDs for tracing predictions through the system
  6. Performance: Monitor scoring latency; it’s often on the critical path

Performance Considerations

  • Scorers run sequentially and often involve expensive ML inference
  • Minimize the number of scorers on the critical path
  • Consider caching scores for recently-scored candidates
  • Use batch prediction APIs to score all candidates together
  • Monitor P99 latency as scoring can be variable
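One way to cache recently-scored candidates is a map keyed by a stable candidate id; this sketch omits eviction and TTL policy, and the names are illustrative, not the library's:

```rust
use std::collections::HashMap;

// Per-candidate score cache keyed by tweet id.
struct ScoreCache {
    scores: HashMap<u64, f64>,
}

impl ScoreCache {
    fn new() -> Self {
        Self { scores: HashMap::new() }
    }

    // Return a cached score, running `compute` (e.g. model inference)
    // only on a cache miss.
    fn get_or_compute(&mut self, tweet_id: u64, compute: impl FnOnce() -> f64) -> f64 {
        *self.scores.entry(tweet_id).or_insert_with(compute)
    }
}

fn main() {
    let mut cache = ScoreCache::new();
    let mut model_calls = 0;
    for _ in 0..3 {
        cache.get_or_compute(42, || { model_calls += 1; 0.7 });
    }
    assert_eq!(model_calls, 1); // inference ran once despite three lookups
    println!("cache avoided repeated inference");
}
```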

Difference from Hydrator

Aspect               Hydrator                    Scorer
Execution            Parallel                    Sequential
Purpose              Enrich with data            Assign scores
Typical operations   Fetch from services         ML inference, ranking
Performance          Multiple run concurrently   Each adds latency
