Overview

The Selector trait defines how the final set of candidates is chosen from the scored pool. Selectors typically sort candidates by their scores and truncate to a desired size.

Trait Definition

pub trait Selector<Q, C>: Send + Sync
where
    Q: Clone + Send + Sync + 'static,
    C: Clone + Send + Sync + 'static,
{
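    // Signature summary: `score` is the only required method; the other
    // methods have default implementations, omitted here.
    fn select(&self, query: &Q, candidates: Vec<C>) -> Vec<C>;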
    fn enable(&self, query: &Q) -> bool;
    fn score(&self, candidate: &C) -> f64;
    fn sort(&self, candidates: Vec<C>) -> Vec<C>;
    fn size(&self) -> Option<usize>;
    fn name(&self) -> &'static str;
}

Type Parameters

  • Q (generic): The query type that contains request context and parameters. Constraints: Clone + Send + Sync + 'static
  • C (generic): The candidate type to select from. Constraints: Clone + Send + Sync + 'static

Methods

select

Main selection method that sorts and truncates candidates.
fn select(&self, query: &Q, candidates: Vec<C>) -> Vec<C>
  • query (&Q): Reference to the query object
  • candidates (Vec<C>): Vector of candidates to select from (takes ownership)
  • Returns (Vec<C>): Selected and sorted candidates, truncated to size() if specified
The default implementation calls sort() and then truncates to size() if provided.
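
To make the relationship between the defaults concrete, here is a minimal sketch of how select, sort, and size could compose. MiniSelector and TopTwo are hypothetical names invented for this illustration, the query parameter is omitted for brevity, and the use of f64::total_cmp is an assumption; the crate's real default bodies are not shown on this page and may differ.

```rust
/// Simplified, hypothetical stand-in for the trait on this page
/// (no query parameter, no `enable` or `name`).
trait MiniSelector<C> {
    /// Required: the score used for ranking.
    fn score(&self, candidate: &C) -> f64;

    /// Default: no truncation.
    fn size(&self) -> Option<usize> {
        None
    }

    /// Default: stable sort, highest score first.
    fn sort(&self, mut candidates: Vec<C>) -> Vec<C> {
        // `total_cmp` is a total order on f64, so a NaN score cannot panic.
        candidates.sort_by(|a, b| self.score(b).total_cmp(&self.score(a)));
        candidates
    }

    /// Default: sort, then keep at most `size()` candidates.
    fn select(&self, candidates: Vec<C>) -> Vec<C> {
        let mut sorted = self.sort(candidates);
        if let Some(n) = self.size() {
            // No-op when `n` is larger than the current length.
            sorted.truncate(n);
        }
        sorted
    }
}

/// Hypothetical selector that keeps the two highest values.
struct TopTwo;

impl MiniSelector<f64> for TopTwo {
    fn score(&self, candidate: &f64) -> f64 {
        *candidate
    }

    fn size(&self) -> Option<usize> {
        Some(2)
    }
}
```

In this sketch an implementor supplies only score (and optionally size), which matches how the example implementation further down is written.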

enable

Determines whether this selector should run for the given query.
fn enable(&self, query: &Q) -> bool
  • query (&Q): Reference to the query object
  • Returns (bool): true if this selector should run, false to skip. The default implementation returns true.
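
As an illustration of gating on the query, the sketch below enables a selector only for requests in an experiment arm. FeedQuery, its in_diversity_experiment field, ExperimentSelector, and run_if_enabled are all hypothetical, and passing candidates through unchanged when the selector is disabled is an assumption about what "skip" means, not confirmed pipeline behavior.

```rust
/// Hypothetical query type; the field is an assumption for this sketch.
struct FeedQuery {
    in_diversity_experiment: bool,
}

/// Hypothetical selector that should only run for the experiment arm.
struct ExperimentSelector;

impl ExperimentSelector {
    /// Same shape as `Selector::enable`: inspect the query, return a flag.
    fn enable(&self, query: &FeedQuery) -> bool {
        query.in_diversity_experiment
    }
}

/// One way a caller might honor the flag: select only when enabled,
/// otherwise return the candidates untouched.
fn run_if_enabled(
    selector: &ExperimentSelector,
    query: &FeedQuery,
    mut candidates: Vec<f64>,
) -> Vec<f64> {
    if selector.enable(query) {
        // Stand-in selection: highest score first, keep the top two.
        candidates.sort_by(|a, b| b.total_cmp(a));
        candidates.truncate(2);
    }
    candidates
}
```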

score

Required. Extracts the score to use for sorting from a candidate.
fn score(&self, candidate: &C) -> f64
  • candidate (&C): Reference to a candidate
  • Returns (f64): The numeric score to use for ranking. Higher scores rank higher.
This method must be implemented by the selector. Use f64::NEG_INFINITY for candidates that should rank lowest.

sort

Sorts candidates by their scores in descending order.
fn sort(&self, candidates: Vec<C>) -> Vec<C>
  • candidates (Vec<C>): Vector of candidates to sort
  • Returns (Vec<C>): Sorted candidates with highest scores first
The default implementation sorts by calling score() for each candidate and comparing in descending order.

size

Optionally specifies how many candidates to select.
fn size(&self) -> Option<usize>
  • Returns (Option<usize>): The number of top candidates to select, or None for no truncation. The default implementation returns None.

name

Returns a stable name for logging and metrics.
fn name(&self) -> &'static str
  • Returns (&'static str): A short type name derived from the implementing struct
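
One plausible way to derive such a short name is to take the last path segment of std::any::type_name. The helper below, short_type_name, is a hypothetical function written for this sketch; the page does not say how the crate actually computes the name.

```rust
use std::any::type_name;

/// Returns the last `::`-separated segment of a type's name, for example
/// "TopKScoreSelector" for `my_crate::selectors::TopKScoreSelector`.
/// The exact string produced by `type_name` is not guaranteed by the
/// standard library, so treat this as best-effort.
fn short_type_name<T>() -> &'static str {
    let full = type_name::<T>();
    // `rsplit` always yields at least one item; the fallback is defensive.
    full.rsplit("::").next().unwrap_or(full)
}

/// Local unit struct used only to demonstrate the helper.
struct TopKScoreSelector;
```

Calling short_type_name::<TopKScoreSelector>() would yield the bare struct name with the toolchains this was written against. Generic types are not handled well by this approach, since the split also cuts through paths inside the angle brackets.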

Example Implementation

Here’s a real example that selects the top K candidates by score:
use xai_candidate_pipeline::selector::Selector;

pub struct TopKScoreSelector;

impl Selector<ScoredPostsQuery, PostCandidate> for TopKScoreSelector {
    fn score(&self, candidate: &PostCandidate) -> f64 {
        // Use the candidate's final score, or negative infinity if missing
        candidate.score.unwrap_or(f64::NEG_INFINITY)
    }
    
    fn size(&self) -> Option<usize> {
        // Return top 100 candidates
        Some(100)
    }
}
With the default implementations, this selector will:
  1. Sort all candidates by their score field in descending order
  2. Truncate to the top 100 candidates
  3. Return the selected candidates

Advanced Example

Here’s a more complex selector that implements custom selection logic:
use std::collections::HashMap;

pub struct DiversitySelector {
    pub target_size: usize,
    pub max_per_author: usize,
}

impl Selector<ScoredPostsQuery, PostCandidate> for DiversitySelector {
    fn select(&self, _query: &ScoredPostsQuery, candidates: Vec<PostCandidate>) -> Vec<PostCandidate> {
        // Highest-scoring candidates first
        let sorted = self.sort(candidates);
        let mut selected = Vec::new();
        // Number of candidates already selected per author
        let mut author_counts: HashMap<_, usize> = HashMap::new();

        for candidate in sorted {
            // Stop scanning once the target size is reached
            if selected.len() >= self.target_size {
                break;
            }

            // Enforce the per-author cap (assumes author_id is a Copy, hashable ID)
            let author_count = author_counts.entry(candidate.author_id).or_insert(0);
            if *author_count < self.max_per_author {
                *author_count += 1;
                selected.push(candidate);
            }
        }

        selected
    }
    
    fn score(&self, candidate: &PostCandidate) -> f64 {
        candidate.score.unwrap_or(f64::NEG_INFINITY)
    }
    
    fn size(&self) -> Option<usize> {
        Some(self.target_size)
    }
}

Usage Notes

  • Selectors typically run once at the end of the pipeline
  • The select method can be overridden for complex selection logic
  • The default select implementation handles most simple cases (sort + truncate)
  • The sort method can be overridden for custom sorting algorithms
  • Use enable() to conditionally apply different selectors based on query parameters

Common Selector Types

Score-Based Selectors

  • Top-K by final score
  • Top-K by specific score field (e.g., engagement score)
  • Threshold-based selection
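
As a rough illustration of the threshold-based variant, the sketch below keeps only candidates at or above a minimum score. Candidate and select_above_threshold are hypothetical names for this example and are not part of the crate; in a real selector this logic would live in an overridden select().

```rust
/// Hypothetical candidate type for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    id: u64,
    score: Option<f64>,
}

/// Keeps candidates whose score is at least `min_score`, highest first.
/// A missing score is treated as `f64::NEG_INFINITY`, so any finite
/// threshold drops it; NaN scores fail the `>=` check and are dropped too.
fn select_above_threshold(candidates: Vec<Candidate>, min_score: f64) -> Vec<Candidate> {
    let score = |c: &Candidate| c.score.unwrap_or(f64::NEG_INFINITY);
    let mut kept: Vec<Candidate> = candidates
        .into_iter()
        .filter(|c| score(c) >= min_score)
        .collect();
    // Stable sort, descending by score.
    kept.sort_by(|a, b| score(b).total_cmp(&score(a)));
    kept
}
```

Unlike top-K selection, the output size here depends on the score distribution, so a threshold is often combined with a size cap.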

Diversity Selectors

  • Limit candidates per author
  • Ensure content type diversity
  • Balance different candidate sources

Context-Aware Selectors

  • Time-of-day dependent selection
  • User preference-based selection
  • A/B test variant selection

Best Practices

  1. Score Extraction: Implement score() to extract the appropriate score field
  2. Handle Missing Scores: Use f64::NEG_INFINITY for candidates without scores
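  3. Efficient Sorting: The default sort() is adequate for most candidate pools; override it only when sorting shows up as a measured cost
  4. Overflow Safety: Use Vec::truncate(), which is a no-op when the requested size exceeds the vector's length
  5. Deterministic Ordering: Use a stable sort so equal-scored candidates keep their input order, and add an explicit tie-breaker if the input order itself can vary between runs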
  6. Override Selectively: Override select() only when you need custom logic beyond sort+truncate
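
The deterministic-ordering point can be sketched as a comparator with an explicit tie-breaker. Candidate, its id field, and sort_deterministic are hypothetical names for this example; any unique, stable key would serve as the tie-breaker.

```rust
/// Hypothetical candidate type for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    id: u64,
    score: Option<f64>,
}

/// Sorts by score (descending), breaking ties by ascending `id`, so the
/// result does not depend on the order in which candidates arrived.
fn sort_deterministic(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.sort_by(|a, b| {
        // Missing scores rank lowest.
        let sa = a.score.unwrap_or(f64::NEG_INFINITY);
        let sb = b.score.unwrap_or(f64::NEG_INFINITY);
        sb.total_cmp(&sa).then_with(|| a.id.cmp(&b.id))
    });
    candidates
}
```

A stable sort alone only preserves input order among ties; the tie-breaker matters when upstream stages can deliver candidates in a different order from one run to the next.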

Performance Considerations

  • Sorting is O(n log n) where n is the candidate count
  • Consider using partial sorting (like select_nth_unstable) for large candidate pools
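  • Truncation with Vec::truncate() costs only the drop of the removed elements: effectively O(1) for types without destructors, O(k) for k removed elements otherwise
  • The selector is rarely the bottleneck, since it typically runs on an already-reduced candidate set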
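
The partial-sorting idea can be sketched with the standard library's slice::select_nth_unstable_by. The top_k function below is a hypothetical helper, and plain f64 scores stand in for candidates to keep the example short.

```rust
/// Returns the `k` highest scores in descending order without fully
/// sorting the input.
fn top_k(mut scores: Vec<f64>, k: usize) -> Vec<f64> {
    if k == 0 {
        return Vec::new();
    }
    if k < scores.len() {
        // Partition so the k largest values occupy indices 0..k, in
        // arbitrary order. This takes O(n) time on average, and the
        // call panics on an out-of-bounds index, hence the guards.
        scores.select_nth_unstable_by(k - 1, |a, b| b.total_cmp(a));
        scores.truncate(k);
    }
    // Sort only the survivors: O(k log k).
    scores.sort_unstable_by(|a, b| b.total_cmp(a));
    scores
}
```

The trade-off is that both calls are unstable, so candidates with equal scores may come out in any order; a tie-breaking key in the comparator restores determinism if that matters.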
