Overview

Scoring is the most critical stage of the pipeline. The Phoenix scorer uses a Grok-based transformer model to predict engagement probabilities for each candidate, and these predictions are combined using a weighted formula to produce the final relevance score.
Unlike traditional recommendation systems that rely on hand-engineered features, the Phoenix model learns relevance entirely from user engagement sequences; no manual feature engineering is required.

Scoring Pipeline

Four scorers apply sequentially to transform raw candidates into ranked results:
Candidates
           │
           ▼
┌─────────────────────┐
│  Phoenix Scorer     │  ML predictions: P(like), P(reply), etc.
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Weighted Scorer    │  Combine predictions: Σ(weight × P(action))
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Author Diversity   │  Attenuate repeated authors
│  Scorer             │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  OON Scorer         │  Adjust out-of-network scores
└──────────┬──────────┘
           │
           ▼
    Final Scores
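The sequential chain above can be sketched as a list of stages that each transform the candidate set in place. This is an illustrative sketch only; the trait and type names are hypothetical, not the actual home-mixer API.

```rust
// Hypothetical sketch of the sequential scorer chain shown above.
struct Candidate {
    score: f64,
}

trait Scorer {
    fn apply(&self, candidates: &mut [Candidate]);
}

// Stand-in stage that scales every score by a fixed factor.
struct ScaleScorer(f64);

impl Scorer for ScaleScorer {
    fn apply(&self, candidates: &mut [Candidate]) {
        for c in candidates.iter_mut() {
            c.score *= self.0;
        }
    }
}

// Each scorer consumes the output of the previous one.
fn run_pipeline(scorers: &[Box<dyn Scorer>], candidates: &mut [Candidate]) {
    for s in scorers {
        s.apply(candidates);
    }
}
```

Because each stage only reads the scores left by the previous stage, new scorers can be appended to the chain without touching earlier ones.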

Phoenix Scorer

Model Architecture

The Phoenix scorer uses a Grok-based transformer with candidate isolation: candidates cannot attend to each other during inference. This ensures scores are consistent regardless of which other posts are in the batch.

Phoenix Model Details

See the Phoenix Ranking page for complete transformer architecture and attention masking details.

Input Format

The model takes:
  • User Context: User embedding + engagement history sequence
  • Candidate Posts: Post embeddings (text hashes, author IDs, metadata)
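A minimal sketch of what such a request might look like; the field names are hypothetical, not the production schema.

```rust
// Hypothetical request shape mirroring the two inputs listed above.
struct ScoringRequest {
    user_embedding: Vec<f32>,      // learned user representation
    engagement_history: Vec<u64>,  // IDs of recently engaged posts, in sequence order
    candidates: Vec<CandidatePost>,
}

struct CandidatePost {
    text_hash: u64, // hashed post text
    author_id: u64,
}
```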

Output: Multi-Action Predictions

The model predicts probabilities for 18 engagement actions (14 positive, 4 negative), plus a continuous dwell-time estimate.
These actions indicate the user finds content valuable:
  • favorite_score - Probability user will like the post
  • reply_score - Probability user will reply
  • retweet_score - Probability user will repost
  • quote_score - Probability user will quote repost
  • quoted_click_score - Probability user will click into the quoted post
  • click_score - Probability user will click into post detail
  • profile_click_score - Probability user will click author profile
  • vqv_score - Probability of video quality view (watching video)
  • photo_expand_score - Probability user will expand photo
  • share_score - Probability user will share externally
  • share_via_dm_score - Probability user will DM the post
  • share_via_copy_link_score - Probability user will copy link
  • dwell_score - Probability user will dwell (spend time viewing)
  • follow_author_score - Probability user will follow the author
These actions indicate the user dislikes content:
  • not_interested_score - Probability user will click “Not Interested”
  • block_author_score - Probability user will block the author
  • mute_author_score - Probability user will mute the author
  • report_score - Probability user will report the post
The model also outputs one continuous value:
  • dwell_time - Expected dwell time in seconds

Implementation

The Phoenix scorer extracts predictions from the model response:
home-mixer/scorers/phoenix_scorer.rs
fn extract_phoenix_scores(&self, p: &ActionPredictions) -> PhoenixScores {
    PhoenixScores {
        favorite_score: p.get(ActionName::ServerTweetFav),
        reply_score: p.get(ActionName::ServerTweetReply),
        retweet_score: p.get(ActionName::ServerTweetRetweet),
        photo_expand_score: p.get(ActionName::ClientTweetPhotoExpand),
        click_score: p.get(ActionName::ClientTweetClick),
        profile_click_score: p.get(ActionName::ClientTweetClickProfile),
        vqv_score: p.get(ActionName::ClientTweetVideoQualityView),
        share_score: p.get(ActionName::ClientTweetShare),
        share_via_dm_score: p.get(ActionName::ClientTweetClickSendViaDirectMessage),
        share_via_copy_link_score: p.get(ActionName::ClientTweetShareViaCopyLink),
        dwell_score: p.get(ActionName::ClientTweetRecapDwelled),
        quote_score: p.get(ActionName::ServerTweetQuote),
        quoted_click_score: p.get(ActionName::ClientQuotedTweetClick),
        follow_author_score: p.get(ActionName::ClientTweetFollowAuthor),
        not_interested_score: p.get(ActionName::ClientTweetNotInterestedIn),
        block_author_score: p.get(ActionName::ClientTweetBlockAuthor),
        mute_author_score: p.get(ActionName::ClientTweetMuteAuthor),
        report_score: p.get(ActionName::ClientTweetReport),
        dwell_time: p.get_continuous(ContinuousActionName::DwellTime),
    }
}

Weighted Scorer

Formula

The weighted scorer combines all Phoenix predictions into a single relevance score:
Weighted Score = (w_fav × P(favorite))
               + (w_reply × P(reply))
               + (w_retweet × P(retweet))
               + (w_photo_expand × P(photo_expand))
               + (w_click × P(click))
               + (w_profile_click × P(profile_click))
               + (w_vqv × P(vqv))  [if video duration > threshold]
               + (w_share × P(share))
               + (w_share_dm × P(share_via_dm))
               + (w_share_copy × P(share_via_copy_link))
               + (w_dwell × P(dwell))
               + (w_quote × P(quote))
               + (w_quoted_click × P(quoted_click))
               + (w_dwell_time × dwell_time)
               + (w_follow × P(follow_author))
               + (w_not_interested × P(not_interested))  [negative weight]
               + (w_block × P(block_author))            [negative weight]
               + (w_mute × P(mute_author))              [negative weight]
               + (w_report × P(report))                 [negative weight]
Negative engagement predictions (block, mute, report, not interested) have negative weights, which reduces the score for content users would likely dislike.
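The combination above is just a dot product of weights and predictions. A minimal sketch (the weight values used in the test are hypothetical, not the production configuration):

```rust
// Weighted combination of action predictions. Negative actions carry
// negative weights, pulling the combined score down.
fn weighted_score(weighted_preds: &[(f64, f64)]) -> f64 {
    // Each tuple is (weight, predicted probability).
    weighted_preds.iter().map(|(w, p)| w * p).sum()
}
```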

Weight Configuration

Weights are tuned based on business objectives and user satisfaction metrics. Higher weights are given to:
  • Meaningful engagement: Replies, reposts, follows (indicate high-quality content)
  • Consumption: Dwell time, video views (indicate engaging content)
Lower or negative weights for:
  • Passive signals: Simple clicks (less meaningful than likes)
  • Negative signals: Blocks, mutes, reports (strongly negative)

Video Quality View (VQV) Eligibility

The VQV weight only applies to videos longer than a minimum duration:
home-mixer/scorers/weighted_scorer.rs
fn vqv_weight_eligibility(candidate: &PostCandidate) -> f64 {
    if candidate
        .video_duration_ms
        .is_some_and(|ms| ms > p::MIN_VIDEO_DURATION_MS)
    {
        p::VQV_WEIGHT
    } else {
        0.0
    }
}
This prevents short video loops from getting artificially high scores.

Score Normalization

After computing the weighted sum, the score is normalized using an offset function:
home-mixer/scorers/weighted_scorer.rs
fn offset_score(combined_score: f64) -> f64 {
    if p::WEIGHTS_SUM == 0.0 {
        combined_score.max(0.0)
    } else if combined_score < 0.0 {
        (combined_score + p::NEGATIVE_WEIGHTS_SUM) / p::WEIGHTS_SUM * p::NEGATIVE_SCORES_OFFSET
    } else {
        combined_score + p::NEGATIVE_SCORES_OFFSET
    }
}
This ensures:
  • Negative scores are scaled appropriately
  • All scores are shifted to a consistent range
  • Zero or near-zero scores don’t dominate rankings
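A standalone version of the offset function illustrates the behavior; the constant values here are hypothetical placeholders for the real parameter config.

```rust
// Hypothetical constants standing in for the p:: parameter values.
const WEIGHTS_SUM: f64 = 10.0;
const NEGATIVE_WEIGHTS_SUM: f64 = 2.0;
const NEGATIVE_SCORES_OFFSET: f64 = 0.5;

// Same branching as the snippet above: negative scores are rescaled into
// a small positive band, non-negative scores are shifted up by the offset.
fn offset_score(combined_score: f64) -> f64 {
    if WEIGHTS_SUM == 0.0 {
        combined_score.max(0.0)
    } else if combined_score < 0.0 {
        (combined_score + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM * NEGATIVE_SCORES_OFFSET
    } else {
        combined_score + NEGATIVE_SCORES_OFFSET
    }
}
```

With these constants, a raw score of -1.0 maps to 0.05 and a raw score of 3.0 maps to 3.5, so negative scores end up below all non-negative ones while staying in-range.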

Author Diversity Scorer

Purpose

Ensures the feed doesn’t show too many posts from the same author consecutively, even if that author’s content scores very high.

Algorithm

  1. Sort all candidates by weighted score (descending)
  2. Track how many times each author has appeared
  3. Apply an exponential decay multiplier:
multiplier = (1 - floor) × decay^position + floor
Where:
  • decay - Decay factor (typically 0.5-0.8)
  • position - How many posts from this author have already been seen (0-indexed)
  • floor - Minimum multiplier (prevents complete elimination)

Example

If decay = 0.6 and floor = 0.2:
Post from Author   Position   Multiplier   Effect
First post         0          1.0          Full score
Second post        1          0.68         68% of score
Third post         2          0.488        ~49% of score
Fourth post        3          0.373        ~37% of score

Implementation

home-mixer/scorers/author_diversity_scorer.rs
fn multiplier(&self, position: usize) -> f64 {
    (1.0 - self.floor) * self.decay_factor.powf(position as f64) + self.floor
}

// In scoring loop:
let entry = author_counts.entry(candidate.author_id).or_insert(0);
let position = *entry;
*entry += 1;

let multiplier = self.multiplier(position);
let adjusted_score = candidate.weighted_score.map(|score| score * multiplier);
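As a check, a self-contained version of the multiplier reproduces the example values above (decay = 0.6, floor = 0.2):

```rust
// Exponential decay multiplier from the Author Diversity scorer,
// with decay and floor passed in explicitly for illustration.
fn multiplier(decay: f64, floor: f64, position: usize) -> f64 {
    (1.0 - floor) * decay.powf(position as f64) + floor
}
```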

OON Scorer

Purpose

Adjusts scores for out-of-network content to balance in-network and out-of-network posts in the feed.

Formula

final_score = {
  base_score                       if in_network = true
  base_score × OON_WEIGHT_FACTOR   if in_network = false
}
The OON_WEIGHT_FACTOR (typically 0.75-1.0) controls how much out-of-network content appears:
  • < 1.0: Prioritizes in-network content
  • = 1.0: No preference between in-network and out-of-network
  • > 1.0: Prioritizes out-of-network content (discovery mode)

Implementation

home-mixer/scorers/oon_scorer.rs
let updated_score = c.score.map(|base_score| match c.in_network {
    Some(false) => base_score * p::OON_WEIGHT_FACTOR,
    _ => base_score,
});
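A self-contained sketch of the same adjustment, using a hypothetical factor of 0.75:

```rust
// Out-of-network score adjustment; candidates with unknown network
// status are left untouched, matching the match arms above.
const OON_WEIGHT_FACTOR: f64 = 0.75; // hypothetical value

fn adjust_oon(base_score: f64, in_network: Option<bool>) -> f64 {
    match in_network {
        Some(false) => base_score * OON_WEIGHT_FACTOR,
        _ => base_score,
    }
}
```

Note the conservative default: if `in_network` is unknown (`None`), the score passes through unchanged rather than being penalized.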

Final Score Range

After all scorers apply, final scores typically range:
  • High relevance: 5.0 - 15.0
  • Medium relevance: 1.0 - 5.0
  • Low relevance: 0.1 - 1.0
  • Very low/negative: < 0.1
Scores below 0 indicate content the user would likely dislike (high predicted probability of block/mute/report).

Performance Optimization

Batch Inference

The Phoenix scorer batches all candidates into a single model inference request, reducing the number of inference requests per page load from O(n) to O(1).

Score Caching

Scores are cached with:
  • prediction_request_id - Unique ID for this scoring request
  • last_scored_at_ms - Timestamp for cache invalidation
This enables reuse of scores within the same session without re-running inference.

Candidate Isolation

Because candidates cannot attend to each other in the transformer, scores are deterministic and independent of batch composition. This enables:
  • Parallel scoring of different candidate batches
  • Score caching across requests
  • A/B testing without score contamination

Phoenix Ranking

Deep dive into the Grok-based transformer architecture

Pipeline Stages

Overview of all pipeline stages

Filtering

Pre-scoring and post-selection filters

Phoenix Retrieval

Two-tower model for candidate sourcing