Overview

Scoring is the most critical stage of the pipeline. The Phoenix scorer uses a Grok-based transformer model to predict engagement probabilities for each candidate, and these predictions are combined using a weighted formula to produce the final relevance score.
Unlike traditional recommendation systems that rely on hand-engineered features, the Phoenix model learns relevance entirely from user engagement sequences; no manual feature engineering is required.

Scoring Pipeline

Four scorers apply sequentially to transform raw candidates into ranked results:
Candidates
           │
           ▼
┌─────────────────────┐
│  Phoenix Scorer     │  ML predictions: P(like), P(reply), etc.
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Weighted Scorer    │  Combine predictions: Σ(weight × P(action))
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Author Diversity   │  Attenuate repeated authors
│  Scorer             │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  OON Scorer         │  Adjust out-of-network scores
└──────────┬──────────┘
           │
           ▼
    Final Scores
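The sequential chain above can be sketched as a list of stages that each transform the candidate set in place. This is an illustrative sketch only; the trait and type names are hypothetical, not the actual home-mixer API.

```rust
// Hypothetical sketch of the sequential scorer chain shown above.
struct Candidate {
    score: f64,
}

trait Scorer {
    fn apply(&self, candidates: &mut [Candidate]);
}

// Stand-in stage that scales every score by a fixed factor.
struct ScaleScorer(f64);

impl Scorer for ScaleScorer {
    fn apply(&self, candidates: &mut [Candidate]) {
        for c in candidates.iter_mut() {
            c.score *= self.0;
        }
    }
}

// Each scorer consumes the output of the previous one.
fn run_pipeline(scorers: &[Box<dyn Scorer>], candidates: &mut [Candidate]) {
    for s in scorers {
        s.apply(candidates);
    }
}
```

Because each stage only reads the scores left by the previous stage, new scorers can be appended to the chain without touching earlier ones.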

Phoenix Scorer

Model Architecture

The Phoenix scorer uses a Grok-based transformer with candidate isolation: candidates cannot attend to each other during inference. This ensures scores are consistent regardless of which other posts are in the batch.

Phoenix Model Details

See the Phoenix Ranking page for complete transformer architecture and attention masking details.

Input Format

The model takes:
  • User Context: User embedding + engagement history sequence
  • Candidate Posts: Post embeddings (text hashes, author IDs, metadata)
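A minimal sketch of what such a request might look like; the field names are hypothetical, not the production schema.

```rust
// Hypothetical request shape mirroring the two inputs listed above.
struct ScoringRequest {
    user_embedding: Vec<f32>,      // learned user representation
    engagement_history: Vec<u64>,  // IDs of recently engaged posts, in sequence order
    candidates: Vec<CandidatePost>,
}

struct CandidatePost {
    text_hash: u64, // hashed post text
    author_id: u64,
}
```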

Output: Multi-Action Predictions

The model predicts probabilities for 18 engagement actions (14 positive, 4 negative), plus a continuous dwell-time estimate.
These actions indicate the user finds content valuable:
  • favorite_score - Probability user will like the post
  • reply_score - Probability user will reply
  • retweet_score - Probability user will repost
  • quote_score - Probability user will quote repost
  • quoted_click_score - Probability user will click into the quoted post
  • click_score - Probability user will click into post detail
  • profile_click_score - Probability user will click author profile
  • vqv_score - Probability of video quality view (watching video)
  • photo_expand_score - Probability user will expand photo
  • share_score - Probability user will share externally
  • share_via_dm_score - Probability user will DM the post
  • share_via_copy_link_score - Probability user will copy link
  • dwell_score - Probability user will dwell (spend time viewing)
  • follow_author_score - Probability user will follow the author
These actions indicate the user dislikes content:
  • not_interested_score - Probability user will click “Not Interested”
  • block_author_score - Probability user will block the author
  • mute_author_score - Probability user will mute the author
  • report_score - Probability user will report the post
The model also outputs one continuous value:
  • dwell_time - Expected dwell time in seconds

Implementation

The Phoenix scorer extracts predictions from the model response:
home-mixer/scorers/phoenix_scorer.rs
fn extract_phoenix_scores(&self, p: &ActionPredictions) -> PhoenixScores {
    PhoenixScores {
        favorite_score: p.get(ActionName::ServerTweetFav),
        reply_score: p.get(ActionName::ServerTweetReply),
        retweet_score: p.get(ActionName::ServerTweetRetweet),
        photo_expand_score: p.get(ActionName::ClientTweetPhotoExpand),
        click_score: p.get(ActionName::ClientTweetClick),
        profile_click_score: p.get(ActionName::ClientTweetClickProfile),
        vqv_score: p.get(ActionName::ClientTweetVideoQualityView),
        share_score: p.get(ActionName::ClientTweetShare),
        share_via_dm_score: p.get(ActionName::ClientTweetClickSendViaDirectMessage),
        share_via_copy_link_score: p.get(ActionName::ClientTweetShareViaCopyLink),
        dwell_score: p.get(ActionName::ClientTweetRecapDwelled),
        quote_score: p.get(ActionName::ServerTweetQuote),
        quoted_click_score: p.get(ActionName::ClientQuotedTweetClick),
        follow_author_score: p.get(ActionName::ClientTweetFollowAuthor),
        not_interested_score: p.get(ActionName::ClientTweetNotInterestedIn),
        block_author_score: p.get(ActionName::ClientTweetBlockAuthor),
        mute_author_score: p.get(ActionName::ClientTweetMuteAuthor),
        report_score: p.get(ActionName::ClientTweetReport),
        dwell_time: p.get_continuous(ContinuousActionName::DwellTime),
    }
}

Weighted Scorer

Formula

The weighted scorer combines all Phoenix predictions into a single relevance score:
Weighted Score = (w_fav × P(favorite))
               + (w_reply × P(reply))
               + (w_retweet × P(retweet))
               + (w_photo_expand × P(photo_expand))
               + (w_click × P(click))
               + (w_profile_click × P(profile_click))
               + (w_vqv × P(vqv))  [if video duration > threshold]
               + (w_share × P(share))
               + (w_share_dm × P(share_via_dm))
               + (w_share_copy × P(share_via_copy_link))
               + (w_dwell × P(dwell))
               + (w_quote × P(quote))
               + (w_quoted_click × P(quoted_click))
               + (w_dwell_time × dwell_time)
               + (w_follow × P(follow_author))
               + (w_not_interested × P(not_interested))  [negative weight]
               + (w_block × P(block_author))            [negative weight]
               + (w_mute × P(mute_author))              [negative weight]
               + (w_report × P(report))                 [negative weight]
Negative engagement predictions (block, mute, report, not interested) have negative weights, which reduces the score for content users would likely dislike.
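The combination above is just a dot product of weights and predictions. A minimal sketch (the weight values used in the test are hypothetical, not the production configuration):

```rust
// Weighted combination of action predictions. Negative actions carry
// negative weights, pulling the combined score down.
fn weighted_score(weighted_preds: &[(f64, f64)]) -> f64 {
    // Each tuple is (weight, predicted probability).
    weighted_preds.iter().map(|(w, p)| w * p).sum()
}
```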

Weight Configuration

Weights are tuned based on business objectives and user satisfaction metrics. Higher weights are given to:
  • Meaningful engagement: Replies, reposts, follows (indicate high-quality content)
  • Consumption: Dwell time, video views (indicate engaging content)
Lower or negative weights for:
  • Passive signals: Simple clicks (less meaningful than likes)
  • Negative signals: Blocks, mutes, reports (strongly negative)

Video Quality View (VQV) Eligibility

The VQV weight only applies to videos longer than a minimum duration:
home-mixer/scorers/weighted_scorer.rs
fn vqv_weight_eligibility(candidate: &PostCandidate) -> f64 {
    if candidate
        .video_duration_ms
        .is_some_and(|ms| ms > p::MIN_VIDEO_DURATION_MS)
    {
        p::VQV_WEIGHT
    } else {
        0.0
    }
}
This prevents short video loops from getting artificially high scores.

Score Normalization

After computing the weighted sum, the score is normalized using an offset function:
home-mixer/scorers/weighted_scorer.rs
fn offset_score(combined_score: f64) -> f64 {
    if p::WEIGHTS_SUM == 0.0 {
        combined_score.max(0.0)
    } else if combined_score < 0.0 {
        (combined_score + p::NEGATIVE_WEIGHTS_SUM) / p::WEIGHTS_SUM * p::NEGATIVE_SCORES_OFFSET
    } else {
        combined_score + p::NEGATIVE_SCORES_OFFSET
    }
}
This ensures:
  • Negative scores are scaled appropriately
  • All scores are shifted to a consistent range
  • Zero or near-zero scores don’t dominate rankings
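A standalone version of the offset function illustrates the behavior; the constant values here are hypothetical placeholders for the real parameter config.

```rust
// Hypothetical constants standing in for the p:: parameter values.
const WEIGHTS_SUM: f64 = 10.0;
const NEGATIVE_WEIGHTS_SUM: f64 = 2.0;
const NEGATIVE_SCORES_OFFSET: f64 = 0.5;

// Same branching as the snippet above: negative scores are rescaled into
// a small positive band, non-negative scores are shifted up by the offset.
fn offset_score(combined_score: f64) -> f64 {
    if WEIGHTS_SUM == 0.0 {
        combined_score.max(0.0)
    } else if combined_score < 0.0 {
        (combined_score + NEGATIVE_WEIGHTS_SUM) / WEIGHTS_SUM * NEGATIVE_SCORES_OFFSET
    } else {
        combined_score + NEGATIVE_SCORES_OFFSET
    }
}
```

With these constants, a raw score of -1.0 maps to 0.05 and a raw score of 3.0 maps to 3.5, so negative scores end up below all non-negative ones while staying in-range.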

Author Diversity Scorer

Purpose

Ensures the feed doesn’t show too many posts from the same author consecutively, even if that author’s content scores very high.

Algorithm

  1. Sort all candidates by weighted score (descending)
  2. Track how many times each author has appeared
  3. Apply an exponential decay multiplier:
multiplier = (1 - floor) × decay^position + floor
Where:
  • decay - Decay factor (typically 0.5-0.8)
  • position - How many posts from this author have already been seen (0-indexed)
  • floor - Minimum multiplier (prevents complete elimination)

Example

If decay = 0.6 and floor = 0.2:
Post from Author   Position   Multiplier   Effect
First post         0          1.0          Full score
Second post        1          0.68         68% of score
Third post         2          0.488        ~49% of score
Fourth post        3          0.373        ~37% of score

Implementation

home-mixer/scorers/author_diversity_scorer.rs
fn multiplier(&self, position: usize) -> f64 {
    (1.0 - self.floor) * self.decay_factor.powf(position as f64) + self.floor
}

// In scoring loop:
let entry = author_counts.entry(candidate.author_id).or_insert(0);
let position = *entry;
*entry += 1;

let multiplier = self.multiplier(position);
let adjusted_score = candidate.weighted_score.map(|score| score * multiplier);
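As a check, a self-contained version of the multiplier reproduces the example values above (decay = 0.6, floor = 0.2):

```rust
// Exponential decay multiplier from the Author Diversity scorer,
// with decay and floor passed in explicitly for illustration.
fn multiplier(decay: f64, floor: f64, position: usize) -> f64 {
    (1.0 - floor) * decay.powf(position as f64) + floor
}
```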

OON Scorer

Purpose

Adjusts scores for out-of-network content to balance in-network and out-of-network posts in the feed.

Formula

final_score = {
  base_score                       if in_network = true
  base_score × OON_WEIGHT_FACTOR   if in_network = false
}
The OON_WEIGHT_FACTOR (typically 0.75-1.0) controls how much out-of-network content appears:
  • < 1.0: Prioritizes in-network content
  • = 1.0: No preference between in-network and out-of-network
  • > 1.0: Prioritizes out-of-network content (discovery mode)

Implementation

home-mixer/scorers/oon_scorer.rs
let updated_score = c.score.map(|base_score| match c.in_network {
    Some(false) => base_score * p::OON_WEIGHT_FACTOR,
    _ => base_score,
});
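A self-contained sketch of the same adjustment, using a hypothetical factor of 0.75:

```rust
// Out-of-network score adjustment; candidates with unknown network
// status are left untouched, matching the match arms above.
const OON_WEIGHT_FACTOR: f64 = 0.75; // hypothetical value

fn adjust_oon(base_score: f64, in_network: Option<bool>) -> f64 {
    match in_network {
        Some(false) => base_score * OON_WEIGHT_FACTOR,
        _ => base_score,
    }
}
```

Note the conservative default: if `in_network` is unknown (`None`), the score passes through unchanged rather than being penalized.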

Final Score Range

After all scorers apply, final scores typically range:
  • High relevance: 5.0 - 15.0
  • Medium relevance: 1.0 - 5.0
  • Low relevance: 0.1 - 1.0
  • Very low/negative: < 0.1
Scores below 0 indicate content the user would likely dislike (high predicted probability of block/mute/report).

Performance Optimization

Batch Inference

The Phoenix scorer batches all candidates into a single model inference request, reducing the number of inference requests per page load from O(n) to O(1).

Score Caching

Scores are cached with:
  • prediction_request_id - Unique ID for this scoring request
  • last_scored_at_ms - Timestamp for cache invalidation
This enables reuse of scores within the same session without re-running inference.

Candidate Isolation

Because candidates cannot attend to each other in the transformer, scores are deterministic and independent of batch composition. This enables:
  • Parallel scoring of different candidate batches
  • Score caching across requests
  • A/B testing without score contamination

Phoenix Ranking

Deep dive into the Grok-based transformer architecture

Pipeline Stages

Overview of all pipeline stages

Filtering

Pre-scoring and post-selection filters

Phoenix Retrieval

Two-tower model for candidate sourcing