
Architectural Overview

The X For You Feed Algorithm is built on a microservices architecture with clear separation of concerns. Each component has a specific responsibility and communicates through well-defined interfaces.
The system emphasizes composability, parallelism, and graceful degradation to achieve both performance and reliability.

Component Diagram

┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│                                    CLIENT APPLICATION                                       │
│                                    (Mobile/Web App)                                         │
└────────────────────────────────────────┬────────────────────────────────────────────────────┘
                                         │ gRPC Request
                                         │ ScoredPostsQuery

┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│                                      HOME MIXER                                             │
│                                    [Rust + gRPC Server]                                     │
│                                                                                             │
│  ┌──────────────────────────────────────────────────────────────────────────────────────┐  │
│  │                           Phoenix Candidate Pipeline                                 │  │
│  │                         (Implements CandidatePipeline Trait)                         │  │
│  └──────────────────────────────────────────────────────────────────────────────────────┘  │
│                                         │                                                   │
│                    ┌────────────────────┼────────────────────┐                              │
│                    ▼                    ▼                    ▼                              │
│            ┌──────────────┐    ┌──────────────┐    ┌──────────────┐                        │
│            │   Sources    │    │  Hydrators   │    │   Filters    │                        │
│            ├──────────────┤    ├──────────────┤    ├──────────────┤                        │
│            │ Thunder      │    │ CoreData     │    │ Age          │                        │
│            │ Phoenix      │    │ Gizmoduck    │    │ Duplicates   │                        │
│            │              │    │ Video        │    │ Socialgraph  │                        │
│            │              │    │ Subscription │    │ Keywords     │                        │
│            └──────┬───────┘    └──────┬───────┘    └──────┬───────┘                        │
│                   │                   │                   │                                 │
│                   └───────────────────┴───────────────────┘                                 │
│                                       │                                                     │
│                                       ▼                                                     │
│                           ┌──────────────────────┐                                          │
│                           │      Scorers         │                                          │
│                           ├──────────────────────┤                                          │
│                           │ Phoenix Scorer       │                                          │
│                           │ Weighted Scorer      │                                          │
│                           │ Diversity Scorer     │                                          │
│                           └──────────┬───────────┘                                          │
│                                      │                                                      │
│                                      ▼                                                      │
│                           ┌──────────────────────┐                                          │
│                           │      Selector        │                                          │
│                           │  (Sort by Score)     │                                          │
│                           └──────────┬───────────┘                                          │
└───────────────────────────────────────┼──────────────────────────────────────────────────────┘
                                        │ ScoredPostsResponse

                             ┌─────────────────────┐
                             │  Ranked Feed        │
                             │  [Top 50 Posts]     │
                             └─────────────────────┘


┌────────────────────────────┐     ┌────────────────────────────┐     ┌────────────────────┐
│         THUNDER            │     │         PHOENIX            │     │   DATA STORES      │
│    [Rust Service]          │     │    [Python/JAX ML]         │     │                    │
│                            │     │                            │     │  • User Graphs     │
│  ┌──────────────────────┐  │     │  ┌──────────────────────┐  │     │  • Post Content    │
│  │  In-Memory Stores    │  │     │  │  Retrieval Model     │  │     │  • Media Assets    │
│  │  - Per-user posts    │  │     │  │  (Two-Tower)         │  │     │  • Engagement DB   │
│  │  - Indexed by author │  │     │  └──────────────────────┘  │     │                    │
│  └──────────────────────┘  │     │                            │     └────────────────────┘
│           ▲                │     │  ┌──────────────────────┐  │
│           │                │     │  │  Ranking Model       │  │
│  ┌────────┴─────────────┐  │     │  │  (Grok Transformer)  │  │
│  │  Kafka Consumer      │  │     │  └──────────────────────┘  │
│  │  - Post events       │  │     │                            │
│  │  - Delete events     │  │     └────────────────────────────┘
│  └──────────────────────┘  │
│           ▲                │
└───────────┼────────────────┘

   ┌────────┴─────────┐
   │  KAFKA TOPICS    │
   │  - post.create   │
   │  - post.delete   │
   └──────────────────┘

Core Design Patterns

1. Candidate Pipeline Pattern

The Candidate Pipeline is a composable framework that defines a standard way to build recommendation systems.
The framework is built on Rust traits that define behavior contracts:
// From candidate_pipeline.rs

#[async_trait]
pub trait Source<Q, C>: Send + Sync {
    fn name(&self) -> &str;
    fn enable(&self, query: &Q) -> bool;
    async fn get_candidates(&self, query: &Q) -> Result<Vec<C>>;
}

#[async_trait]
pub trait Hydrator<Q, C, H>: Send + Sync {
    fn name(&self) -> &str;
    fn enable(&self, query: &Q) -> bool;
    async fn hydrate(&self, query: &Q, candidates: &[C]) -> Result<Vec<H>>;
    fn update_all(&self, candidates: &mut [C], hydrated: Vec<H>);
}

#[async_trait]
pub trait Filter<Q, C>: Send + Sync {
    fn name(&self) -> &str;
    fn enable(&self, query: &Q) -> bool;
    async fn filter(&self, query: &Q, candidates: Vec<C>) 
        -> Result<FilterResult<C>>;
}

#[async_trait]
pub trait Scorer<Q, C, S>: Send + Sync {
    fn name(&self) -> &str;
    fn enable(&self, query: &Q) -> bool;
    async fn score(&self, query: &Q, candidates: &[C]) -> Result<Vec<S>>;
    fn update_all(&self, candidates: &mut [C], scores: Vec<S>);
}

2. Parallelism Strategy

The system maximizes throughput by parallelizing independent operations:
1. Query Hydrators: Parallel

Multiple hydrators fetch different aspects of user context simultaneously:
let hydrate_futures = hydrators.iter().map(|h| h.hydrate(&query));
let results = join_all(hydrate_futures).await; // All run concurrently

2. Sources: Parallel

Thunder and Phoenix retrieval run simultaneously:
let source_futures = sources.iter().map(|s| s.get_candidates(query));
let results = join_all(source_futures).await; // Thunder + Phoenix in parallel

3. Candidate Hydrators: Parallel

All candidate hydrators enrich data concurrently:
let hydrate_futures = hydrators.iter().map(|h| h.hydrate(query, &candidates));
let results = join_all(hydrate_futures).await; // All hydrators in parallel

4. Filters: Sequential

Filters must run in order, since each depends on the previous filter’s output.

5. Scorers: Sequential

Scorers run in order, since later scorers may modify scores from earlier ones.

Why Sequential for Filters/Scorers? Each filter removes candidates, affecting the input to the next filter. Each scorer may adjust scores computed by previous scorers. These dependencies require sequential execution.
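The stage ordering above can be sketched end to end in Python (a minimal sketch of the control flow only; the real framework is Rust, and the stage callables here are illustrative stand-ins for the trait objects):

```python
import asyncio

# Sketch of the pipeline's parallel/sequential stage ordering.
# Sources and hydrators run concurrently; filters and scorers run in order.

async def run_pipeline(query, sources, hydrators, filters, scorers):
    # Sources: parallel -- independent retrieval backends
    results = await asyncio.gather(*(s(query) for s in sources))
    candidates = [c for r in results for c in r]

    # Hydrators: parallel -- each enriches a different aspect of the candidates
    enrichments = await asyncio.gather(*(h(query, candidates) for h in hydrators))
    for enrichment in enrichments:
        for cand, extra in zip(candidates, enrichment):
            cand.update(extra)

    # Filters: sequential -- each consumes the previous filter's output
    for f in filters:
        candidates = [c for c in candidates if f(query, c)]

    # Scorers: sequential -- later scorers may adjust earlier scores
    for s in scorers:
        for c in candidates:
            c["score"] = s(query, c)

    # Selector: sort by final score
    return sorted(candidates, key=lambda c: c["score"], reverse=True)
```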

3. Candidate Isolation in Phoenix

One of the most important architectural decisions is candidate isolation during ranking.
Problem: If candidates could attend to each other during transformer inference, the score for Post A would depend on which other posts happen to be in the batch. This makes scores inconsistent and uncacheable.
Solution: Custom Attention Masking
# From recsys_model.py - attention mask creation (simplified)

import numpy as np

Array = np.ndarray

def create_candidate_isolation_mask(
    batch_size: int,
    history_len: int,
    num_candidates: int,
) -> Array:
    """
    Creates attention mask where:
    - User + History can attend to each other (bidirectional)
    - Candidates can attend to User + History
    - Candidates CANNOT attend to other candidates (only self)
    """
    seq_len = 1 + history_len + num_candidates  # user + history + candidates
    mask = np.ones((seq_len, seq_len), dtype=np.float32)

    candidate_start = 1 + history_len

    # Block user + history from attending to candidates, so the user
    # representation is independent of which candidates are in the batch
    mask[:candidate_start, candidate_start:] = 0.0

    # Block candidates from attending to other candidates
    for i in range(num_candidates):
        for j in range(num_candidates):
            if i != j:  # Allow self-attention
                pos_i = candidate_start + i
                pos_j = candidate_start + j
                mask[pos_i, pos_j] = 0.0  # Block attention

    return mask
Attention Pattern:
        Keys (attend TO) →
        ┌─────┬─────────────┬──────────────────────┐
        │ U   │  History    │  Candidates (C1-C4)  │
   ┌────┼─────┼─────────────┼──────────────────────┤
   │ U  │  ✓  │  ✓  ✓  ✓    │  ✗   ✗   ✗   ✗       │
 Q  ├────┼─────┼─────────────┼──────────────────────┤
 u  │ H1 │  ✓  │  ✓  ✓  ✓    │  ✗   ✗   ✗   ✗       │
 e  │ H2 │  ✓  │  ✓  ✓  ✓    │  ✗   ✗   ✗   ✗       │
 r  │ H3 │  ✓  │  ✓  ✓  ✓    │  ✗   ✗   ✗   ✗       │
 i  ├────┼─────┼─────────────┼──────────────────────┤
 e  │ C1 │  ✓  │  ✓  ✓  ✓    │  ✓   ✗   ✗   ✗       │ ← Can only self-attend
 s  │ C2 │  ✓  │  ✓  ✓  ✓    │  ✗   ✓   ✗   ✗       │
 ↓  │ C3 │  ✓  │  ✓  ✓  ✓    │  ✗   ✗   ✓   ✗       │
    │ C4 │  ✓  │  ✓  ✓  ✓    │  ✗   ✗   ✗   ✓       │
    └────┴─────┴─────────────┴──────────────────────┘
This isolation means we can pre-compute and cache candidate scores for specific user contexts, dramatically improving inference efficiency.
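The pattern in the table can be checked directly. The self-contained sketch below rebuilds the mask for 1 user token, 3 history tokens, and 4 candidates, and asserts the isolation property:

```python
import numpy as np

# Standalone check of the candidate-isolation pattern shown above:
# 1 user token, 3 history tokens, 4 candidate tokens.
history_len, num_candidates = 3, 4
cand_start = 1 + history_len
seq_len = cand_start + num_candidates

mask = np.ones((seq_len, seq_len), dtype=np.float32)
mask[:cand_start, cand_start:] = 0.0                     # user/history don't see candidates
mask[cand_start:, cand_start:] = np.eye(num_candidates)  # candidates see only themselves

for i in range(num_candidates):
    row = mask[cand_start + i]
    assert row[:cand_start].all()         # candidate attends to user + history
    assert row[cand_start:].sum() == 1.0  # and to itself only among candidates
```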

4. Hash-Based Embeddings

Both retrieval and ranking models use hash-based embeddings instead of traditional vocabulary lookup tables.
# From recsys_model.py - hash embedding (simplified)

import numpy as np

# hash_function, embedding_table, and HASH_TABLE_SIZE are defined
# elsewhere in the module; shown here in abbreviated form.

class HashConfig:
    num_user_hashes: int = 2    # Multiple hash functions per feature
    num_item_hashes: int = 2
    num_author_hashes: int = 2

def embed_with_hashing(feature_id: int, num_hashes: int, emb_size: int) -> Array:
    """
    Apply multiple hash functions and average the embeddings.
    This reduces collision effects and improves representation quality.
    """
    embeddings = []
    for hash_idx in range(num_hashes):
        # Different hash function per index
        hash_value = hash_function(feature_id, hash_idx) % HASH_TABLE_SIZE
        embedding = embedding_table[hash_value]  # Lookup
        embeddings.append(embedding)

    # Average across hash functions
    return np.mean(embeddings, axis=0)
Why Hashing?

Constant Memory

Embedding table size is fixed regardless of vocabulary size (users, posts, authors)

No OOV Problem

New users/posts/authors work immediately without retraining

Collision Mitigation

Multiple hash functions reduce collision impact through averaging

Sparse Features

Efficiently handles high-cardinality features like user IDs
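A runnable sketch of the multi-hash trick follows. The table size, embedding size, and salting scheme here are illustrative values, not the production configuration:

```python
import numpy as np

HASH_TABLE_SIZE = 1024   # fixed size, independent of vocabulary (illustrative)
EMB_SIZE = 8
NUM_HASHES = 2

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(HASH_TABLE_SIZE, EMB_SIZE))

def hash_function(feature_id: int, hash_idx: int) -> int:
    # Salting the id with the hash index gives independent hash functions
    return hash((feature_id, hash_idx))

def embed_with_hashing(feature_id: int) -> np.ndarray:
    rows = [embedding_table[hash_function(feature_id, k) % HASH_TABLE_SIZE]
            for k in range(NUM_HASHES)]
    return np.mean(rows, axis=0)  # averaging dampens collision noise

# A brand-new id embeds immediately -- no vocabulary entry or retraining needed
vec = embed_with_hashing(987654321)
assert vec.shape == (EMB_SIZE,)
```

Memory stays constant at `HASH_TABLE_SIZE × EMB_SIZE` floats no matter how many distinct ids are embedded.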

5. Grok-Based Transformer

The Phoenix ranking model uses the Grok transformer architecture adapted for recommendations:
# From recsys_model.py - model configuration

recsys_model = PhoenixModelConfig(
    emb_size=128,
    num_actions=14,  # Multi-action prediction
    history_seq_len=32,
    candidate_seq_len=8,
    hash_config=HashConfig(
        num_user_hashes=2,
        num_item_hashes=2,
        num_author_hashes=2,
    ),
    product_surface_vocab_size=16,
    model=TransformerConfig(
        emb_size=128,
        widening_factor=2,
        key_size=64,
        num_q_heads=2,
        num_kv_heads=2,
        num_layers=2,
        attn_output_multiplier=0.125,
    ),
)
Key Adaptations from Grok:
Instead of text tokens, the model embeds:
  • User hash features
  • Post hash features
  • Author hash features
  • Action type (for history)
  • Product surface (mobile, web, etc.)
Rather than next-token prediction, outputs probabilities for 14 engagement actions:
# Output projection
logits = model(input_embeddings, attention_mask)
# Shape: [batch_size, num_candidates, num_actions]

probabilities = sigmoid(logits)
# Each candidate gets 14 probability scores
Custom attention mask ensures candidates can’t attend to each other (see section above)
Input sequence format:
[USER] [H1] [H2] ... [H_n] [C1] [C2] ... [C_m]
Where:
  • USER = User context embedding
  • H_i = History items (posts user engaged with)
  • C_i = Candidate items to rank
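To make the input/output contract concrete, here is a shape-level sketch. The sequence lengths and `num_actions=14` follow the config above; the transformer itself is replaced by a random projection stand-in, so only the shapes are meaningful:

```python
import numpy as np

batch_size, history_len, num_candidates = 2, 32, 8
emb_size, num_actions = 128, 14

rng = np.random.default_rng(0)

# Input sequence: [USER] [H1..H32] [C1..C8], one embedding per token
seq = np.concatenate([
    rng.normal(size=(batch_size, 1, emb_size)),               # USER context
    rng.normal(size=(batch_size, history_len, emb_size)),     # history items
    rng.normal(size=(batch_size, num_candidates, emb_size)),  # candidates
], axis=1)

# Stand-in for the transformer: project candidate positions to action logits
w_out = rng.normal(size=(emb_size, num_actions))
candidate_states = seq[:, 1 + history_len:, :]   # states at candidate positions
logits = candidate_states @ w_out

# Independent sigmoids (not softmax): actions are not mutually exclusive
probabilities = 1.0 / (1.0 + np.exp(-logits))
assert probabilities.shape == (batch_size, num_candidates, num_actions)
```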

Data Flow Diagram

REQUEST FLOW:

1. Client Request

   ├─► [gRPC] ScoredPostsQuery
   │           - viewer_id
   │           - max_results
   │           - filters


2. Home Mixer

   ├─► Query Hydration (parallel)
   │   ├─► Fetch engagement history from DB
   │   ├─► Fetch following list from Graph Service
   │   └─► Fetch user preferences

   ├─► Candidate Sourcing (parallel)
   │   ├─► Thunder: Get in-network posts
   │   │   └─► In-memory lookup by author IDs
   │   │
   │   └─► Phoenix Retrieval: Get out-of-network posts
   │       ├─► Encode user context → embedding
   │       ├─► ANN search over candidate embeddings
   │       └─► Return top-K similar posts

   ├─► Candidate Hydration (parallel)
   │   ├─► CoreData: Post text, media, timestamps
   │   ├─► Gizmoduck: Author profiles
   │   ├─► Video: Durations and thumbnails
   │   └─► Subscription: Access eligibility

   ├─► Pre-Scoring Filters (sequential)
   │   ├─► Remove duplicates
   │   ├─► Remove old posts
   │   ├─► Remove blocked authors
   │   └─► Remove seen posts

   ├─► Scoring (sequential)
   │   ├─► Phoenix Scorer
   │   │   ├─► Batch candidates
   │   │   ├─► Call Phoenix ranking model
   │   │   └─► Get 14 action probabilities per candidate
   │   │
   │   ├─► Weighted Scorer
   │   │   └─► Combine probabilities into final score
   │   │
   │   └─► Diversity Scorer
   │       └─► Penalize repeated authors

   ├─► Selection
   │   └─► Sort by score, take top K

   ├─► Post-Selection Filters (sequential)
   │   ├─► Visibility filtering (spam, deleted, etc.)
   │   └─► Conversation deduplication

   └─► Response
       └─► [gRPC] ScoredPostsResponse
                   - ranked post IDs
                   - scores
                   - metadata

Deployment Architecture

┌─────────────────────────────────────────────────────┐
│                  LOAD BALANCER                      │
└────────────────────┬────────────────────────────────┘

     ┌───────────────┼───────────────┐
     │               │               │
     ▼               ▼               ▼
┌─────────┐    ┌─────────┐    ┌─────────┐
│  Home   │    │  Home   │    │  Home   │
│  Mixer  │    │  Mixer  │    │  Mixer  │
│ Instance│    │ Instance│    │ Instance│
└────┬────┘    └────┬────┘    └────┬────┘
     │              │              │
     └──────────────┼──────────────┘

     ┌──────────────┼──────────────┐
     │              │              │
     ▼              ▼              ▼
┌─────────┐    ┌─────────┐   ┌──────────┐
│ Thunder │    │ Phoenix │   │   Data   │
│ Service │    │   ML    │   │  Stores  │
└─────────┘    └─────────┘   └──────────┘

Technology Choices

Why Rust for Backend?

Performance

Zero-cost abstractions and no garbage collection pauses

Memory Safety

Compile-time guarantees prevent crashes and data races

Async Runtime

Tokio provides efficient async I/O for high concurrency

Type Safety

Strong typing catches bugs at compile time

Why JAX for ML?

Performance

JIT compilation and XLA backend for optimized execution

Functional API

Pure functions enable easy testing and debugging

Autodiff

Automatic differentiation for training

Scaling

Easy parallelization across GPUs/TPUs

Why gRPC?

gRPC provides efficient binary serialization (Protocol Buffers), bidirectional streaming, and strong typing through service definitions.
// Service definition
service ScoredPostsService {
  rpc GetScoredPosts(ScoredPostsQuery) returns (ScoredPostsResponse);
}

message ScoredPostsQuery {
  uint64 viewer_id = 1;
  string client_app_id = 2;
  string country_code = 3;
  repeated uint64 seen_ids = 4;
  bool in_network_only = 5;
}

message ScoredPostsResponse {
  repeated ScoredPost scored_posts = 1;
}

Configuration Management

The system uses a multi-layer configuration approach:
1. Default Configuration

Hard-coded defaults in the codebase:
pub mod params {
    pub const MAX_GRPC_MESSAGE_SIZE: usize = 10 * 1024 * 1024; // 10MB
    pub const DEFAULT_RESULT_SIZE: usize = 50;
    pub const DEFAULT_HISTORY_LENGTH: usize = 32;
}

2. Command-Line Arguments

Override defaults at startup:
#[derive(Parser, Debug)]
struct Args {
    #[arg(long)]
    grpc_port: u16,
    #[arg(long)]
    metrics_port: u16,
}

3. Dynamic Configuration

Runtime configuration changes through a config service:
  • Scoring weights
  • Filter thresholds
  • Model endpoints
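The precedence of the three layers (defaults, then CLI flags, then dynamic config) can be sketched as follows; the key names mirror the defaults shown above but are otherwise illustrative:

```python
# Sketch of layered config resolution: later layers override earlier ones.

DEFAULTS = {
    "max_grpc_message_size": 10 * 1024 * 1024,  # 10MB
    "default_result_size": 50,
    "default_history_length": 32,
}

def resolve_config(cli_args: dict, dynamic: dict) -> dict:
    config = dict(DEFAULTS)   # layer 1: hard-coded defaults
    config.update(cli_args)   # layer 2: command-line overrides
    config.update(dynamic)    # layer 3: runtime config service
    return config

cfg = resolve_config({"default_result_size": 100}, {})
assert cfg["default_result_size"] == 100           # CLI override wins
assert cfg["max_grpc_message_size"] == 10485760    # untouched default preserved
```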

Next Steps

Phoenix Deep Dive

Learn about the retrieval and ranking models in detail

Deployment Guide

Set up the system in your infrastructure

Customization

Add custom sources, filters, and scorers

Performance Tuning

Optimize latency and throughput
