Phoenix is a recommendation system that predicts user engagement (likes, reposts, replies, etc.) for content. It uses transformer-based architectures implemented in JAX to deliver personalized content recommendations at scale.

Two-Stage Pipeline

Phoenix operates using a two-stage architecture that balances efficiency and accuracy:
1. Retrieval Stage

Efficiently narrows millions of candidates down to hundreds using approximate nearest neighbor (ANN) search with a two-tower model

2. Ranking Stage

Scores and orders the retrieved candidates using a more expressive transformer model to produce the final ranked feed
┌─────────────────────────────────────────────────────────────────────────────────┐
│                           RECOMMENDATION PIPELINE                               │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   ┌──────────┐     ┌─────────────────────┐     ┌─────────────────────┐          │
│   │          │     │                     │     │                     │          │
│   │   User   │────▶│   STAGE 1:          │────▶│   STAGE 2:          │────▶ Feed│
│   │ Request  │     │   RETRIEVAL         │     │   RANKING           │          │
│   │          │     │   (Two-Tower)       │     │   (Transformer)     │          │
│   └──────────┘     │                     │     │                     │          │
│                    │   Millions → 1000s  │     │   1000s → Ranked    │          │
│                    └─────────────────────┘     └─────────────────────┘          │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘
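Conceptually, the retrieval stage reduces to a nearest-neighbor search between the user tower's output and precomputed item embeddings. The sketch below (function and variable names are illustrative, not Phoenix's API) shows the idea with exact top-k scoring; production systems replace the exhaustive dot product with an ANN index:

```python
import jax
import jax.numpy as jnp

def retrieve_top_k(user_embedding, item_embeddings, k):
    """Score every item against the user embedding and keep the top k.

    Sketch only: in production the exhaustive matrix-vector product below
    is replaced by an approximate nearest neighbor (ANN) index built over
    the item tower's embeddings.
    """
    scores = item_embeddings @ user_embedding            # [num_items]
    top_scores, top_indices = jax.lax.top_k(scores, k)   # exact top-k
    return top_scores, top_indices

# Toy example: 8 items with 4-dim embeddings, retrieve the top 3.
user = jax.random.normal(jax.random.PRNGKey(0), (4,))
items = jax.random.normal(jax.random.PRNGKey(1), (8, 4))
scores, indices = retrieve_top_k(user, items, k=3)
```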

Key Features

Hash-Based Embeddings

Both models use multiple hash functions for efficient embedding lookup, enabling the system to handle sparse, high-cardinality features
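A minimal sketch of the multi-hash lookup idea: each hash function maps a raw ID into a smaller bucket space so that high-cardinality features fit in fixed-size tables, and summing embeddings from several independent tables makes a collision in one table recoverable from the others. The hash constants here are invented for illustration, not Phoenix's actual choices:

```python
import jax
import jax.numpy as jnp

def hashed_embedding(feature_id, tables, num_buckets):
    """Look up one sparse feature via several hash functions and sum the results.

    tables has shape [num_hashes, num_buckets, dim]. The multiplicative hash
    below is a toy stand-in for whatever hash family the real system uses.
    """
    out = jnp.zeros(tables.shape[-1])
    for i, table in enumerate(tables):
        bucket = (feature_id * (1000003 + i * 7919)) % num_buckets  # toy hash
        out = out + table[bucket]
    return out

num_hashes, num_buckets, dim = 3, 1024, 8
tables = jax.random.normal(jax.random.PRNGKey(0), (num_hashes, num_buckets, dim))
emb = hashed_embedding(987654321, tables, num_buckets)
```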

Shared Architecture

The retrieval user tower uses the same transformer architecture as the ranking model, ensuring consistency across stages

Multi-Action Prediction

The ranking model predicts multiple engagement types simultaneously (like, repost, reply, click, etc.)

Candidate Isolation

Custom attention masking ensures candidates are scored independently, preventing batch composition effects
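One way to express candidate isolation is a boolean attention mask in which every token may attend to the shared user/history prefix, but no candidate token may attend to another candidate. The sketch below illustrates that structure (the exact mask Phoenix builds may differ):

```python
import jax.numpy as jnp

def candidate_isolation_mask(history_len, num_candidates):
    """Attention mask (True = attend allowed) where candidates are isolated.

    All tokens can attend to the first `history_len` positions (the shared
    user/history context); each candidate additionally attends to itself,
    but never to other candidates, so its score cannot depend on which
    other candidates happen to share the batch.
    """
    total = history_len + num_candidates
    mask = jnp.zeros((total, total), dtype=bool)
    mask = mask.at[:, :history_len].set(True)   # everyone sees the prefix
    idx = jnp.arange(history_len, total)
    mask = mask.at[idx, idx].set(True)          # each candidate sees itself
    return mask

m = candidate_isolation_mask(history_len=3, num_candidates=2)
```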

Transformer Foundation

The transformer implementation in Phoenix is based on the Grok-1 open source release by xAI, adapted for recommendation system use cases with custom input embeddings and attention masking.
The core architecture features:
  • RMS Normalization for stable training
  • Rotary Position Embeddings (RoPE) for position-aware attention
  • Grouped Query Attention for efficient inference
  • SwiGLU Activations in feed-forward layers
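As a concrete example of one of these components, RMS Normalization rescales each feature vector by its root-mean-square, with a learned per-feature scale; unlike LayerNorm there is no mean subtraction, which is cheaper and typically just as stable. A minimal sketch (not the Grok-1 source):

```python
import jax.numpy as jnp

def rms_norm(x, scale, eps=1e-6):
    """RMS Normalization over the last axis.

    Divides by the root-mean-square of the features (plus eps for numerical
    safety), then applies a learned per-feature scale. No mean subtraction.
    """
    rms = jnp.sqrt(jnp.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * scale

x = jnp.array([[1.0, -2.0, 3.0, -4.0]])
y = rms_norm(x, scale=jnp.ones(4))
```

After normalization the output has unit RMS, regardless of the input's magnitude.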

Input Features

Phoenix processes rich contextual information:
  • User features: User identifiers encoded via hash functions
  • Engagement history: Posts the user has interacted with, including:
    • Post content embeddings
    • Author embeddings
    • Action types (like, repost, reply)
    • Product surface (Home, Search, etc.)
  • Candidate posts: Items to rank, with post and author embeddings
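These three feature groups can be pictured as one token sequence per request. The shapes and names below are purely illustrative assumptions, meant only to show how the groups line up along the sequence axis:

```python
import jax.numpy as jnp

# Hypothetical dimensions -- not Phoenix's actual configuration.
d_model = 16
user_tokens      = jnp.zeros((1, d_model))    # hashed user-ID embedding
history_tokens   = jnp.zeros((30, d_model))   # engaged posts: content, author,
                                              # action type, surface (fused)
candidate_tokens = jnp.zeros((8, d_model))    # posts to score

# One sequence per request: user context, then history, then candidates.
sequence = jnp.concatenate(
    [user_tokens, history_tokens, candidate_tokens], axis=0
)
```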

Output Predictions

The ranking model produces a multi-action prediction tensor:
Output shape: [B, num_candidates, num_actions]
              │   │              └─ Like, Repost, Reply, Click, etc.
              │   └─ Each candidate post in the batch
              └─ Batch dimension
These predictions are used to compute final scores that determine content ordering in the feed.
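A common way to collapse per-action predictions into a single ranking score is to convert logits to probabilities and take a weighted sum over the action axis. The sketch below assumes that pattern; the action weights are invented, and Phoenix's actual scoring function is not specified in this overview:

```python
import jax
import jax.numpy as jnp

def combine_scores(logits, action_weights):
    """Turn per-action logits into one ranking score per candidate.

    logits: [B, num_candidates, num_actions]; sigmoid yields engagement
    probabilities, and a weighted sum collapses the action axis to give a
    [B, num_candidates] score used for ordering the feed.
    """
    probs = jax.nn.sigmoid(logits)
    return probs @ action_weights

B, C, A = 2, 5, 4
logits = jax.random.normal(jax.random.PRNGKey(0), (B, C, A))
weights = jnp.array([1.0, 0.8, 0.6, 0.1])  # e.g. like, repost, reply, click
final_scores = combine_scores(logits, weights)
```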

Next Steps

Retrieval Model

Learn about the two-tower retrieval architecture

Ranking Model

Explore the transformer ranking model

Architecture Details

Deep dive into attention masking and implementation
