Phoenix is a recommendation system that predicts user engagement (likes, reposts, replies, etc.) for content. It uses transformer-based architectures implemented in JAX to deliver personalized content recommendations at scale.

Two-Stage Pipeline

Phoenix operates using a two-stage architecture that balances efficiency and accuracy:
1. Retrieval Stage

Efficiently narrows millions of candidates down to hundreds using approximate nearest neighbor (ANN) search with a two-tower model

2. Ranking Stage

Scores and orders the retrieved candidates using a more expressive transformer model to produce the final ranked feed
┌─────────────────────────────────────────────────────────────────────────────────┐
│                           RECOMMENDATION PIPELINE                               │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   ┌──────────┐     ┌─────────────────────┐     ┌─────────────────────┐          │
│   │          │     │                     │     │                     │          │
│   │   User   │────▶│   STAGE 1:          │────▶│   STAGE 2:          │────▶ Feed│
│   │ Request  │     │   RETRIEVAL         │     │   RANKING           │          │
│   │          │     │   (Two-Tower)       │     │   (Transformer)     │          │
│   └──────────┘     │                     │     │                     │          │
│                    │   Millions → 1000s  │     │   1000s → Ranked    │          │
│                    └─────────────────────┘     └─────────────────────┘          │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘
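Conceptually, the retrieval stage reduces to a nearest-neighbor search between the user tower's output and precomputed item embeddings. The sketch below (function and variable names are illustrative, not Phoenix's API) shows the idea with exact top-k scoring; production systems replace the exhaustive dot product with an ANN index:

```python
import jax
import jax.numpy as jnp

def retrieve_top_k(user_embedding, item_embeddings, k):
    """Score every item against the user embedding and keep the top k.

    Sketch only: in production the exhaustive matrix-vector product below
    is replaced by an approximate nearest neighbor (ANN) index built over
    the item tower's embeddings.
    """
    scores = item_embeddings @ user_embedding            # [num_items]
    top_scores, top_indices = jax.lax.top_k(scores, k)   # exact top-k
    return top_scores, top_indices

# Toy example: 8 items with 4-dim embeddings, retrieve the top 3.
user = jax.random.normal(jax.random.PRNGKey(0), (4,))
items = jax.random.normal(jax.random.PRNGKey(1), (8, 4))
scores, indices = retrieve_top_k(user, items, k=3)
```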

Key Features

Hash-Based Embeddings

Both models use multiple hash functions for efficient embedding lookup, enabling the system to handle sparse, high-cardinality features
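A minimal sketch of the multi-hash lookup idea: each hash function maps a raw ID into a smaller bucket space so that high-cardinality features fit in fixed-size tables, and summing embeddings from several independent tables makes a collision in one table recoverable from the others. The hash constants here are invented for illustration, not Phoenix's actual choices:

```python
import jax
import jax.numpy as jnp

def hashed_embedding(feature_id, tables, num_buckets):
    """Look up one sparse feature via several hash functions and sum the results.

    tables has shape [num_hashes, num_buckets, dim]. The multiplicative hash
    below is a toy stand-in for whatever hash family the real system uses.
    """
    out = jnp.zeros(tables.shape[-1])
    for i, table in enumerate(tables):
        bucket = (feature_id * (1000003 + i * 7919)) % num_buckets  # toy hash
        out = out + table[bucket]
    return out

num_hashes, num_buckets, dim = 3, 1024, 8
tables = jax.random.normal(jax.random.PRNGKey(0), (num_hashes, num_buckets, dim))
emb = hashed_embedding(987654321, tables, num_buckets)
```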

Shared Architecture

The retrieval user tower uses the same transformer architecture as the ranking model, ensuring consistency across stages

Multi-Action Prediction

The ranking model predicts multiple engagement types simultaneously (like, repost, reply, click, etc.)

Candidate Isolation

Custom attention masking ensures candidates are scored independently, preventing batch composition effects
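One way to express candidate isolation is a boolean attention mask in which every token may attend to the shared user/history prefix, but no candidate token may attend to another candidate. The sketch below illustrates that structure (the exact mask Phoenix builds may differ):

```python
import jax.numpy as jnp

def candidate_isolation_mask(history_len, num_candidates):
    """Attention mask (True = attend allowed) where candidates are isolated.

    All tokens can attend to the first `history_len` positions (the shared
    user/history context); each candidate additionally attends to itself,
    but never to other candidates, so its score cannot depend on which
    other candidates happen to share the batch.
    """
    total = history_len + num_candidates
    mask = jnp.zeros((total, total), dtype=bool)
    mask = mask.at[:, :history_len].set(True)   # everyone sees the prefix
    idx = jnp.arange(history_len, total)
    mask = mask.at[idx, idx].set(True)          # each candidate sees itself
    return mask

m = candidate_isolation_mask(history_len=3, num_candidates=2)
```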

Transformer Foundation

The transformer implementation in Phoenix is based on the Grok-1 open source release by xAI, adapted for recommendation system use cases with custom input embeddings and attention masking.
The core architecture features:
  • RMS Normalization for stable training
  • Rotary Position Embeddings (RoPE) for position-aware attention
  • Grouped Query Attention for efficient inference
  • SwiGLU Activations in feed-forward layers
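As a concrete example of one of these components, RMS Normalization rescales each feature vector by its root-mean-square, with a learned per-feature scale; unlike LayerNorm there is no mean subtraction, which is cheaper and typically just as stable. A minimal sketch (not the Grok-1 source):

```python
import jax.numpy as jnp

def rms_norm(x, scale, eps=1e-6):
    """RMS Normalization over the last axis.

    Divides by the root-mean-square of the features (plus eps for numerical
    safety), then applies a learned per-feature scale. No mean subtraction.
    """
    rms = jnp.sqrt(jnp.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * scale

x = jnp.array([[1.0, -2.0, 3.0, -4.0]])
y = rms_norm(x, scale=jnp.ones(4))
```

After normalization the output has unit RMS, regardless of the input's magnitude.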

Input Features

Phoenix processes rich contextual information:
  • User features: User identifiers encoded via hash functions
  • Engagement history: Posts the user has interacted with, including:
    • Post content embeddings
    • Author embeddings
    • Action types (like, repost, reply)
    • Product surface (Home, Search, etc.)
  • Candidate posts: Items to rank, with post and author embeddings
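These three feature groups can be pictured as one token sequence per request. The shapes and names below are purely illustrative assumptions, meant only to show how the groups line up along the sequence axis:

```python
import jax.numpy as jnp

# Hypothetical dimensions -- not Phoenix's actual configuration.
d_model = 16
user_tokens      = jnp.zeros((1, d_model))    # hashed user-ID embedding
history_tokens   = jnp.zeros((30, d_model))   # engaged posts: content, author,
                                              # action type, surface (fused)
candidate_tokens = jnp.zeros((8, d_model))    # posts to score

# One sequence per request: user context, then history, then candidates.
sequence = jnp.concatenate(
    [user_tokens, history_tokens, candidate_tokens], axis=0
)
```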

Output Predictions

The ranking model produces a multi-action prediction tensor:
Output shape: [B, num_candidates, num_actions]
              │   │              └─ Like, Repost, Reply, Click, etc.
              │   └─ Each candidate post in the batch
              └─ Batch dimension
These predictions are used to compute final scores that determine content ordering in the feed.
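A common way to collapse per-action predictions into a single ranking score is to convert logits to probabilities and take a weighted sum over the action axis. The sketch below assumes that pattern; the action weights are invented, and Phoenix's actual scoring function is not specified in this overview:

```python
import jax
import jax.numpy as jnp

def combine_scores(logits, action_weights):
    """Turn per-action logits into one ranking score per candidate.

    logits: [B, num_candidates, num_actions]; sigmoid yields engagement
    probabilities, and a weighted sum collapses the action axis to give a
    [B, num_candidates] score used for ordering the feed.
    """
    probs = jax.nn.sigmoid(logits)
    return probs @ action_weights

B, C, A = 2, 5, 4
logits = jax.random.normal(jax.random.PRNGKey(0), (B, C, A))
weights = jnp.array([1.0, 0.8, 0.6, 0.1])  # e.g. like, repost, reply, click
final_scores = combine_scores(logits, weights)
```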

Next Steps

Retrieval Model

Learn about the two-tower retrieval architecture

Ranking Model

Explore the transformer ranking model

Architecture Details

Deep dive into attention masking and implementation
