Two-Stage Pipeline
Phoenix operates using a two-stage architecture that balances efficiency and accuracy:
Retrieval Stage
Efficiently narrows millions of candidates down to a few hundred using approximate nearest neighbor (ANN) search with a two-tower model
Ranking Stage
Scores the retrieved candidates with a transformer model that predicts the probability of each engagement type
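In outline, the retrieval stage can be sketched as a two-tower dot-product search. The towers, dimensions, and data below are placeholders, and brute-force top-k stands in for the ANN index a production system would use:

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT, DIM = 128, 64

# Hypothetical stand-ins for the two towers: in the real system these are
# learned networks; here, fixed random projections into a shared space.
W_user = rng.normal(size=(FEAT, DIM))
W_item = rng.normal(size=(FEAT, DIM))

def embed_user(x):
    return x @ W_user

def embed_items(x):
    return x @ W_item

def retrieve_top_k(user_vec, item_vecs, k):
    # Brute-force maximum-inner-product search stands in for ANN here;
    # at millions of items an approximate index replaces this scan.
    scores = item_vecs @ user_vec
    top = np.argsort(-scores)[:k]
    return top, scores[top]

items = embed_items(rng.normal(size=(50_000, FEAT)))   # "millions" in production
user = embed_user(rng.normal(size=FEAT))
cand_ids, cand_scores = retrieve_top_k(user, items, k=500)  # hundreds survive
```

Because both towers map into the same space, item embeddings can be precomputed and indexed offline, and only the user tower runs at request time.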
Key Features
Hash-Based Embeddings
Both models use multiple hash functions for efficient embedding lookup, enabling the system to handle sparse, high-cardinality features
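A minimal sketch of a multiple-hash embedding lookup follows. The table size, hash count, and SHA-256-based bucketing are illustrative assumptions, not the production choices:

```python
import hashlib
import numpy as np

NUM_HASHES = 4       # number of independent hash functions (assumed value)
TABLE_SIZE = 2**16   # rows in the shared embedding table (assumed value)
DIM = 32

rng = np.random.default_rng(0)
table = rng.normal(scale=0.02, size=(TABLE_SIZE, DIM))

def hash_bucket(value: str, seed: int) -> int:
    # Seeded hash for illustration; any fast non-cryptographic hash works.
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "little") % TABLE_SIZE

def embed_id(value: str) -> np.ndarray:
    # Sum the rows selected by several independent hash functions. Two IDs
    # may collide under one hash but almost never under all of them, so
    # sparse, high-cardinality IDs get near-unique vectors without storing
    # a full vocabulary table.
    rows = [hash_bucket(value, seed) for seed in range(NUM_HASHES)]
    return table[rows].sum(axis=0)
```

For example, `embed_id("user:12345")` always returns the same 32-dimensional vector, with no dictionary of user IDs kept anywhere.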
Shared Architecture
The retrieval user tower uses the same transformer architecture as the ranking model, ensuring consistency across stages
Multi-Action Prediction
The ranking model predicts multiple engagement types simultaneously (like, repost, reply, click, etc.)
Candidate Isolation
Custom attention masking ensures candidates are scored independently, preventing batch composition effects
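The candidate-isolation idea can be illustrated with a small mask builder. The exact mask layout in Phoenix may differ; this sketch only shows the property the text describes, that no candidate can attend to another candidate:

```python
import numpy as np

def candidate_isolation_mask(history_len: int, num_candidates: int) -> np.ndarray:
    """Boolean attention mask (True = query position may attend to key position).

    History tokens attend among themselves; each candidate attends to the
    history and to itself, but never to other candidates, so a candidate's
    score cannot depend on which other candidates share the batch."""
    total = history_len + num_candidates
    mask = np.zeros((total, total), dtype=bool)
    mask[:history_len, :history_len] = True   # history <-> history
    mask[history_len:, :history_len] = True   # candidates -> history
    diag = np.arange(history_len, total)
    mask[diag, diag] = True                   # each candidate -> itself only
    return mask

mask = candidate_isolation_mask(history_len=4, num_candidates=3)
```

Keeping the history block closed to candidates (the upper-right quadrant stays False) also means the user-history representations are identical no matter which candidates are being scored.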
Transformer Foundation
The transformer implementation in Phoenix is based on the Grok-1 open source release by xAI, adapted for recommendation system use cases with custom input embeddings and attention masking.
- RMS Normalization for stable training
- Rotary Position Embeddings (RoPE) for position-aware attention
- Grouped Query Attention for efficient inference
- SwiGLU Activations in feed-forward layers
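Three of the components above can be sketched in a few lines each (Grouped Query Attention is omitted for brevity). These are textbook NumPy versions, not the actual Phoenix implementation:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMS Normalization: rescale by root-mean-square only; no mean
    # subtraction and no bias term.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def rope(x, base=10000.0):
    # Rotary Position Embeddings: rotate pairs of dimensions by a
    # position-dependent angle, making attention sensitive to relative
    # position without learned position vectors.
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)
    angles = np.outer(np.arange(seq_len), freqs)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def swiglu(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: a SiLU-gated linear unit between two projections.
    silu = lambda v: v / (1.0 + np.exp(-v))
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down
```

Note that `rope` is a pure rotation, so it preserves the norm of each token vector.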
Input Features
Phoenix processes rich contextual information:
- User features: User identifiers encoded via hash functions
- Engagement history: Posts the user has interacted with, including:
  - Post content embeddings
  - Author embeddings
  - Action types (like, repost, reply)
  - Product surface (Home, Search, etc.)
- Candidate posts: Items to rank, with post and author embeddings
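The features above can be pictured as one token sequence fed to the transformer. How the per-field embeddings are combined (summed here) is an assumption for illustration; the vectors themselves are random placeholders for the hash-based and content embeddings:

```python
import numpy as np

DIM = 32
rng = np.random.default_rng(0)

def fake_embedding(n):
    # Placeholder vectors; in the real system these come from hash-based
    # ID lookups and learned content encoders.
    return rng.normal(size=(n, DIM))

H, C = 20, 8                          # history length, candidate count
user_tok = fake_embedding(1)          # hashed user identifier
# One token per past engagement; summing the per-field embeddings into a
# single vector is an assumed design, concatenation plus projection also works.
history_toks = (fake_embedding(H)     # post content embeddings
                + fake_embedding(H)   # author embeddings
                + fake_embedding(H)   # action type (like, repost, reply)
                + fake_embedding(H))  # product surface (Home, Search, ...)
candidate_toks = fake_embedding(C) + fake_embedding(C)  # post + author
sequence = np.concatenate([user_tok, history_toks, candidate_toks])
```

With `H = 20` and `C = 8`, the model sees a sequence of 29 tokens: one user token, twenty history tokens, and eight candidate tokens kept isolated by the attention mask.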
Output Predictions
The ranking model produces a multi-action prediction tensor, with one score per candidate for each engagement type.
Next Steps
Retrieval Model
Learn about the two-tower retrieval architecture
Ranking Model
Explore the transformer ranking model
Architecture Details
Deep dive into attention masking and implementation