Overview
The For You feed algorithm processes every request through a sequence of well-defined stages. Each stage transforms the data: enriching candidates with additional information, filtering out ineligible content, and ultimately producing a ranked list of posts.
Stage Flow
The pipeline executes the following stages in order:
1. Query Hydration
Purpose: Load user context required for personalized recommendations.
What Gets Hydrated
- User Action Sequence: Recent engagement history (likes, reposts, replies, clicks, etc.)
- User Features: Following list, blocked/muted accounts, muted keywords, preferences
- Bloom Filters: Efficient data structure for previously seen posts
- Seen IDs: Posts the user has explicitly seen in recent sessions
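To illustrate why a Bloom filter suits the "previously seen" check, here is a minimal sketch. The class name, bit-array size, hash count, and hashing scheme are all assumptions made for illustration; the source does not describe the real filter's parameters.

```python
import hashlib


class SeenPostBloomFilter:
    """Hypothetical Bloom filter over post IDs (sizes are illustrative)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, post_id: int):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{post_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, post_id: int) -> None:
        for pos in self._positions(post_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, post_id: int) -> bool:
        # False means "definitely not added"; True means "probably added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(post_id)
        )
```

A filter like this can report a false positive, so an unseen post may occasionally be skipped, but it never reports an added post as absent. That trade-off is why it can stand in for a much larger exact set of IDs.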
2. Candidate Sourcing
Candidates are retrieved from two parallel sources:
Thunder (In-Network)
Retrieves recent posts from accounts the user follows:
- Original posts
- Replies
- Reposts
- Video posts
Phoenix Retrieval (Out-of-Network)
Uses a two-tower ML model to find relevant posts from the global corpus:
- User Tower: Encodes user engagement history into an embedding
- Candidate Tower: Encodes all posts into embeddings
- Similarity Search: Retrieves top-K posts via dot product similarity
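The similarity-search step can be sketched as a brute-force dot-product scan. The function names, the two-dimensional embeddings, and the in-memory dictionary of candidates are hypothetical; the source does not say how the real index is stored or queried.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(user_embedding, candidate_embeddings, k):
    """Return the k post IDs whose embeddings score highest against the user."""
    scored = (
        (dot(user_embedding, emb), post_id)
        for post_id, emb in candidate_embeddings.items()
    )
    # nlargest orders by score first, best first.
    return [post_id for _, post_id in heapq.nlargest(k, scored)]


# Toy data: in practice both towers would produce these vectors.
user = [0.5, 1.0]
posts = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top_posts = retrieve_top_k(user, posts, k=2)  # ["c", "b"]
```

An exhaustive scan like this is only practical for a small candidate set. Retrieval over a global corpus would normally rely on an approximate nearest-neighbor index, which is outside what this page describes.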
3. Candidate Hydration
Purpose: Enrich candidates with additional metadata needed for filtering and scoring.
Hydrated Data
- Core Post Data: Text, media URLs, timestamps
- Author Information: Username, display name, verification status
- Video Duration: For video posts (used in scoring eligibility)
- Subscription Status: Whether the post requires paid subscription
- Visibility Information: Safety labels, spam detection results
4. Pre-Scoring Filters
Purpose: Remove candidates that should never be scored or shown. Filters run sequentially. See Filtering for complete details on each filter. Key filters include:
- Duplicate removal
- Age filtering
- Self-post removal
- Blocked/muted author filtering
- Previously seen post filtering
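A sequential filter chain of this kind might look like the sketch below. The `Candidate` fields, the context keys, and the filter function names are illustrative assumptions, not the actual filter interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    post_id: int
    author_id: int
    age_hours: float


def drop_duplicates(cands, ctx):
    seen, out = set(), []
    for c in cands:
        if c.post_id not in seen:
            seen.add(c.post_id)
            out.append(c)
    return out


def drop_old(cands, ctx):
    return [c for c in cands if c.age_hours <= ctx["max_age_hours"]]


def drop_self_posts(cands, ctx):
    return [c for c in cands if c.author_id != ctx["viewer_id"]]


def drop_blocked_or_muted(cands, ctx):
    return [c for c in cands if c.author_id not in ctx["blocked_or_muted"]]


def drop_seen(cands, ctx):
    return [c for c in cands if c.post_id not in ctx["seen_ids"]]


PRE_SCORING_FILTERS = [
    drop_duplicates,
    drop_old,
    drop_self_posts,
    drop_blocked_or_muted,
    drop_seen,
]


def run_filters(cands, ctx, filters=PRE_SCORING_FILTERS):
    # Each filter receives only the survivors of the previous one.
    for apply_filter in filters:
        cands = apply_filter(cands, ctx)
    return cands
```

Because every filter has the same signature, a filter can be added, removed, or reordered by editing the list, and each one can be tested on its own.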
5. Scoring
Purpose: Predict user engagement and compute relevance scores. Scorers apply sequentially:
Phoenix Scorer
The Grok-based transformer model predicts probabilities for multiple engagement types:
- P(favorite), P(reply), P(repost)
- P(click), P(profile_click), P(video_view)
- P(share), P(dwell), P(follow_author)
- P(not_interested), P(block_author), P(mute_author), P(report)
Weighted Scorer
Combines the predictions into a single relevance score using a weighted formula in which positive actions have positive weights and negative actions have negative weights.
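One natural reading of the weighted formula is a plain weighted sum, score = sum over actions of weight(action) × P(action). The sketch below assumes that form. The weight values are invented for illustration, since the real weights do not appear on this page.

```python
# Hypothetical weights: the real values are not given in the source.
WEIGHTS = {
    "favorite": 1.0,
    "reply": 2.0,
    "repost": 1.5,
    "not_interested": -3.0,
    "report": -10.0,
}


def weighted_score(predictions, weights=WEIGHTS):
    """Weighted sum of predicted probabilities; unweighted actions count as 0."""
    return sum(weights.get(action, 0.0) * p for action, p in predictions.items())


# A post that is likely to be liked but carries a small report risk.
example = weighted_score({"favorite": 0.5, "reply": 0.25, "report": 0.01})
```

With these assumed weights, even a 1% report probability subtracts 0.1 from the score, which shows how a large negative weight lets a rare negative signal offset common positive ones.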
Author Diversity Scorer
Attenuates scores for repeated authors to ensure feed diversity. The first post from an author gets its full score; subsequent posts are attenuated.
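The attenuation rule itself is not shown here, so the sketch below assumes a simple geometric decay: each additional post from the same author is multiplied by a further factor of `decay`. The function name, tuple layout, and the value 0.5 are hypothetical.

```python
from collections import defaultdict


def apply_author_diversity(scored, decay=0.5):
    """Rescore (post_id, author_id, score) tuples, assuming non-negative scores.

    The geometric decay is an assumed attenuation rule for illustration.
    """
    posts_seen = defaultdict(int)
    out = []
    # Walk best-first so each author's strongest post keeps its full score.
    for post_id, author_id, score in sorted(
        scored, key=lambda c: c[2], reverse=True
    ):
        multiplier = decay ** posts_seen[author_id]
        out.append((post_id, author_id, score * multiplier))
        posts_seen[author_id] += 1
    return out


rescored = apply_author_diversity(
    [("p1", "a", 1.0), ("p2", "a", 0.8), ("p3", "b", 0.6), ("p4", "a", 0.4)]
)
```

In this example author "a" keeps 1.0 for the first post, while the second and third drop to 0.4 and 0.1, so author "b" at 0.6 now outranks both.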
6. Selection
Purpose: Sort candidates by final score and select the top K. The selector:
- Sorts all scored candidates in descending order by score
- Selects the top K candidates (typically 100-500 depending on request parameters)
- Passes selected candidates to post-selection filters
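The sort-then-truncate step can be sketched with the standard library. The tuple layout and function name are assumptions for illustration.

```python
import heapq


def select_top_k(scored, k):
    """Return the k highest-scoring (post_id, score) pairs, best first."""
    # Equivalent to sorted(scored, key=..., reverse=True)[:k].
    return heapq.nlargest(k, scored, key=lambda c: c[1])


selected = select_top_k([("a", 0.2), ("b", 0.9), ("c", 0.5)], k=2)
```

Here `selected` holds `("b", 0.9)` followed by `("c", 0.5)`. Using a heap avoids fully sorting the candidate list when K is much smaller than the number of scored candidates.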
7. Post-Selection Filtering
Purpose: Final validation before serving to the user. Post-selection filters include:
Visibility Filter (VF)
Removes posts that are deleted, flagged as spam, contain violence or gore, or otherwise violate content policies.
Conversation Deduplication
Deduplicates multiple branches of the same conversation thread to avoid repetitive content.
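A minimal sketch of conversation deduplication, assuming each candidate carries a conversation identifier and the input is already ranked best-first. Both the `conversation_id` grouping key and the keep-the-first policy are assumptions.

```python
def dedupe_conversations(ranked):
    """Keep only the highest-ranked post per conversation.

    ranked: list of (post_id, conversation_id, score), sorted best first.
    Grouping by a conversation_id field is an assumption for illustration.
    """
    seen_conversations = set()
    kept = []
    for post_id, conversation_id, score in ranked:
        if conversation_id in seen_conversations:
            continue  # A better-ranked branch of this thread was already kept.
        seen_conversations.add(conversation_id)
        kept.append((post_id, conversation_id, score))
    return kept


deduped = dedupe_conversations(
    [("p1", "c1", 0.9), ("p2", "c1", 0.8), ("p3", "c2", 0.7), ("p4", "c1", 0.1)]
)
```

In this example only `p1` and `p3` survive, one post per thread, in their original rank order.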
Side Effects
After the main pipeline completes, side effects run asynchronously:
- Cache request information for future use
- Log served candidates for downstream analytics
- Update Bloom filters with newly served post IDs
Performance Characteristics
The entire pipeline typically completes in 50-150ms from request to response.
- Query Hydration: 5-10ms
- Candidate Sourcing: 10-30ms (parallel execution)
- Candidate Hydration: 15-40ms (parallel execution)
- Filters: 5-15ms (sequential)
- Scoring: 20-50ms (Phoenix model inference)
- Selection & Post-Filters: 5-10ms
Implementation
The pipeline is implemented using the candidate-pipeline framework, which provides traits for each stage and offers:
- Separation of concerns: Each stage has a single responsibility
- Parallel execution: Independent operations run concurrently
- Graceful error handling: Failures in non-critical stages don’t crash the pipeline
- Easy testing: Each stage can be tested in isolation
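Two of these properties, parallel execution and graceful error handling, can be illustrated with a short sketch. It is written in Python for brevity and is not the candidate-pipeline framework's API. The function names and the choice to treat a failed source as an empty result are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sources_in_parallel(sources, query):
    """Run every candidate source concurrently and merge the results."""

    def safe_fetch(source):
        try:
            return list(source(query))
        except Exception:
            # Non-critical failure: degrade to an empty result, do not abort.
            return []

    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # pool.map preserves the order of the sources.
        batches = list(pool.map(safe_fetch, sources))
    return [cand for batch in batches for cand in batch]


def in_network(query):
    return [1, 2]


def broken_source(query):
    raise RuntimeError("backend unavailable")


def out_of_network(query):
    return [3]


merged = run_sources_in_parallel([in_network, broken_source, out_of_network], {})
```

The failing source contributes nothing, yet the request still returns candidates 1, 2, and 3 from the healthy sources. Each source is also a plain callable, so it can be tested in isolation.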
Related Pages
- Scoring and Ranking: Deep dive into the Phoenix model and weighted scoring formula
- Filtering: Complete reference of all pre-scoring and post-selection filters
- Phoenix Architecture: Transformer architecture with candidate isolation
- Thunder: In-memory post store for in-network candidates