Instead of predicting a single “relevance” score, Phoenix predicts probabilities for multiple engagement actions simultaneously. This multi-task learning approach enables the model to capture nuanced user preferences and allows flexible weighting of different engagement types during ranking.
Phoenix predicts probabilities for 14 distinct actions, spanning positive, passive, and negative engagement:
```
Predictions:
├── Positive Actions (high value)
│   ├── P(favorite)        # User likes the post
│   ├── P(reply)           # User replies to the post
│   ├── P(repost)          # User reposts to their followers
│   ├── P(quote)           # User quote-tweets
│   ├── P(share)           # User shares externally
│   └── P(follow_author)   # User follows the author
│
├── Passive Actions (medium value)
│   ├── P(click)           # User clicks to view details
│   ├── P(profile_click)   # User clicks on author profile
│   ├── P(video_view)      # User watches video content
│   ├── P(photo_expand)    # User expands photo
│   └── P(dwell)           # User dwells on post (time spent)
│
└── Negative Actions (penalize)
    ├── P(not_interested)  # User marks "not interested"
    ├── P(block_author)    # User blocks the author
    ├── P(mute_author)     # User mutes the author
    └── P(report)          # User reports the post
```
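Because each action gets its own probability, the "flexible weighting" mentioned above reduces to a weighted sum at ranking time. Here is a minimal sketch of that idea; the action names mirror the list above, but `ACTION_WEIGHTS`, `rank_score`, and the specific weight values are hypothetical illustrations, not the production configuration.

```python
# Hypothetical per-action weights: positive actions add to the score,
# negative actions subtract. Real systems tune these values empirically.
ACTION_WEIGHTS = {
    "favorite": 1.0,
    "reply": 2.0,
    "repost": 1.5,
    "not_interested": -5.0,
    "block_author": -10.0,
}

def rank_score(probs: dict[str, float]) -> float:
    """Collapse per-action probabilities into one ranking score."""
    return sum(ACTION_WEIGHTS.get(action, 0.0) * p for action, p in probs.items())

# A candidate with decent positive engagement and a small negative signal.
candidate = {"favorite": 0.3, "reply": 0.05, "repost": 0.1, "not_interested": 0.02}
score = rank_score(candidate)  # 0.3*1.0 + 0.05*2.0 + 0.1*1.5 - 0.02*5.0 ≈ 0.45
```

Keeping the weighting outside the model means the ranking policy can change (e.g. penalizing reports more heavily) without retraining.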
The exact number of actions is configurable via PhoenixModelConfig.num_actions. The model architecture adapts automatically.
```python
# Pseudo-code
logits = model(user, history, candidates)  # [B, C, num_actions]
probabilities = sigmoid(logits)            # [B, C, num_actions]

# Per-action binary cross-entropy
loss = 0
for action_idx in range(num_actions):
    loss += binary_cross_entropy(
        predictions=probabilities[:, :, action_idx],
        labels=labels[:, :, action_idx],
    )

# Average across actions and candidates
loss = loss / (num_actions * num_candidates)
```
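The pseudo-code above can be made concrete with plain NumPy. This is a runnable sketch of the same computation on random data, not the production training loop; the shapes `[B, C, num_actions]` follow the comments above, and the epsilon clipping is a standard numerical-stability detail I am assuming rather than quoting from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
B, C, num_actions = 2, 3, 14  # batch, candidates per user, action heads

logits = rng.normal(size=(B, C, num_actions))
labels = rng.integers(0, 2, size=(B, C, num_actions)).astype(np.float64)

# Sigmoid turns each logit into an independent per-action probability.
probs = 1.0 / (1.0 + np.exp(-logits))

# Clip to avoid log(0); then element-wise binary cross-entropy.
eps = 1e-7
p = np.clip(probs, eps, 1.0 - eps)
bce = -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

# Mean over batch, candidates, and actions — equivalent to the loop above.
loss = bce.mean()
```

In practice a framework helper (e.g. a sigmoid-BCE-with-logits function) would replace the manual sigmoid-plus-clip for better numerical behavior.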
Implicit Feedback: Some actions, such as clicks, are implicit. If a user liked a post, they must have clicked into it, even though the click itself may never be explicitly logged as a label.
The model learns subtle differences in user intent:
```python
# User A: High P(like), Low P(reply)
# → Passive consumer, likes content but doesn't engage deeply

# User B: High P(reply), Medium P(like)
# → Active engager, participates in conversations

# User C: High P(block_author), Low everything else
# → Content is likely spam or offensive
```
In the input, historical actions are encoded using a learned projection:
phoenix/recsys_model.py
```python
def _get_action_embeddings(
    self,
    actions: jax.Array,  # [B, S, num_actions] multi-hot
) -> jax.Array:
    """Convert multi-hot action vectors to embeddings.

    Uses a learned projection matrix to map the signed action vector
    to the embedding dimension. This works for any number of actions.
    """
    config = self.config
    _, _, num_actions = actions.shape
    D = config.emb_size

    embed_init = hk.initializers.VarianceScaling(1.0, mode="fan_out")
    action_projection = hk.get_parameter(
        "action_projection",
        [num_actions, D],  # [14, 256]
        dtype=jnp.float32,
        init=embed_init,
    )

    # Convert binary {0,1} to signed {-1,+1}
    actions_signed = (2 * actions - 1).astype(jnp.float32)

    # Project: [B, S, num_actions] @ [num_actions, D] -> [B, S, D]
    action_emb = jnp.dot(
        actions_signed.astype(action_projection.dtype), action_projection
    )

    # Mask out invalid positions (where no action occurred)
    valid_mask = jnp.any(actions, axis=-1, keepdims=True)
    action_emb = action_emb * valid_mask

    return action_emb.astype(self.fprop_dtype)
```
Signed Encoding: Actions are converted from binary {0, 1} to signed {-1, +1} before projection. This allows the model to learn both positive and negative directions for each action type: an action that did *not* occur pushes the embedding in the opposite direction rather than contributing nothing.
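The signed encoding and the validity mask interact in a way worth seeing concretely. Below is a small NumPy sketch of the same math as `_get_action_embeddings` (a random matrix stands in for the learned `action_projection`; the two history rows are invented examples):

```python
import numpy as np

num_actions, D = 14, 4
rng = np.random.default_rng(0)
projection = rng.normal(size=(num_actions, D))  # stand-in for the learned matrix

# Two history positions: one where the user favorited and replied,
# and one where no action of any kind was recorded.
actions = np.zeros((2, num_actions))
actions[0, 0] = 1  # favorite
actions[0, 1] = 1  # reply

signed = 2 * actions - 1   # {0,1} -> {-1,+1}: absent actions become -1
emb = signed @ projection  # [2, D]

# Without masking, the all-absent row would still produce a nonzero
# embedding (a sum of all the -1 directions). The mask zeroes it out.
valid = np.any(actions, axis=-1, keepdims=True)
emb = emb * valid
```

This shows why the mask is necessary: in the signed scheme "nothing happened" is not the zero vector, so positions with no actions must be zeroed explicitly.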