The ActivityLog dataset records all user interactions with CircleNetPages. Each record represents a single action performed by one user on another user’s page, including the type of action and when it occurred.

Overview

  • Total Records: 10,000,000 actions
  • File Format: CSV without headers
  • Primary Key: ActionId
  • Purpose: Track user engagement and page interaction patterns

Schema Definition

ActionId (integer, required)
Unique sequential identifier for each action.
  • Range: 1 to 10,000,000
  • Constraint: Must be unique
  • Purpose: Primary key for the activity record
ByWho (integer, required)
The user ID of the person performing the action.
  • Range: 1 to 200,000
  • Constraint: Must exist in CircleNetPage.ID
  • Role: The actor who initiated the activity
WhatPage (integer, required)
The user ID of the page that was accessed or interacted with.
  • Range: 1 to 200,000
  • Constraint: Must exist in CircleNetPage.ID
  • Note: Can be the same as ByWho (users can access their own pages)
ActionType (string, required)
Description of the action performed on the page.
  • Length: 20-50 characters
  • Format: No commas allowed
  • Examples: “viewed profile page”, “left a comment on recent post”, “poked user playfully”, “liked profile photo and banner”, “sent friend request”
ActionTime (integer, required)
Timestamp of when the action occurred.
  • Range: 1 to 1,000,000
  • Format: Integer hour index (1-hour granularity)
  • Purpose: Enables temporal analysis and inactivity detection
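
The field rules above can be checked mechanically. The following is a minimal sketch rather than a reference validator: the function name `validate_record` and the error-message wording are illustrative assumptions, and it covers only the integer ranges and the no-comma rule.

```python
MAX_ACTION_ID = 10_000_000
MAX_USER_ID = 200_000
MAX_ACTION_TIME = 1_000_000


def validate_record(action_id, by_who, what_page, action_type, action_time):
    """Return a list of rule violations for one record (empty if none).

    Hypothetical helper: checks the integer ranges and the no-comma rule only.
    """
    errors = []
    if not 1 <= action_id <= MAX_ACTION_ID:
        errors.append("ActionId out of range")
    if not 1 <= by_who <= MAX_USER_ID:
        errors.append("ByWho out of range")
    if not 1 <= what_page <= MAX_USER_ID:
        errors.append("WhatPage out of range")
    if "," in action_type:
        errors.append("ActionType contains a comma")
    if not 1 <= action_time <= MAX_ACTION_TIME:
        errors.append("ActionTime out of range")
    return errors
```

Returning a list of messages instead of raising on the first problem makes it easier to report every issue in a record at once.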

Example Records

1,1523,47892,viewed profile page,456789
2,1523,47892,left a comment on recent post,456790
3,47892,1523,viewed profile page,456850
4,47892,1523,poked user playfully,456851
5,12345,67890,viewed profile page,234567
6,12345,67890,liked profile photo and banner,234568
The file does not include column headers. The order of values corresponds to: ActionId, ByWho, WhatPage, ActionType, ActionTime.
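
Because the file has no header row, a reader has to assign column names by position. A minimal sketch of one way to do that in Python follows; the `Action` type and `read_actions` function are illustrative names, not part of the dataset specification.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Action(NamedTuple):
    """One ActivityLog row, with fields in file column order."""
    action_id: int
    by_who: int
    what_page: int
    action_type: str
    action_time: int


def read_actions(lines) -> Iterator[Action]:
    """Yield one Action per CSV row; there is no header row to skip."""
    for row in csv.reader(lines):
        if not row:  # tolerate blank lines
            continue
        action_id, by_who, what_page, action_type, action_time = row
        yield Action(int(action_id), int(by_who), int(what_page),
                     action_type, int(action_time))


# An in-memory sample stands in for the real file here.
sample = io.StringIO(
    "1,1523,47892,viewed profile page,456789\n"
    "2,1523,47892,left a comment on recent post,456790\n"
)
actions = list(read_actions(sample))
```

For the real file, the same function can be given an open file object, for example `open(path, newline="")`, where `path` is wherever the CSV is stored.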

Action Type Rules

Critical Constraint: Any action other than a view must be preceded by a “viewed” action by the same user (ByWho) on the same page (WhatPage).

Valid Action Sequence

# Correct: View first, then interact
100,1523,47892,viewed profile page,456789
101,1523,47892,left a comment on recent post,456790
102,1523,47892,liked profile photo,456791

# Incorrect: Interaction without prior view (user 8844 has never viewed page 47892)
103,8844,47892,left a comment on recent post,456792  # ERROR: No view action first
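
The rule can be checked in a single pass over the records. The sketch below rests on two assumptions that the specification does not state outright: a view action is any ActionType beginning with "viewed", and "preceded" refers to ActionId order. The function name `find_view_first_violations` is hypothetical.

```python
def find_view_first_violations(rows):
    """Return ActionIds of interactions with no earlier view by the same pair.

    rows: iterable of (action_id, by_who, what_page, action_type, action_time)
    tuples, assumed to be sorted by ActionId.
    """
    viewed_pairs = set()  # (by_who, what_page) pairs that already have a view
    violations = []
    for action_id, by_who, what_page, action_type, _action_time in rows:
        pair = (by_who, what_page)
        if action_type.startswith("viewed"):
            viewed_pairs.add(pair)
        elif pair not in viewed_pairs:
            violations.append(action_id)
    return violations


rows = [
    (100, 1523, 47892, "viewed profile page", 456789),
    (101, 1523, 47892, "left a comment on recent post", 456790),
    (103, 8844, 47892, "left a comment on recent post", 456792),
]
print(find_view_first_violations(rows))  # [103]
```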

Action Type Categories

View Actions (must come first):
  • “viewed profile page”
  • “viewed photos section”
  • “viewed recent posts feed”
Interaction Actions (require prior view):
  • “left a comment on recent post”
  • “poked user playfully”
  • “liked profile photo and banner”
  • “sent friend request”
  • “shared post to own timeline”
  • “reacted with emoji to status”
  • “sent private message”
  • “tagged in a photo comment”
Be creative with ActionType descriptions while maintaining realism. Think about actual social media interactions.

Data Characteristics

Self-Interaction

Unlike the Follows dataset, users CAN interact with their own pages:
# Valid: User 1523 views their own page
200,1523,1523,viewed profile page,456789

Temporal Granularity

  • ActionTime represents hours (granularity of 1 hour)
  • Range 1-1,000,000 represents approximately 114 years of hourly data
  • Used for detecting inactive users (e.g., no activity in 90 days = 2,160 hours)
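
The two conversions above are plain arithmetic, worked here for reference. The constant names are illustrative, and the year length ignores leap years.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = HOURS_PER_DAY * 365  # 8,760 hours, ignoring leap years

inactivity_window_hours = 90 * HOURS_PER_DAY
range_in_years = 1_000_000 / HOURS_PER_YEAR

print(inactivity_window_hours)   # 2160
print(round(range_in_years, 1))  # 114.2
```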

Referential Integrity

Both ByWho and WhatPage must reference valid CircleNetPage IDs:
ByWho, WhatPage ∈ CircleNetPage.ID
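
A membership test is enough to verify this. The sketch below assumes the set of valid CircleNetPage IDs has already been loaded; the function name `find_dangling_references` is hypothetical, and the example stands in for the real ID set with the full range 1 to 200,000.

```python
def find_dangling_references(rows, valid_ids):
    """Return (action_id, column_name, bad_id) for each ID not in valid_ids."""
    problems = []
    for action_id, by_who, what_page, *_rest in rows:
        if by_who not in valid_ids:
            problems.append((action_id, "ByWho", by_who))
        if what_page not in valid_ids:
            problems.append((action_id, "WhatPage", what_page))
    return problems


# Assumption: every ID from 1 to 200,000 exists in CircleNetPage.
valid_ids = range(1, 200_001)
rows = [
    (1, 1523, 47892, "viewed profile page", 456789),
    (2, 0, 200001, "viewed profile page", 456790),
]
print(find_dangling_references(rows, valid_ids))
# [(2, 'ByWho', 0), (2, 'WhatPage', 200001)]
```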

Analytics Use Cases

This dataset enables:

Popularity Metrics (Task B)

Count total accesses per page to find the 10 most popular CircleNetPages.
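
As a rough single-machine illustration of this count, the sketch below tallies rows by WhatPage with `collections.Counter`. It assumes every record counts as an access, whatever its ActionType; the function name `most_popular_pages` is hypothetical.

```python
from collections import Counter


def most_popular_pages(rows, n=10):
    """Return the n most accessed pages as (what_page, access_count) pairs."""
    access_counts = Counter(what_page for _id, _who, what_page, *_rest in rows)
    return access_counts.most_common(n)


rows = [
    (1, 10, 7, "viewed profile page", 100),
    (2, 11, 7, "viewed profile page", 101),
    (3, 12, 7, "viewed photos section", 102),
    (4, 10, 9, "viewed profile page", 103),
    (5, 11, 9, "viewed profile page", 104),
    (6, 12, 4, "viewed profile page", 105),
]
print(most_popular_pages(rows, n=2))  # [(7, 3), (9, 2)]
```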

User Engagement (Task E)

For each user, calculate:
  • Total number of actions performed
  • Number of distinct pages accessed
  • Whether the user has “favorites” (pages they visit frequently)
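
The three per-user figures can come out of one pass that counts visits per (user, page). This is a sketch under an explicit assumption: the specification does not define how frequent a visit must be to count as a favorite, so `favorite_threshold` is a hypothetical parameter, as is the function name `engagement_by_user`.

```python
from collections import Counter, defaultdict


def engagement_by_user(rows, favorite_threshold=3):
    """Map each ByWho to its action total, distinct page count, and favorites."""
    visits = defaultdict(Counter)  # by_who -> Counter of what_page visits
    for _id, by_who, what_page, *_rest in rows:
        visits[by_who][what_page] += 1

    report = {}
    for user, pages in visits.items():
        report[user] = {
            "total_actions": sum(pages.values()),
            "distinct_pages": len(pages),
            # Assumed definition: pages visited at least favorite_threshold times.
            "favorites": sorted(
                page for page, count in pages.items()
                if count >= favorite_threshold
            ),
        }
    return report


rows = [
    (1, 1, 2, "viewed profile page", 100),
    (2, 1, 2, "poked user playfully", 101),
    (3, 1, 2, "sent private message", 102),
    (4, 1, 3, "viewed profile page", 103),
    (5, 2, 2, "viewed profile page", 104),
]
report = engagement_by_user(rows)
print(report[1])
# {'total_actions': 4, 'distinct_pages': 2, 'favorites': [2]}
```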

Inactivity Detection (Task G)

Identify users with no ActivityLog entries in the last 90 days (2,160 hours).

Activity Patterns

  • Peak usage times
  • Interaction type distributions
  • User engagement levels
  • Page visit frequency

Scale Considerations

With 10 million actions across 200,000 users:
  • Average of 50 actions per user (as actor)
  • Average of 50 actions per page (as target)
  • Distribution will vary (active users generate more actions)
  • Popular pages receive more views and interactions
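
The averages follow directly from the totals, and the skew can be measured once data exists. The short sketch below works the arithmetic and adds a hypothetical helper, `actor_skew`, that compares the mean and maximum action counts among users who appear as ByWho; it expects at least one row.

```python
from collections import Counter

TOTAL_ACTIONS = 10_000_000
TOTAL_USERS = 200_000
average_actions = TOTAL_ACTIONS / TOTAL_USERS
print(average_actions)  # 50.0


def actor_skew(rows):
    """Return (mean, max) actions per user among users that appear as ByWho."""
    per_actor = Counter(by_who for _id, by_who, *_rest in rows)
    counts = per_actor.values()
    return sum(counts) / len(counts), max(counts)


rows = [
    (1, 1, 2, "viewed profile page", 100),
    (2, 1, 3, "viewed profile page", 101),
    (3, 1, 2, "poked user playfully", 102),
    (4, 2, 1, "viewed profile page", 103),
]
print(actor_skew(rows))  # (2.0, 3)
```

Note that this mean covers only users who performed at least one action, so it can exceed the overall average of 50 when some users never act.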

Generation Requirements

When generating ActivityLog data:
  1. Sequence ActionIds from 1 to 10,000,000
  2. Randomize ByWho and WhatPage within valid range (1-200,000)
  3. Ensure view-first rule: For each (ByWho, WhatPage) pair, first action must be a view
  4. Randomize ActionTime within range (1-1,000,000)
  5. Create realistic ActionType descriptions (20-50 chars, no commas)
  6. Vary action types to simulate realistic usage patterns
The view-first constraint is critical for data integrity. Your generator must track which (ByWho, WhatPage) pairs have had a view action before allowing interaction actions.
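
The requirements above can be combined into a small generator. The following is one possible sketch, not a prescribed implementation. The action strings are copied from the category lists in this document, while `generate_actions`, `interaction_prob`, and the revisit strategy are assumptions. Uniformly random pairs almost never repeat at this scale, so the sketch draws each interaction from a pair that already has a view. The view-first rule then holds in ActionId order; because ActionTime is randomized independently, it is not guaranteed in ActionTime order.

```python
import random

VIEW_ACTIONS = [
    "viewed profile page",
    "viewed photos section",
    "viewed recent posts feed",
]
INTERACTION_ACTIONS = [
    "left a comment on recent post",
    "poked user playfully",
    "liked profile photo and banner",
    "sent friend request",
    "shared post to own timeline",
    "reacted with emoji to status",
    "sent private message",
    "tagged in a photo comment",
]


def generate_actions(n_actions, n_users=200_000, max_time=1_000_000,
                     interaction_prob=0.4, seed=None):
    """Yield (ActionId, ByWho, WhatPage, ActionType, ActionTime) tuples."""
    rng = random.Random(seed)
    viewed_pairs = []  # pairs that already have a view; duplicates are harmless
    for action_id in range(1, n_actions + 1):
        if viewed_pairs and rng.random() < interaction_prob:
            # Interaction: reuse a pair that is known to have a prior view.
            by_who, what_page = rng.choice(viewed_pairs)
            action_type = rng.choice(INTERACTION_ACTIONS)
        else:
            # View: any pair is allowed, including a user's own page.
            by_who = rng.randint(1, n_users)
            what_page = rng.randint(1, n_users)
            action_type = rng.choice(VIEW_ACTIONS)
            viewed_pairs.append((by_who, what_page))
        action_time = rng.randint(1, max_time)
        yield (action_id, by_who, what_page, action_type, action_time)


sample_rows = list(generate_actions(2000, n_users=50, seed=7))
```

To produce the file, the tuples can be passed to `csv.writer(f).writerows(...)` on a file opened with `newline=""`. Keeping every viewed pair in a list costs memory at the full 10,000,000 rows, so a production generator might cap or sample that list.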

