Available Tasks
Task A: Hobby Frequency
Count the frequency of each favorite hobby on CircleNet
Task B: Popular Pages
Find the top 10 most accessed CircleNet pages
Task C: Hobby Filter
Filter users by a specific favorite hobby
Task D: Popularity Factor
Calculate follower count for each CircleNet page owner
Task E: Favorites Analysis
Analyze total actions and distinct pages accessed per user
Task F: Above Average
Identify users with more followers than average
Task G: Outdated Pages
Find users with no activity in the last 90 days
Task H: One-Way Follows
Detect same-region one-way follow relationships
Optimization Techniques
All tasks implement both simple and optimized approaches:
- Combiners: Reduce shuffle I/O by pre-aggregating data at the mapper
- Map-Side Joins: Load small datasets into memory for efficient joins
- Map-Only Jobs: Skip reduce phase when possible to save I/O costs
- Job Chaining: Minimize the number of sequential MapReduce jobs
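
To make the first three techniques concrete, here is a minimal sketch in plain Python that simulates them in memory. It is not the project's implementation and does not use the Hadoop API. The record layouts `(user_id, hobby)` and `(follower_id, followed_id)`, and every function name, are assumptions made for illustration, since the actual CircleNet schema is not shown here.

```python
from collections import Counter, defaultdict


def map_hobbies(split):
    """Mapper: emit (hobby, 1) for each record in one input split.

    Records are assumed to be (user_id, hobby) tuples (hypothetical layout).
    """
    for _user_id, hobby in split:
        yield hobby, 1


def combine(pairs):
    """Combiner: sum counts locally so a mapper emits one pair per hobby."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def shuffle_and_reduce(mapper_outputs):
    """Group the pairs from all mappers by key and sum them (reduce step)."""
    totals = defaultdict(int)
    for pairs in mapper_outputs:
        for key, value in pairs:
            totals[key] += value
    return dict(totals)


def hobby_frequency(splits, use_combiner):
    """Run the simulated job; return (totals, number of pairs shuffled)."""
    outputs = []
    for split in splits:
        pairs = list(map_hobbies(split))
        if use_combiner:
            pairs = combine(pairs)
        outputs.append(pairs)
    shuffled = sum(len(pairs) for pairs in outputs)
    return shuffle_and_reduce(outputs), shuffled


def map_side_join(follows, users_by_id):
    """Map-only join: look up each followed user in an in-memory table.

    `follows` holds (follower_id, followed_id) pairs and `users_by_id` is the
    small dataset loaded into memory (both hypothetical). No reduce phase is
    needed because each input record is joined independently.
    """
    for follower_id, followed_id in follows:
        name = users_by_id.get(followed_id)
        if name is not None:
            yield follower_id, followed_id, name


if __name__ == "__main__":
    splits = [
        [(1, "chess"), (2, "chess"), (3, "golf")],
        [(4, "chess"), (5, "golf"), (6, "golf")],
    ]
    print(hobby_frequency(splits, use_combiner=False))
    print(hobby_frequency(splits, use_combiner=True))
    print(list(map_side_join([(1, 2), (3, 9)], {2: "Ann"})))
```

Both runs of `hobby_frequency` produce the same totals; the combiner only changes how many key-value pairs cross the shuffle. The join sketch works only while the lookup table fits in each mapper's memory, which is why the technique is reserved for small datasets.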