Overview
Task C filters CircleNet pages to find all users with a specific hobby (e.g., “PodcastBinging”).
TaskCSimple
Package: circlenet.taskC
Class: TaskCSimple
Source: src/main/java/circlenet/taskC/TaskCSimple.java
Main Method
public static void main(String[] args) throws Exception
Command-Line Arguments
Input path to the CircleNet Pages CSV file (args[0])
Output path for the job results (args[1])
Target hobby to filter, e.g., “PodcastBinging” (args[2])
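The example usage and the args[2] lookup in the Configuration section imply three positional arguments: input path, output path, and hobby. The body of main() is not shown in this document, so the following is only a minimal sketch of how those arguments could be validated before the job is submitted. The class name TaskCArgsSketch and the usage message are hypothetical. Rejecting a blank hobby matters because MapperC.setup() falls back to an empty string, which would then match rows whose hobby field is empty.

```java
// Hypothetical helper; not part of circlenet.taskC.
// Assumes three positional arguments: <input path> <output path> <hobby>.
class TaskCArgsSketch {
    final String inputPath;
    final String outputPath;
    final String hobby;

    private TaskCArgsSketch(String inputPath, String outputPath, String hobby) {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
        this.hobby = hobby;
    }

    // Validates the argument count and trims the hobby; throws on bad input.
    static TaskCArgsSketch parse(String[] args) {
        if (args == null || args.length != 3) {
            throw new IllegalArgumentException(
                "Usage: TaskCSimple <input path> <output path> <hobby>");
        }
        String hobby = args[2].trim();
        if (hobby.isEmpty()) {
            throw new IllegalArgumentException("hobby must not be blank");
        }
        return new TaskCArgsSketch(args[0], args[1], hobby);
    }
}
```

With this check in place, a call with only two arguments fails immediately with a usage message instead of an ArrayIndexOutOfBoundsException at args[2].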
Mapper: MapperC
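Keeps only the rows whose hobby field (the fifth CSV column, f[4]) matches the target hobby, ignoring case and surrounding whitespace. For each match it emits the column at index 1 as the key and the column at index 2 as the value. Rows with fewer than five fields are skipped. setup() reads the target hobby once per task from the task.c.hobby configuration property.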
public static class MapperC extends Mapper<LongWritable, Text, Text, Text> {
    private String targetHobby;
    // Reused across map() calls to avoid allocating a new Text per record.
    private final Text outKey = new Text();
    private final Text outVal = new Text();

    @Override
    protected void setup(Context context) {
        // Read the target hobby once per task; defaults to "" if the property is unset.
        targetHobby = context.getConfiguration().get("task.c.hobby", "").trim();
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] f = CsvUtils.split(value.toString());
        // Skip short rows; compare the hobby in f[4] case-insensitively.
        if (f.length >= 5 && f[4].trim().equalsIgnoreCase(targetHobby)) {
            outKey.set(f[1].trim());
            outVal.set(f[2].trim());
            context.write(outKey, outVal);
        }
    }
}
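The matching rule in map() can be tried without a cluster. The following is a minimal sketch, not the project's code. It assumes rows are plain comma-separated text with no quoted fields, so it uses String.split in place of CsvUtils.split, whose behavior is not shown here. The class name HobbyFilterSketch is hypothetical.

```java
// Hypothetical stand-alone sketch of the MapperC filter rule; not part of circlenet.taskC.
class HobbyFilterSketch {

    // Returns {f[1], f[2]} (trimmed) when the fifth field matches the target hobby,
    // ignoring case and surrounding whitespace; returns null otherwise.
    // Assumption: a naive comma split stands in for CsvUtils.split.
    static String[] match(String line, String targetHobby) {
        String[] f = line.split(",", -1); // -1 keeps trailing empty fields
        if (f.length >= 5 && f[4].trim().equalsIgnoreCase(targetHobby.trim())) {
            return new String[] { f[1].trim(), f[2].trim() };
        }
        return null;
    }
}
```

For example, HobbyFilterSketch.match("1, v1 , v2 ,x, podcastbinging ", "PodcastBinging") yields the pair ("v1", "v2"), while a row with a different hobby or with fewer than five fields yields null. The sample row is made up and says nothing about the real column meanings.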
Reducer: PassReducer
Pass-through reducer: writes every (key, value) pair it receives from the mappers to the output unchanged.
public static class PassReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text v : values) {
            context.write(key, v);
        }
    }
}
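Although the reducer changes nothing, the pairs still pass through the shuffle, which groups them by key and hands keys to reduce() in sorted order. The following plain-Java sketch imitates that effect with a TreeMap so the shape of the output can be seen without Hadoop. It is an illustration under assumptions: the class name PassThroughSketch is hypothetical, String ordering stands in for Text ordering (the two agree for ASCII keys), and the tab separator assumes the default text output format, which this document does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of shuffle + PassReducer; not part of circlenet.taskC.
class PassThroughSketch {

    // Takes mapper output as {key, value} pairs, groups by key in sorted key order,
    // then emits every pair unchanged as "key<TAB>value".
    static List<String> run(List<String[]> mapOutput) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] kv : mapOutput) {
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            for (String v : e.getValue()) {
                out.add(e.getKey() + "\t" + v); // pass-through: no aggregation
            }
        }
        return out;
    }
}
```

The sketch keeps values in arrival order within a key; in a real job the order of values within one key is not guaranteed.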
Configuration
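The driver copies the third command-line argument into the Hadoop configuration under the key task.c.hobby, which MapperC.setup() reads back in each map task: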
Configuration conf = new Configuration();
conf.set("task.c.hobby", args[2]);
Example Usage
hadoop jar $JAR circlenet.taskC.TaskCSimple $PAGES $OUT/taskC/simple PodcastBinging
Notes
- An optimized version was also implemented, but it showed no performance gain over the simple version.
- The simple version is therefore the one recommended for production use.