
Overview

JobTimer is a utility class that wraps Hadoop MapReduce job execution with automatic timing and performance logging. It records job execution times to both console output and a CSV file for later analysis.

Package: circlenet.common
Source: /workspace/source/src/main/java/circlenet/common/JobTimer.java:9

Methods

run()

Executes a MapReduce job with timing and logging.
public static boolean run(Job job, String label) throws Exception

Parameter  Type                             Required  Description
job        org.apache.hadoop.mapreduce.Job  required  The configured Hadoop MapReduce job to execute
label      String                           required  A descriptive label for this job (used in logging and CSV output)

Returns: boolean - true if the job completed successfully, false otherwise
Throws: Exception - any exception thrown during job execution

Behavior

  1. Records start time
  2. Calls job.waitForCompletion(true) to execute the job
  3. Records end time and calculates duration
  4. Prints timing to console
  5. Appends timing data to CSV file
  6. Returns job success status

Console Output

TaskA time_ms=45230 time_s=45.23
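Judging from the sample line, the seconds value is simply the millisecond count divided by 1000. A minimal sketch of that formatting (the helper name `formatTiming` is illustrative only, not part of JobTimer's actual API):

```java
public class TimingFormatSketch {
    // Builds a console line like "TaskA time_ms=45230 time_s=45.23".
    // Concatenating a double uses Double.toString, which prints the
    // shortest round-trip form and matches the sample output above.
    static String formatTiming(String label, long elapsedMs) {
        return label + " time_ms=" + elapsedMs + " time_s=" + (elapsedMs / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println(formatTiming("TaskA", 45230L));
    }
}
```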

Example Usage

public static void main(String[] args) throws Exception {
    long totalStart = System.currentTimeMillis();
    
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "TaskA - Hobby Frequency");
    
    job.setJarByClass(TaskA.class);
    job.setMapperClass(TaskAMapper.class);
    job.setReducerClass(TaskAReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    
    boolean success = JobTimer.run(job, "TaskA");
    JobTimer.total("TaskA", totalStart);
    
    System.exit(success ? 0 : 1);
}
From TaskA.java:71-72 - single job execution.

total()

Records total execution time for a multi-job workflow.
public static void total(String label, long totalStartMs)

Parameter     Type    Required  Description
label         String  required  A descriptive label for the overall workflow
totalStartMs  long    required  The start time in milliseconds (from System.currentTimeMillis())

Returns: void

Behavior

  1. Calculates elapsed time from totalStartMs to now
  2. Prints total time to console
  3. Appends total timing to CSV with phase="total" and success=true

Console Output

TaskBSimple total_time_ms=152847 total_time_s=152.847
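The total line follows the same pattern as run()'s, with total_ prefixes. A sketch of the formatting; the explicit nowMs parameter is an illustration for testability, since the real total() presumably reads the clock itself:

```java
public class TotalFormatSketch {
    // Builds a line like "TaskBSimple total_time_ms=152847 total_time_s=152.847".
    static String formatTotal(String label, long totalStartMs, long nowMs) {
        long elapsed = nowMs - totalStartMs;
        return label + " total_time_ms=" + elapsed + " total_time_s=" + (elapsed / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println(formatTotal("TaskBSimple", 0L, 152847L));
    }
}
```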

Example Usage

public static void main(String[] args) throws Exception {
    long totalStart = System.currentTimeMillis();
    
    // Job 1
    Job job1 = Job.getInstance(conf, "TaskB Simple - Count Access");
    // ... configure job1 ...
    if (!JobTimer.run(job1, "TaskBSimple-job1-count")) {
        JobTimer.total("TaskBSimple", totalStart);
        System.exit(1);
    }
    
    // Job 2
    Job job2 = Job.getInstance(conf, "TaskB Simple - Join");
    // ... configure job2 ...
    if (!JobTimer.run(job2, "TaskBSimple-job2-join")) {
        JobTimer.total("TaskBSimple", totalStart);
        System.exit(1);
    }
    
    // Job 3
    Job job3 = Job.getInstance(conf, "TaskB Simple - Top10");
    // ... configure job3 ...
    boolean ok = JobTimer.run(job3, "TaskBSimple-job3-top10");
    JobTimer.total("TaskBSimple", totalStart);
    
    System.exit(ok ? 0 : 1);
}
From TaskBSimple.java:149-182 - multi-job workflow with total time tracking.

CSV Output Format

File Location

The CSV file path is determined by:
  1. Environment variable CIRCLENET_TIMING_FILE (if set)
  2. Default: task_times.csv in current directory
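The lookup order above can be sketched as follows; `resolvePath` is a hypothetical helper name, not JobTimer's actual internal method:

```java
public class TimingFileSketch {
    // Resolves the CSV path: the environment variable wins when set,
    // otherwise fall back to the default name in the current directory.
    static String resolvePath(String envValue) {
        return (envValue == null || envValue.isEmpty()) ? "task_times.csv" : envValue;
    }

    public static void main(String[] args) {
        String path = resolvePath(System.getenv("CIRCLENET_TIMING_FILE"));
        System.out.println("timing file: " + path);
    }
}
```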

CSV Schema

timestamp,label,phase,time_ms,time_s,success
2026-03-01T14:23:45.123,TaskA,job,45230,45.23,1
2026-03-01T14:23:45.456,TaskA,total,45250,45.25,1
2026-03-01T14:25:12.789,TaskBSimple-job1-count,job,42100,42.1,1
2026-03-01T14:26:05.234,TaskBSimple-job2-join,job,63450,63.45,1
2026-03-01T14:27:13.567,TaskBSimple-job3-top10,job,47297,47.297,1
2026-03-01T14:27:13.590,TaskBSimple,total,152847,152.847,1

Field Descriptions

Field      Type          Description
timestamp  ISO DateTime  When the measurement was recorded
label      String        Job or workflow label
phase      String        Either "job" or "total"
time_ms    Long          Execution time in milliseconds
time_s     Double        Execution time in seconds (decimal)
success    Int           1 for success, 0 for failure

CSV Features

  • Header Row: Automatically written when creating new file
  • Append Mode: New entries appended without overwriting
  • Thread Safety: Synchronized write method prevents race conditions
  • Error Handling: Write failures print to stderr but don’t stop job execution
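A sketch of how an append with these four properties (header on create, append mode, synchronized writes, non-fatal errors) might look; the method name and exact column formatting are assumptions, not JobTimer's actual code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDateTime;

public class CsvAppendSketch {
    private static final Object LOCK = new Object();

    // Appends one timing row, writing the header first if the file is new.
    static void appendRow(Path csv, String label, String phase,
                          long timeMs, boolean success) {
        String row = LocalDateTime.now() + "," + label + "," + phase + ","
                + timeMs + "," + (timeMs / 1000.0) + "," + (success ? 1 : 0) + "\n";
        synchronized (LOCK) {
            try {
                if (Files.notExists(csv)) {
                    Files.writeString(csv,
                            "timestamp,label,phase,time_ms,time_s,success\n",
                            StandardOpenOption.CREATE);
                }
                Files.writeString(csv, row, StandardOpenOption.APPEND);
            } catch (IOException e) {
                // Non-fatal: report to stderr and let the job continue.
                System.err.println("timing CSV write failed: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("task_times", ".csv");
        Files.delete(csv); // start from a missing file so the header is written
        appendRow(csv, "TaskA", "job", 45230, true);
        appendRow(csv, "TaskA", "total", 45250, true);
        System.out.println(Files.readAllLines(csv).size() + " lines"); // header + 2 rows
    }
}
```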

Usage Patterns

Single Job Timing

long totalStart = System.currentTimeMillis();
Job job = Job.getInstance(conf, "My Job");
// Configure job...
boolean success = JobTimer.run(job, "MyJob");
JobTimer.total("MyJob", totalStart);
System.exit(success ? 0 : 1);

Multi-Job Pipeline

long totalStart = System.currentTimeMillis();

// Run jobs sequentially
if (!JobTimer.run(job1, "Pipeline-step1")) {
    JobTimer.total("Pipeline", totalStart);
    System.exit(1);
}

if (!JobTimer.run(job2, "Pipeline-step2")) {
    JobTimer.total("Pipeline", totalStart);
    System.exit(1);
}

boolean ok = JobTimer.run(job3, "Pipeline-step3");
JobTimer.total("Pipeline", totalStart);
System.exit(ok ? 0 : 1);

Custom CSV Location

Set environment variable before running:
export CIRCLENET_TIMING_FILE=/data/metrics/job_performance.csv
hadoop jar circlenet.jar circlenet.taskA.TaskA input output

Analyzing Performance

The CSV output can be analyzed using standard tools:
# View all job times
cat task_times.csv | column -t -s,

# Calculate average job time
awk -F, 'NR>1 && $3=="job" {sum+=$4; count++} END {print sum/count}' task_times.csv

# Find slowest jobs (skip the header row so it doesn't appear in the results)
tail -n +2 task_times.csv | sort -t, -k4 -nr | head -n 10

Design Notes

  • Utility Class: Constructor is private; all methods are static
  • Thread Safety: CSV append is synchronized to prevent concurrent write issues
  • Non-Blocking Errors: CSV write failures don’t crash the job
  • Console + File: Dual output ensures visibility during execution and historical analysis
  • ISO Timestamps: Uses LocalDateTime.toString() for standard date format
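LocalDateTime.toString() produces an ISO-8601 local date-time with no zone offset, which matches the timestamp column shown above. A quick check that the format round-trips through the standard parser:

```java
import java.time.LocalDateTime;

public class TimestampSketch {
    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        String ts = now.toString(); // e.g. 2026-03-01T14:23:45.123
        // The string parses back to an equal LocalDateTime.
        System.out.println(LocalDateTime.parse(ts).equals(now));
    }
}
```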
