Overview
JobTimer is a utility class that wraps Hadoop MapReduce job execution with automatic timing and performance logging. It records job execution times to both console output and a CSV file for analysis.
Package: circlenet.common
Source: /workspace/source/src/main/java/circlenet/common/JobTimer.java:9
Methods
run()
Executes a MapReduce job with timing and logging.
public static boolean run(Job job, String label) throws Exception
| Parameter | Type | Required | Description |
|---|---|---|---|
| job | org.apache.hadoop.mapreduce.Job | required | The configured Hadoop MapReduce job to execute |
| label | String | required | A descriptive label for this job (used in logging and CSV output) |
Returns: boolean - true if the job completed successfully, false otherwise
Throws: Exception - Any exception from job execution
Behavior
- Records start time
- Calls job.waitForCompletion(true) to execute the job
- Records end time and calculates duration
- Prints timing to console
- Appends timing data to CSV file
- Returns job success status
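The behavior above can be sketched in isolation. This is an illustrative reconstruction, not the actual JobTimer source: the class name TimedRun is hypothetical, the Hadoop waitForCompletion call is replaced by a generic Callable&lt;Boolean&gt; so the timing logic is self-contained, and the CSV append is omitted.

```java
import java.util.concurrent.Callable;

public class TimedRun {
    // Sketch of JobTimer.run(): record start time, execute the task,
    // compute the elapsed time, print it, and return the task's status.
    public static boolean run(Callable<Boolean> task, String label) throws Exception {
        long start = System.currentTimeMillis();
        boolean success = task.call();           // stands in for job.waitForCompletion(true)
        long elapsedMs = System.currentTimeMillis() - start;
        double elapsedS = elapsedMs / 1000.0;
        System.out.printf("%s time_ms=%d time_s=%.2f%n", label, elapsedMs, elapsedS);
        // The real class would also append a CSV row here before returning.
        return success;
    }
}
```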
Console Output
TaskA time_ms=45230 time_s=45.23
Example Usage
public static void main(String[] args) throws Exception {
long totalStart = System.currentTimeMillis();
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "TaskA - Hobby Frequency");
job.setJarByClass(TaskA.class);
job.setMapperClass(TaskAMapper.class);
job.setReducerClass(TaskAReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
boolean success = JobTimer.run(job, "TaskA");
JobTimer.total("TaskA", totalStart);
System.exit(success ? 0 : 1);
}
From TaskA.java:71-72 - single job execution.
total()
Records total execution time for a multi-job workflow.
public static void total(String label, long totalStartMs)
| Parameter | Type | Required | Description |
|---|---|---|---|
| label | String | required | A descriptive label for the overall workflow |
| totalStartMs | long | required | The start time in milliseconds (from System.currentTimeMillis()) |
Returns: void
Behavior
- Calculates elapsed time from totalStartMs to now
- Prints total time to console
- Appends total timing to CSV with phase="total" and success=1
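The elapsed-time calculation above can be sketched as follows; this is an illustrative reconstruction (the class name TotalTimer is hypothetical, and the method returns the elapsed time here only so the sketch is checkable — the real total() is void and also appends a CSV row).

```java
public class TotalTimer {
    // Sketch of JobTimer.total(): elapsed time from totalStartMs to now.
    public static long total(String label, long totalStartMs) {
        long elapsedMs = System.currentTimeMillis() - totalStartMs;
        System.out.printf("%s total_time_ms=%d total_time_s=%.3f%n",
                label, elapsedMs, elapsedMs / 1000.0);
        return elapsedMs; // the real method is void and appends a phase="total" CSV row
    }
}
```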
Console Output
TaskBSimple total_time_ms=152847 total_time_s=152.847
Example Usage
public static void main(String[] args) throws Exception {
long totalStart = System.currentTimeMillis();
// Job 1
Job job1 = Job.getInstance(conf, "TaskB Simple - Count Access");
// ... configure job1 ...
if (!JobTimer.run(job1, "TaskBSimple-job1-count")) {
JobTimer.total("TaskBSimple", totalStart);
System.exit(1);
}
// Job 2
Job job2 = Job.getInstance(conf, "TaskB Simple - Join");
// ... configure job2 ...
if (!JobTimer.run(job2, "TaskBSimple-job2-join")) {
JobTimer.total("TaskBSimple", totalStart);
System.exit(1);
}
// Job 3
Job job3 = Job.getInstance(conf, "TaskB Simple - Top10");
// ... configure job3 ...
boolean ok = JobTimer.run(job3, "TaskBSimple-job3-top10");
JobTimer.total("TaskBSimple", totalStart);
System.exit(ok ? 0 : 1);
}
From TaskBSimple.java:149-182 - multi-job workflow with total time tracking.
File Location
The CSV file path is determined by:
- Environment variable CIRCLENET_TIMING_FILE (if set)
- Default: task_times.csv in the current directory
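The fallback logic can be sketched as below. This is an assumption about the implementation, not the actual source: the class name TimingFilePath is hypothetical, and the environment value is passed as a parameter for testability where the real class would call System.getenv("CIRCLENET_TIMING_FILE").

```java
public class TimingFilePath {
    static final String DEFAULT = "task_times.csv";

    // Use the environment value when present and non-empty, else the default.
    public static String resolve(String envValue) {
        return (envValue == null || envValue.isEmpty()) ? DEFAULT : envValue;
    }
}
```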
CSV Schema
timestamp,label,phase,time_ms,time_s,success
2026-03-01T14:23:45.123,TaskA,job,45230,45.23,1
2026-03-01T14:23:45.456,TaskA,total,45250,45.25,1
2026-03-01T14:25:12.789,TaskBSimple-job1-count,job,42100,42.1,1
2026-03-01T14:26:05.234,TaskBSimple-job2-join,job,63450,63.45,1
2026-03-01T14:27:13.567,TaskBSimple-job3-top10,job,47297,47.297,1
2026-03-01T14:27:13.590,TaskBSimple,total,152847,152.847,1
Field Descriptions
| Field | Type | Description |
|---|---|---|
| timestamp | ISO DateTime | When the measurement was recorded |
| label | String | Job or workflow label |
| phase | String | Either "job" or "total" |
| time_ms | Long | Execution time in milliseconds |
| time_s | Double | Execution time in seconds (decimal) |
| success | Int | 1 for success, 0 for failure |
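A row matching this schema can be assembled as in the sketch below (the class name CsvRow is hypothetical; Java's default double formatting drops trailing zeros, matching the sample rows such as 45.23 and 42.1).

```java
public class CsvRow {
    // Build one row in the schema: timestamp,label,phase,time_ms,time_s,success
    public static String format(String timestamp, String label, String phase,
                                long timeMs, boolean success) {
        return timestamp + "," + label + "," + phase + ","
                + timeMs + "," + (timeMs / 1000.0) + "," + (success ? 1 : 0);
    }
}
```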
CSV Features
- Header Row: Automatically written when creating new file
- Append Mode: New entries appended without overwriting
- Thread Safety: Synchronized write method prevents race conditions
- Error Handling: Write failures print to stderr but don’t stop job execution
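The four features above can be combined as in this sketch — an assumption about the implementation, not the actual source (the class name TimingLog is hypothetical): header on first creation, append mode, a synchronized method, and write failures reported to stderr rather than thrown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class TimingLog {
    private static final String HEADER = "timestamp,label,phase,time_ms,time_s,success";

    // Synchronized append: write the header only when the file is first created,
    // then append the row; IO failures print to stderr and do not propagate.
    public static synchronized void append(Path file, String row) {
        try {
            if (Files.notExists(file)) {
                Files.write(file, List.of(HEADER), StandardOpenOption.CREATE);
            }
            Files.write(file, List.of(row), StandardOpenOption.APPEND);
        } catch (IOException e) {
            System.err.println("timing log write failed: " + e.getMessage());
        }
    }
}
```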
Usage Patterns
Single Job Timing
long totalStart = System.currentTimeMillis();
Job job = Job.getInstance(conf, "My Job");
// Configure job...
boolean success = JobTimer.run(job, "MyJob");
JobTimer.total("MyJob", totalStart);
System.exit(success ? 0 : 1);
Multi-Job Pipeline
long totalStart = System.currentTimeMillis();
// Run jobs sequentially
if (!JobTimer.run(job1, "Pipeline-step1")) {
JobTimer.total("Pipeline", totalStart);
System.exit(1);
}
if (!JobTimer.run(job2, "Pipeline-step2")) {
JobTimer.total("Pipeline", totalStart);
System.exit(1);
}
boolean ok = JobTimer.run(job3, "Pipeline-step3");
JobTimer.total("Pipeline", totalStart);
System.exit(ok ? 0 : 1);
Custom CSV Location
Set environment variable before running:
export CIRCLENET_TIMING_FILE=/data/metrics/job_performance.csv
hadoop jar circlenet.jar circlenet.taskA.TaskA input output
The CSV output can be analyzed using standard tools:
# View all job times
cat task_times.csv | column -t -s,
# Calculate average job time
awk -F, 'NR>1 && $3=="job" {sum+=$4; count++} END {print sum/count}' task_times.csv
# Find slowest jobs (job rows only, header excluded)
awk -F, '$3=="job"' task_times.csv | sort -t, -k4 -nr | head -n 10
Design Notes
- Utility Class: Constructor is private; all methods are static
- Thread Safety: CSV append is synchronized to prevent concurrent write issues
- Non-Blocking Errors: CSV write failures don’t crash the job
- Console + File: Dual output ensures visibility during execution and historical analysis
- ISO Timestamps: Uses LocalDateTime.toString() for the standard ISO-8601 format
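As a quick check of that timestamp format, a minimal example (the class name IsoStamp is hypothetical):

```java
import java.time.LocalDateTime;

public class IsoStamp {
    // LocalDateTime.toString() yields an ISO-8601 local date-time,
    // e.g. 2026-03-01T14:23:45.123; note the fractional part (and even
    // seconds) are omitted when zero.
    public static String now() {
        return LocalDateTime.now().toString();
    }
}
```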