CVAT provides comprehensive quality control features to ensure annotation accuracy and consistency. Quality control includes validation workflows, ground truth comparison, consensus annotation, and automated quality metrics.
Overview
CVAT’s quality system includes:
- **Quality settings**: configurable comparison parameters
- **Ground truth jobs**: reference annotations for validation
- **Quality reports**: automated comparison and metrics
- **Validation workflows**: review and approval stages
- **Consensus annotation**: multiple annotators for agreement analysis
- **Honeypot tasks**: hidden validation frames
Quality Settings
Quality settings define how annotations are compared and validated.
Creating Quality Settings
For a project:
```python
from cvat_sdk import Client, models

client = Client(url="https://app.cvat.ai")
client.login(("username", "password"))

# Create quality settings for a project
quality_settings = client.api_client.quality_api.create_settings(
    quality_settings_request=models.QualitySettingsRequest(
        project_id=1,
        iou_threshold=0.5,
        oks_sigma=0.09,
        line_thickness=0.01,
        low_overlap_threshold=0.3,
        compare_line_orientation=True,
        line_orientation_threshold=0.1,
        compare_groups=True,
        group_match_threshold=0.5,
        check_covered_annotations=True,
        object_visibility_threshold=0.5,
        panoptic_comparison=False,
        compare_attributes=True,
        target_metric="accuracy",
        target_metric_threshold=0.7,
        max_validations_per_job=3,
    )
)
print(f"Quality settings created with ID: {quality_settings[0].id}")
```
For a task:
```python
quality_settings = client.api_client.quality_api.create_settings(
    quality_settings_request=models.QualitySettingsRequest(
        task_id=10,
        inherit=False,  # don't inherit from the project
        iou_threshold=0.6,
        target_metric="precision",
        target_metric_threshold=0.8,
    )
)
```
Using the REST API:
```bash
curl -X POST "https://app.cvat.ai/api/quality/settings" \
  -H "Authorization: Token <your-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": 1,
    "iou_threshold": 0.5,
    "oks_sigma": 0.09,
    "target_metric": "accuracy",
    "target_metric_threshold": 0.7,
    "compare_attributes": true
  }'
```
Quality Setting Parameters
Shape Matching Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `iou_threshold` | float | 0.4 | IoU threshold for shape matching (0-1) |
| `oks_sigma` | float | 0.09 | OKS sigma for point matching (0-1) |
| `line_thickness` | float | 0.01 | Thickness for polyline matching |
| `low_overlap_threshold` | float | 0.3 | Threshold for low-overlap conflicts |
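As a point of reference for `iou_threshold`, this is roughly how IoU is computed for two axis-aligned boxes (an illustrative sketch, not CVAT's internal implementation; boxes are assumed to be `(x, y, width, height)` tuples):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x, y, width, height)."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    # Intersection rectangle (empty if the boxes don't overlap)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax1 + aw, bx1 + bw), min(ay1 + ah, by1 + bh)
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

# Two boxes shifted by half their width overlap with IoU = 1/3,
# below the 0.4 default threshold, so they would not match.
print(box_iou((0, 0, 10, 10), (5, 0, 10, 10)))  # → 0.3333333333333333
```

Pairs whose IoU falls below `iou_threshold` are reported as `missing_annotation`/`extra_annotation` conflicts rather than matches.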
Comparison Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `compare_line_orientation` | boolean | true | Compare polyline direction |
| `line_orientation_threshold` | float | 0.1 | Min IoU gain for direction mismatch |
| `compare_groups` | boolean | true | Compare annotation grouping |
| `group_match_threshold` | float | 0.5 | Min IoU for group matching |
| `check_covered_annotations` | boolean | true | Detect covered annotations |
| `object_visibility_threshold` | float | 0.5 | Min visibility for coverage check |
| `panoptic_comparison` | boolean | false | Use panoptic segmentation comparison |
| `compare_attributes` | boolean | true | Compare attribute values |
Quality Targets
| Parameter | Type | Default | Description |
|---|---|---|---|
| `target_metric` | string | "accuracy" | Primary metric: accuracy, precision, or recall |
| `target_metric_threshold` | float | 0.7 | Minimum quality threshold (0-1) |
| `max_validations_per_job` | integer | 0 | Max validation attempts (0 = unlimited) |
Advanced Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `inherit` | boolean | true | Inherit project settings (task-level only) |
| `empty_is_annotated` | boolean | false | Treat empty frames as annotated |
| `point_size_base` | string | "group_bbox_size" | Point size reference: `image_size` or `group_bbox_size` |
| `job_filter` | string | `{"==": [{"var": "type"}, "annotation"]}` | Filter for jobs included in validation |
Updating Quality Settings
```python
# Get existing settings
settings_list = client.api_client.quality_api.list_settings(project_id=1)
settings = settings_list.results[0]

# Update settings
client.api_client.quality_api.partial_update_settings(
    id=settings.id,
    patched_quality_settings_request=models.PatchedQualitySettingsRequest(
        iou_threshold=0.6,
        target_metric_threshold=0.75,
    )
)
```
Ground Truth Validation
Ground truth jobs provide reference annotations for quality comparison.
Creating Ground Truth Jobs
Random frame selection:
```python
# Create a GT job with random frames
gt_job = client.jobs.create(
    spec=models.JobWriteRequest(
        task_id=10,
        type="ground_truth",
        frame_selection_method="random_uniform",
        frame_count=50,   # 50 random frames
        random_seed=42,   # reproducible selection
    )
)
print(f"Created GT job {gt_job.id} with {len(gt_job.segment.frames)} frames")
```
Random per job:
```python
# Random frames from each job
gt_job = client.jobs.create(
    spec=models.JobWriteRequest(
        task_id=10,
        type="ground_truth",
        frame_selection_method="random_per_job",
        frames_per_job_count=10,  # 10 frames per job
        random_seed=42,
    )
)
```
Manual frame selection:
```python
# Specific frames
gt_job = client.jobs.create(
    spec=models.JobWriteRequest(
        task_id=10,
        type="ground_truth",
        frame_selection_method="manual",
        frames=[0, 10, 20, 30, 40, 50, 60, 70, 80, 90],
    )
)
```
Using the API:
```bash
curl -X POST "https://app.cvat.ai/api/jobs" \
  -H "Authorization: Token <your-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": 10,
    "type": "ground_truth",
    "frame_selection_method": "random_uniform",
    "frame_count": 50,
    "random_seed": 42
  }'
```
Each task can have only one ground truth job. The GT job must be annotated separately to serve as the quality reference.
GT Job Frame Selection
| Method | Parameters | Description |
|---|---|---|
| `random_uniform` | `frame_count` or `frame_share` | Random frames across the entire task |
| `random_per_job` | `frames_per_job_count` or `frames_per_job_share` | Random frames from each job |
| `manual` | `frames` (list) | Manually specified frame indices |
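For intuition, seeded `random_uniform` selection behaves like the following sketch (illustrative only; `select_gt_frames` is not an SDK function):

```python
import random

def select_gt_frames(total_frames, frame_count, seed):
    """Pick frame_count distinct frame indices, reproducibly, like random_uniform."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return sorted(rng.sample(range(total_frames), frame_count))

frames = select_gt_frames(total_frames=100, frame_count=5, seed=42)
print(frames)
```

Because the seed fixes the selection, rerunning task creation with the same `random_seed` yields the same GT frames.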
Honeypot Validation
Honeypot tasks contain hidden validation frames mixed with regular frames.
Creating Honeypot Tasks
Configure validation during task creation:
```python
# Create a task with honeypot frames
task = client.tasks.create(
    spec=models.TaskWriteRequest(
        name="Honeypot Task",
        project_id=1,
        segment_size=100,
    )
)

# Upload data with validation params
task.upload_data(
    resources=image_paths,
    params={
        "validation_params": {
            "mode": "gt_pool",
            "frame_selection_method": "random_per_job",
            "frames_per_job_count": 10,  # 10 honeypot frames per job
        }
    },
)
```
Validation modes:
- `gt`: standard ground truth validation
- `gt_pool`: honeypot validation with hidden frames
Managing Honeypot Frames
```python
# Update honeypot frames for a job
client.api_client.jobs_api.partial_update_validation_layout(
    id=job_id,
    patched_job_validation_layout_write_request={
        "frame_selection_method": "manual",
        "honeypot_real_frames": [5, 15, 25, 35, 45],
    },
)
```
Honeypot frames are only supported in 2D image tasks. Video tasks and 3D tasks cannot use honeypot validation.
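Conceptually, honeypot assignment draws each job's `honeypot_real_frames` from the shared GT pool. A hypothetical helper sketching that selection (not SDK code):

```python
import random

def assign_honeypots(gt_pool, per_job, n_jobs, seed=0):
    """Draw a set of honeypot frames for each job from a GT frame pool."""
    rng = random.Random(seed)
    # Each job gets its own random, sorted subset of the pool
    return [sorted(rng.sample(gt_pool, per_job)) for _ in range(n_jobs)]

layouts = assign_honeypots(gt_pool=list(range(50)), per_job=5, n_jobs=3)
for i, frames in enumerate(layouts):
    print(f"job {i}: honeypot_real_frames={frames}")
```

Because every job's hidden frames come from the same annotated pool, one expert GT pass validates all annotators.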
Quality Reports
Quality reports compare job annotations against ground truth.
Creating Quality Reports
For a job:
```python
# Create a quality report for a job
report_data = client.api_client.quality_api.create_report(
    quality_report_create_request=models.QualityReportCreateRequest(
        job_id=15
    )
)
report = report_data[0]
print(f"Quality report ID: {report.id}")
print(f"Accuracy: {report.summary['accuracy']:.2%}")
print(f"Precision: {report.summary['precision']:.2%}")
print(f"Recall: {report.summary['recall']:.2%}")
```
For a task:
```python
# Generate a report for an entire task
report = client.api_client.quality_api.create_report(
    quality_report_create_request=models.QualityReportCreateRequest(
        task_id=10
    )
)[0]
```
For a project:
```python
# Generate a project-wide quality report
report = client.api_client.quality_api.create_report(
    quality_report_create_request=models.QualityReportCreateRequest(
        project_id=1
    )
)[0]
```
Using the API:
```bash
# Create a task quality report
curl -X POST "https://app.cvat.ai/api/quality/reports" \
  -H "Authorization: Token <your-token>" \
  -H "Content-Type: application/json" \
  -d '{"task_id": 10}'
```
Quality Report Structure
A quality report includes:
```json
{
  "id": 1,
  "job_id": 15,
  "task_id": 10,
  "project_id": 1,
  "parent_id": null,
  "target": "job",
  "created_date": "2026-03-04T12:00:00Z",
  "target_last_updated": "2026-03-04T11:00:00Z",
  "gt_last_updated": "2026-03-04T10:00:00Z",
  "summary": {
    "frame_count": 50,
    "frame_share": 0.5,
    "conflict_count": 15,
    "warning_count": 8,
    "error_count": 7,
    "conflicts_by_type": {
      "missing_annotation": 5,
      "extra_annotation": 4,
      "low_overlap": 3,
      "mismatching_label": 2,
      "mismatching_attributes": 1
    },
    "annotations": {
      "valid_count": 85,
      "ds_count": 90,
      "gt_count": 95,
      "total_count": 100,
      "accuracy": 0.85,
      "precision": 0.944,
      "recall": 0.895
    }
  }
}
```
Quality Metrics
- **Accuracy** (`valid_count / total_count`): the share of correctly annotated objects; accounts for both false positives and false negatives.
- **Precision** (`valid_count / ds_count`): the share of annotated objects that are correct; high precision means few false positives.
- **Recall** (`valid_count / gt_count`): the share of ground truth objects that were found; high recall means few false negatives.
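Plugging the counts from the example report above into these formulas reproduces its summary values:

```python
def quality_metrics(valid_count, ds_count, gt_count, total_count):
    """Accuracy, precision, and recall as defined by the formulas above."""
    return {
        "accuracy": valid_count / total_count,
        "precision": valid_count / ds_count,
        "recall": valid_count / gt_count,
    }

# Counts from the sample report: 85 valid of 90 annotated, 95 GT, 100 total
m = quality_metrics(valid_count=85, ds_count=90, gt_count=95, total_count=100)
print(m)  # accuracy 0.85, precision ≈ 0.944, recall ≈ 0.895
```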
Conflict Types
| Conflict Type | Description |
|---|---|
| `missing_annotation` | Ground truth object not annotated |
| `extra_annotation` | Annotated object not in ground truth |
| `mismatching_label` | Wrong class label |
| `low_overlap` | IoU below threshold but above minimum |
| `mismatching_direction` | Polyline direction reversed |
| `mismatching_attributes` | Incorrect attribute values |
| `mismatching_groups` | Wrong object grouping |
| `covered_annotation` | Object obscured by another annotation |
Retrieving Quality Reports
```python
# List all quality reports for a job
reports = client.api_client.quality_api.list_reports(job_id=15)
for report in reports.results:
    print(f"Report {report.id}: accuracy {report.summary['accuracy']:.2%}")

# Get a specific report
report = client.api_client.quality_api.retrieve_report(id=1)

# Get conflicts for a report
conflicts = client.api_client.quality_api.list_conflicts(report_id=1)
for conflict in conflicts.results:
    print(f"Frame {conflict.frame}: {conflict.type} ({conflict.severity})")
    for ann_id in conflict.annotation_ids:
        print(f"  Annotation {ann_id.obj_id} in job {ann_id.job_id}")
```
Using the API:
```bash
# List reports
curl "https://app.cvat.ai/api/quality/reports?task_id=10" \
  -H "Authorization: Token <your-token>"

# Get report details
curl "https://app.cvat.ai/api/quality/reports/1" \
  -H "Authorization: Token <your-token>"

# Get conflicts
curl "https://app.cvat.ai/api/quality/conflicts?report_id=1" \
  -H "Authorization: Token <your-token>"
```
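Once conflicts are retrieved, grouping them makes triage easier. A small sketch using plain dicts shaped like the conflict objects above (the sample data is invented):

```python
from collections import Counter

def summarize_conflicts(conflicts):
    """Count conflicts by (type, severity) pair."""
    return Counter((c["type"], c["severity"]) for c in conflicts)

# Assumed example data in the shape returned by the conflicts endpoint
conflicts = [
    {"frame": 3, "type": "missing_annotation", "severity": "error"},
    {"frame": 3, "type": "low_overlap", "severity": "warning"},
    {"frame": 7, "type": "missing_annotation", "severity": "error"},
]
for (ctype, severity), n in summarize_conflicts(conflicts).items():
    print(f"{ctype} ({severity}): {n}")
```

Sorting the counter by count quickly surfaces the dominant conflict type, which usually points at a guideline gap rather than individual mistakes.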
Consensus Annotation
Consensus annotation assigns the same data to multiple annotators for agreement analysis.
Creating Consensus Tasks
```python
# Create a task with consensus replicas
task = client.tasks.create(
    spec=models.TaskWriteRequest(
        name="Consensus Task",
        project_id=1,
        segment_size=100,
        consensus_replicas=3,  # 3 annotators per segment
    )
)
print(f"Created task with {task.consensus_replicas} consensus replicas")
```
This creates 3 jobs for each segment, allowing multiple annotators to work on identical data.
Using the API:
```bash
curl -X POST "https://app.cvat.ai/api/tasks" \
  -H "Authorization: Token <your-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Consensus Task",
    "project_id": 1,
    "segment_size": 100,
    "consensus_replicas": 3
  }'
```
Analyzing Consensus
Compare annotations between consensus replicas:
```python
# Get all jobs in the task
task = client.tasks.retrieve(10)
jobs = task.get_jobs()

# Keep only the consensus replica jobs
consensus_jobs = [job for job in jobs if job.type == "consensus_replica"]

# Compare replicas pairwise
for i, job1 in enumerate(consensus_jobs):
    for job2 in consensus_jobs[i + 1:]:
        # Creating a cross-validation report requires temporarily
        # treating one of the jobs as the ground truth reference
        print(f"Comparing job {job1.id} vs job {job2.id}")
```
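One simple replica-agreement measure that needs no extra API calls is per-frame label overlap. This toy example (not CVAT's consensus merging) computes mean Jaccard agreement between two annotators' per-frame label sets:

```python
def frame_agreement(labels_a, labels_b):
    """Mean Jaccard overlap of two annotators' label sets, frame by frame."""
    scores = []
    for frame in labels_a.keys() | labels_b.keys():
        a, b = labels_a.get(frame, set()), labels_b.get(frame, set())
        union = a | b
        # Two empty frames count as perfect agreement
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)

a = {0: {"car", "person"}, 1: {"car"}}
b = {0: {"car"}, 1: {"car"}}
print(frame_agreement(a, b))  # frame 0: 1/2, frame 1: 1/1 → 0.75
```

Low agreement on specific frames is a good signal for which images to discuss when refining the labeling guidelines.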
Consensus replicas must be between 2 and the configured maximum (default: 10). Setting `consensus_replicas=0` disables consensus annotation.
Validation Workflows
Job Stages and States
Jobs progress through workflow stages:
Stages:
- `annotation`: initial annotation phase
- `validation`: review and quality check
- `acceptance`: final approval

States:
- `new`: not started
- `in progress`: work ongoing
- `completed`: finished
- `rejected`: needs rework
Moving Jobs Through Workflow
```python
# Move the job to the validation stage
job = client.jobs.retrieve(15)
job.update(models.PatchedJobWriteRequest(
    stage="validation",
    state="new",
))

# Approve the job (move to acceptance)
job.update(models.PatchedJobWriteRequest(
    stage="acceptance",
    state="completed",
))

# Reject the job (send back to annotation)
job.update(models.PatchedJobWriteRequest(
    stage="annotation",
    state="rejected",
))
```
Automatic Acceptance
Configure automatic acceptance based on quality thresholds:
```python
quality_settings = client.api_client.quality_api.create_settings(
    quality_settings_request=models.QualitySettingsRequest(
        task_id=10,
        target_metric="accuracy",
        target_metric_threshold=0.85,
        max_validations_per_job=3,  # at most 3 validation attempts
    )
)
```
If a job's quality exceeds `target_metric_threshold`, it can be accepted automatically. After `max_validations_per_job` attempts, manual review is required.
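This acceptance logic can be sketched as a small decision function (illustrative only; `auto_accept` is not part of the SDK, and the threshold is treated as inclusive here):

```python
def auto_accept(metric_value, threshold, attempts, max_attempts):
    """Decide what to do with a job after a validation round.

    Returns 'accept', 'retry', or 'manual_review'.
    """
    if metric_value >= threshold:
        return "accept"
    # max_attempts == 0 means unlimited validation attempts
    if max_attempts and attempts >= max_attempts:
        return "manual_review"
    return "retry"

print(auto_accept(0.90, 0.85, attempts=1, max_attempts=3))  # → accept
print(auto_accept(0.70, 0.85, attempts=3, max_attempts=3))  # → manual_review
print(auto_accept(0.70, 0.85, attempts=1, max_attempts=3))  # → retry
```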
Best Practices
**Configure quality settings early**

**Use ground truth strategically**
- Create GT jobs with 5-10% of total frames
- Use `random_per_job` for balanced coverage
- Ensure GT annotations are high quality
- Review GT jobs with domain experts
- Update GT as annotation guidelines evolve
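The 5-10% guideline translates directly into a frame count; a hypothetical helper for sizing a GT job:

```python
import math

def gt_job_size(total_frames, share=0.05):
    """Suggested GT frame count, defaulting to the 5% low end of the guideline."""
    return max(1, math.ceil(total_frames * share))

print(gt_job_size(2000))        # 5% of 2000 frames → 100
print(gt_job_size(2000, 0.10))  # 10% of 2000 frames → 200
```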
**Implement validation workflows**
- Assign jobs to annotators in the annotation stage
- Move to the validation stage for quality review
- Use quality reports to identify issues
- Provide feedback before rejecting jobs
- Track metrics over time to measure improvement

**Use consensus annotation wisely**
- Enable consensus for critical or ambiguous data
- Use 2-3 replicas (more is usually unnecessary)
- Analyze agreement to identify labeling issues
- Refine guidelines based on disagreements
- Balance cost against quality requirements

**Generate reports regularly during annotation**
- Track accuracy, precision, and recall trends
- Investigate common conflict types
- Identify struggling annotators for training
- Adjust quality settings based on results
Quality Control Workflow Example
Complete quality control setup:
```python
from cvat_sdk import Client, models

client = Client(url="https://app.cvat.ai")
client.login(("username", "password"))

# 1. Create a project with labels
project = client.projects.create(
    spec=models.ProjectWriteRequest(
        name="Quality Controlled Project",
        labels=[...],
    )
)

# 2. Configure quality settings
quality_settings = client.api_client.quality_api.create_settings(
    quality_settings_request=models.QualitySettingsRequest(
        project_id=project.id,
        iou_threshold=0.5,
        target_metric="accuracy",
        target_metric_threshold=0.85,
        max_validations_per_job=3,
        compare_attributes=True,
    )
)

# 3. Create a task
task = client.tasks.create(
    spec=models.TaskWriteRequest(
        name="Annotate Batch 1",
        project_id=project.id,
        segment_size=100,
        overlap=5,
    )
)

# 4. Upload data
task.upload_data(resources=image_paths)

# 5. Create a ground truth job
gt_job = client.jobs.create(
    spec=models.JobWriteRequest(
        task_id=task.id,
        type="ground_truth",
        frame_selection_method="random_per_job",
        frames_per_job_count=10,
    )
)

print("Quality control setup complete!")
print(f"Project ID: {project.id}")
print(f"Task ID: {task.id}")
print(f"GT Job ID: {gt_job.id}")
print("\nNext steps:")
print("1. Annotate the ground truth job with an expert annotator")
print("2. Assign annotation jobs to the team")
print("3. Generate quality reports during annotation")
print("4. Review and provide feedback")
```
Next Steps
- **Creating Projects**: set up new projects with quality settings
- **Managing Tasks**: create and manage tasks and jobs
- **Monitoring**: view quality metrics and performance analytics
- **API Reference**: quality API documentation