FiftyOne Integration

Overview

FiftyOne is an open-source tool for dataset visualization, exploration, and curation that integrates with CVAT. The integration lets you select samples using FiftyOne’s analytics, annotate them in CVAT, and load the results back into FiftyOne for review.

The FiftyOne integration is available for both CVAT Cloud and self-hosted installations.

What is FiftyOne?

FiftyOne is an open-source dataset curation and model analysis tool that provides:

Visual dataset exploration: Interactive browser-based dataset visualization
Dataset quality analysis: Identify issues, outliers, and edge cases
Model evaluation: Analyze model predictions and errors
Label refinement: Send samples to CVAT for annotation or correction
Embeddings visualization: Understand dataset structure and diversity

Prerequisites

Python 3.7 or higher
FiftyOne installed (pip install fiftyone)
CVAT account (Cloud or self-hosted)
CVAT API credentials

Installation

Install FiftyOne with CVAT integration support:

# Install FiftyOne
pip install fiftyone

# Install CVAT SDK (used by the examples that call the CVAT API directly)
pip install cvat-sdk

Verify the installation:

import fiftyone as fo
import fiftyone.zoo as foz

print(fo.__version__)

Connecting FiftyOne to CVAT

Configure FiftyOne to connect to your CVAT instance:

For CVAT Cloud

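import fiftyone as fo

# Connection settings for CVAT Cloud. FiftyOne accepts these as keyword
# arguments, for example view.annotate(anno_key, ..., **config).
# (CVATBackendConfig also requires a name and label schema, so it is not
# constructed directly here.)
config = {
    "url": "https://app.cvat.ai",
    "username": "your-username",
    "password": "your-password",
}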

For Self-Hosted CVAT

# Connection settings for a self-hosted instance, passed as keyword arguments
config = {
    "url": "https://your-cvat-instance.com",
    "username": "your-username",
    "password": "your-password",
}

Never hardcode credentials in your scripts. Use environment variables or a secure configuration file.

Using Environment Variables

export FIFTYONE_CVAT_URL="https://app.cvat.ai"
export FIFTYONE_CVAT_USERNAME="your-username"
export FIFTYONE_CVAT_PASSWORD="your-password"

Then in Python:

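import os
import fiftyone as fo

# FiftyOne reads the FIFTYONE_CVAT_* variables on its own, so credentials
# never need to appear in code. To pass them explicitly instead, build the
# keyword arguments from the environment:
config = {
    "url": os.getenv("FIFTYONE_CVAT_URL"),
    "username": os.getenv("FIFTYONE_CVAT_USERNAME"),
    "password": os.getenv("FIFTYONE_CVAT_PASSWORD"),
}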
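To report a missing variable up front, the values can be checked before any CVAT call is made. The following is a minimal sketch; the helper name `load_cvat_settings` is hypothetical and is not part of FiftyOne or the CVAT SDK.

```python
import os

# Names of the environment variables exported above.
REQUIRED_VARS = {
    "url": "FIFTYONE_CVAT_URL",
    "username": "FIFTYONE_CVAT_USERNAME",
    "password": "FIFTYONE_CVAT_PASSWORD",
}


def load_cvat_settings(environ=None):
    """Return CVAT connection settings read from the environment.

    Hypothetical helper: raises KeyError naming every variable that is
    unset or empty.
    """
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[var] for key, var in REQUIRED_VARS.items()}
```

The returned dictionary uses the keys `url`, `username`, and `password`, so it can be unpacked as keyword arguments wherever those settings are accepted.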

Workflow: FiftyOne to CVAT

1. Load and Explore Dataset in FiftyOne

Start by loading a dataset into FiftyOne:

import fiftyone as fo
import fiftyone.zoo as foz

# Load a dataset (example using COCO)
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    max_samples=100
)

# Launch FiftyOne App to explore
session = fo.launch_app(dataset)

2. Select Samples for Annotation

Use FiftyOne’s query capabilities to select samples:

# Select samples that need annotation
from fiftyone import ViewField as F

# Example: Select images without annotations
view = dataset.match(F("ground_truth.detections").length() == 0)

# Example: Select images with low confidence predictions
view = dataset.match(
    F("predictions.detections.confidence").max() < 0.7
)

# Example: Random sample for quality control
view = dataset.take(50)

3. Send Samples to CVAT

Export selected samples to CVAT for annotation:

# Define label schema
label_schema = {
    "ground_truth": {
        "type": "detections",
        "classes": ["person", "car", "bicycle", "dog", "cat"]
    }
}

# A unique key that identifies this annotation run in FiftyOne
anno_key = "cvat_batch_1"

# Upload to CVAT. Credentials are read from the FIFTYONE_CVAT_* environment
# variables; alternatively pass url=, username=, and password= here.
results = view.annotate(
    anno_key,
    backend="cvat",
    label_schema=label_schema,
    task_name="Dataset Annotation - Batch 1",
    task_size=10,  # Samples per task
    segment_size=1,  # Images per job
)

print(f"Created CVAT tasks: {results.task_ids}")

4. Annotate in CVAT

Annotators can now work on the task in CVAT using all available features:

Manual annotation tools
Automatic annotation with AI models
Quality control and review
Collaborative annotation

5. Import Annotations Back to FiftyOne

Once annotation is complete, import the results:

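# The key passed as the first argument to annotate() when the samples were sent
anno_key = "cvat_batch_1"

# Load annotations from CVAT into the dataset
dataset.load_annotations(anno_key)

print(f"Loaded annotations for {len(view)} samples")

# Refresh the FiftyOne App to see updates
session.refresh()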

Workflow: CVAT to FiftyOne

You can also import existing CVAT projects into FiftyOne:

Import CVAT Project

import fiftyone as fo
from fiftyone.utils.cvat import import_annotations

# Create an empty FiftyOne dataset to receive the CVAT task
dataset = fo.Dataset("cvat-import")

task_id = 12345

# Import the task's media and labels. Credentials are read from the
# FIFTYONE_CVAT_* environment variables; alternatively pass url=,
# username=, and password= as keyword arguments.
import_annotations(
    dataset,
    task_ids=[task_id],
    data_path="/path/to/images",  # Folder where the media is (or will be) stored
    download_media=True,  # Download media from CVAT if not already present
)

print(dataset)

Download CVAT Annotations

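# Download annotations for offline analysis
import zipfile

import fiftyone as fo
from cvat_sdk import make_client

client = make_client(
    host="https://app.cvat.ai",
    credentials=("username", "password")
)

# Download task annotations as a COCO archive (labels only, no images)
task = client.tasks.retrieve(12345)
task.export_dataset("COCO 1.0", "annotations.zip", include_images=False)

# Unpack the archive before loading it
with zipfile.ZipFile("annotations.zip") as archive:
    archive.extractall("annotations")

# Load into FiftyOne. The JSON path inside the archive is an assumption;
# check the extracted folder and adjust if needed.
dataset = fo.Dataset.from_dir(
    dataset_type=fo.types.COCODetectionDataset,
    data_path="images/",
    labels_path="annotations/annotations/instances_default.json"
)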
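The export is a zip archive, so the labels file has to be extracted before FiftyOne can read it. As a minimal sketch that avoids assuming where the JSON file sits inside the archive, the hypothetical helper below (`extract_annotation_json` is not part of either library) searches the archive for a single `.json` member and extracts it.

```python
import zipfile


def extract_annotation_json(archive_path, output_dir):
    """Extract the only .json member of a zip archive and return its path.

    Hypothetical helper: raises ValueError if the archive contains zero or
    several JSON files, since the right one cannot be chosen automatically.
    """
    with zipfile.ZipFile(archive_path) as archive:
        json_members = [
            name for name in archive.namelist() if name.lower().endswith(".json")
        ]
        if len(json_members) != 1:
            raise ValueError(f"Expected one JSON file, found: {json_members}")
        # ZipFile.extract returns the path of the extracted file
        return archive.extract(json_members[0], path=output_dir)
```

The returned path can then be used as the `labels_path` argument when loading the dataset.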

Advanced Use Cases

Dataset Quality Control

Use FiftyOne to identify annotation quality issues:

import fiftyone as fo
import fiftyone.brain as fob

# Load annotated dataset
dataset = fo.load_dataset("my_cvat_dataset")

# Compute uniqueness (find duplicates)
fob.compute_uniqueness(dataset)

# Find potential duplicates
duplicates_view = dataset.sort_by("uniqueness").limit(100)

# Visualize
session = fo.launch_app(duplicates_view)

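# Send duplicates back to CVAT for review.
# Credentials are read from the FIFTYONE_CVAT_* environment variables.
duplicates_view.annotate(
    "qc_duplicates",  # Unique key for this annotation run
    backend="cvat",
    label_field="ground_truth",
    task_name="Quality Control - Duplicates",
)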

Active Learning Pipeline

Implement an active learning workflow:

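import fiftyone as fo
import fiftyone.brain as fob

# Hypothetical dataset name; load your own pool of unlabeled samples
dataset = fo.load_dataset("unlabeled_pool")

# 1. Train model on initial dataset
# (model training code here; it must define `model`)

# 2. Run inference on unlabeled data, keeping logits for the hardness step
dataset.apply_model(model, label_field="predictions", store_logits=True)

# 3. Compute hardness scores (expects classification predictions with logits)
fob.compute_hardness(dataset, "predictions")

# 4. Select hard examples for annotation
hard_samples = dataset.sort_by("hardness", reverse=True).limit(100)

# 5. Send to CVAT for labeling.
# Credentials are read from the FIFTYONE_CVAT_* environment variables.
hard_samples.annotate(
    "active_learning_round_1",  # Unique key for this annotation run
    backend="cvat",
    label_field="ground_truth",
    task_name="Active Learning - Round 1",
)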

# 6. Import labels and retrain
# (repeat the cycle)
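
FiftyOne Brain computes the hardness score for you. To illustrate only the general idea of uncertainty-based selection, the sketch below ranks samples by the entropy of their predicted class probabilities. This is a generic stand-in and not FiftyOne's hardness measure; the function names and sample IDs are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_most_uncertain(predictions, k):
    """Return the k sample IDs with the highest predictive entropy.

    `predictions` maps a sample ID to its list of class probabilities.
    """
    ranked = sorted(
        predictions, key=lambda sid: entropy(predictions[sid]), reverse=True
    )
    return ranked[:k]


predictions = {
    "img_001": [0.98, 0.01, 0.01],  # confident prediction
    "img_002": [0.40, 0.35, 0.25],  # uncertain prediction
    "img_003": [0.70, 0.20, 0.10],
}
print(select_most_uncertain(predictions, k=2))  # ['img_002', 'img_003']
```

Samples whose probabilities are spread across classes rank first, which is the property an active learning round relies on when choosing what to label next.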

Model Evaluation with CVAT Refinement

import fiftyone as fo
from fiftyone import ViewField as F

# Load predictions and ground truth
dataset = fo.load_dataset("model_evaluation")

# Compute evaluation metrics
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval"
)

# Find false positives
fp_view = dataset.match(
    F("eval_fp") > 0
)

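# Send false positives to CVAT for label verification.
# Credentials are read from the FIFTYONE_CVAT_* environment variables.
fp_view.annotate(
    "false_positive_review",  # Unique key for this annotation run
    backend="cvat",
    label_field="ground_truth",
    task_name="False Positive Review",
)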

print(f"Sent {len(fp_view)} false positives for review")

Best Practices

Organize Your Workflow

Explore first: Use FiftyOne to understand your data before annotating
Strategic sampling: Annotate the most valuable samples first
Batch processing: Break large datasets into manageable CVAT tasks
Regular syncing: Import annotations frequently to track progress

Optimize Task Creation

Task size: 50-200 images per task works well
Job segments: 10-30 images per job for efficient annotation
Label consistency: Use the same label schema across all tasks
Clear naming: Use descriptive task names with dates/batches
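
The batching and naming guidelines above can be sketched in plain Python. The `task_size` option already splits an annotation run into tasks, so this is only for cases where explicit control over batch contents and names is wanted. The helper name `plan_tasks` and the naming pattern are assumptions made for illustration.

```python
from datetime import date


def plan_tasks(sample_ids, task_size, prefix, day=None):
    """Split sample IDs into consecutive batches and give each a task name.

    Returns a list of (task_name, batch) tuples. Hypothetical helper.
    """
    if task_size < 1:
        raise ValueError("task_size must be at least 1")
    day = day or date.today()
    batches = [
        sample_ids[i:i + task_size] for i in range(0, len(sample_ids), task_size)
    ]
    return [
        (f"{prefix} - {day.isoformat()} - Batch {number}", batch)
        for number, batch in enumerate(batches, start=1)
    ]
```

For example, `plan_tasks(ids, 100, "Street Scenes")` with 250 IDs yields three batches of 100, 100, and 50 samples, each named with the prefix, the date, and the batch number.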

Quality Assurance

Use FiftyOne to visualize annotations after import
Compare multiple annotator outputs
Identify and resolve label inconsistencies
Track annotation progress with metadata
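
One way to compare the outputs of two annotators is to match their boxes by label and overlap. The sketch below is a minimal illustration, assuming boxes in the `[x, y, width, height]` layout that FiftyOne uses for detections; the function names and the 0.5 threshold are hypothetical choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def find_disagreements(boxes_a, boxes_b, threshold=0.5):
    """Return annotator A's boxes that annotator B did not confirm.

    Each box is a (label, [x, y, width, height]) tuple. A box counts as
    confirmed when B has a box with the same label and IoU >= threshold.
    """
    unmatched = []
    for label_a, box_a in boxes_a:
        confirmed = any(
            label_a == label_b and iou(box_a, box_b) >= threshold
            for label_b, box_b in boxes_b
        )
        if not confirmed:
            unmatched.append((label_a, box_a))
    return unmatched
```

Boxes returned by `find_disagreements` are candidates for a review task, since either the label or the location differs between the two annotators.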

Troubleshooting

Connection Issues

Problem: Cannot connect to CVAT from FiftyOne

Solution: Test the connection directly with the CVAT SDK:

# Test connection
from cvat_sdk import make_client

client = make_client(
    host="https://app.cvat.ai",
    credentials=("username", "password")
)
print(client.api_client.configuration.host)

Label Schema Mismatch

Problem: Labels don’t match between FiftyOne and CVAT

Solution: Explicitly define label mappings:

label_mapping = {
    "fiftyone_label": "cvat_label"
}
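
A mapping only helps if it covers every label in use. As a minimal sketch, the hypothetical helper below (`remap_labels` is not a FiftyOne or CVAT function) applies a mapping to a list of label names and reports any names it does not cover, so mismatches surface before a task is created.

```python
def remap_labels(labels, label_mapping):
    """Translate label names and collect those missing from the mapping.

    Returns (mapped, unmapped): `mapped` keeps the input order, leaving
    unknown names unchanged; `unmapped` is a sorted list of unknown names.
    """
    mapped = [label_mapping.get(label, label) for label in labels]
    unmapped = sorted({label for label in labels if label not in label_mapping})
    return mapped, unmapped


mapped, unmapped = remap_labels(
    ["pedestrian", "car", "truck"],
    {"pedestrian": "person", "car": "car"},
)
print(mapped)    # ['person', 'car', 'truck']
print(unmapped)  # ['truck']
```

Inside FiftyOne, the `map_labels()` view stage can apply a dictionary like this to a label field.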

Large Dataset Performance

For large datasets:

Use dataset views to work with subsets
Enable sample caching in FiftyOne
Break into multiple smaller CVAT tasks
