Quickstart

This guide will walk you through the most common Panlabel workflows. You’ll learn how to convert datasets, validate them, generate statistics, and more.

Make sure you have installed Panlabel before following this guide.

Basic Conversion

Convert COCO to YOLO

The most common use case is converting COCO annotations to YOLO format:

panlabel convert --from auto --to yolo -i annotations.json -o ./yolo_out --allow-lossy

The --from auto flag automatically detects the input format. The --allow-lossy flag is required when converting to formats that don’t preserve all data.

Output:

✓ Detected format: coco
✓ Loaded 1000 images, 5000 annotations
⚠ Conversion is lossy:
  - Segmentation masks will be dropped
  - Image metadata will be lost
✓ Wrote YOLO dataset to ./yolo_out
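
To make the lossy-conversion warning more concrete, here is a minimal Python sketch of the box arithmetic that a COCO-to-YOLO conversion involves. It is an illustration only, not Panlabel's implementation, and the function name coco_bbox_to_yolo is hypothetical. COCO stores each box as [x_min, y_min, width, height] in pixels, while a YOLO detection label stores a class index plus a box center and size normalized by the image dimensions.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x_min, y_min, width, height] pixel box to
    YOLO (cx, cy, w, h), each normalized to the 0..1 range.

    Hypothetical helper for illustration; not part of Panlabel.
    """
    x_min, y_min, w, h = bbox
    cx = (x_min + w / 2) / img_w  # box center, as a fraction of image width
    cy = (y_min + h / 2) / img_h  # box center, as a fraction of image height
    return (cx, cy, w / img_w, h / img_h)


# A 200x100 px box with its top-left corner at (100, 50) in a 640x480 image.
print(coco_bbox_to_yolo([100, 50, 200, 100], img_w=640, img_h=480))
```

A YOLO detection label line has room only for the class index and these four numbers, which is consistent with the warning above that segmentation masks and image metadata do not carry over.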

Convert YOLO to COCO

Going the other direction:

panlabel convert -f yolo -t coco -i ./my_dataset -o coco_output.json

Output:

✓ Loaded YOLO dataset from ./my_dataset
✓ Found 800 images, 3500 annotations
✓ Wrote COCO JSON to coco_output.json
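
The reverse direction can be sketched the same way. Because YOLO coordinates are normalized, turning them back into COCO pixel boxes requires the width and height of each image. The helper below is a hypothetical illustration of that arithmetic, not Panlabel's own code.

```python
def yolo_bbox_to_coco(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO (cx, cy, w, h) box to a COCO
    [x_min, y_min, width, height] box in pixels.

    Hypothetical helper for illustration; not part of Panlabel.
    """
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2  # move from center to top-left corner
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


# A centered box covering a quarter of the width and half of the height.
print(yolo_bbox_to_coco(0.5, 0.5, 0.25, 0.5, img_w=640, img_h=480))
# prints [240.0, 120.0, 160.0, 240.0]
```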

Try other format conversions

Panlabel supports many other formats. For example, to convert a Pascal VOC dataset to COCO:

panlabel convert -f voc -t coco -i ./voc_dataset -o coco_output.json

Validate Your Dataset

Check for common problems

Before training a model, validate your dataset:

panlabel validate --format coco annotations.json

Output:

✓ Validating COCO dataset...
✓ Found 1000 images, 5000 annotations, 10 categories

Issues found:
⚠ Warning: 3 images have no annotations
⚠ Warning: 2 annotations have zero area
✗ Error: Duplicate annotation ID: 42
✗ Error: Annotation 123 references non-existent image ID: 999

Summary: 2 errors, 5 warnings

Run validation before training to catch data issues early. Malformed annotations, such as duplicate IDs or references to missing images, are a common cause of training failures.
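
If you want a feel for what checks like these involve, the following Python sketch reproduces the four issue types from the sample output on a COCO-style dictionary. It is an assumption-laden illustration: the function name find_issues and the exact messages are hypothetical, and Panlabel's real validator may check more or behave differently.

```python
from collections import Counter


def find_issues(coco):
    """Return (errors, warnings) for a COCO-style dict.

    Hypothetical sketch of common checks; not Panlabel's validator.
    """
    errors, warnings = [], []
    image_ids = {img["id"] for img in coco["images"]}

    # Annotation IDs must be unique.
    id_counts = Counter(ann["id"] for ann in coco["annotations"])
    for ann_id, count in id_counts.items():
        if count > 1:
            errors.append(f"Duplicate annotation ID: {ann_id}")

    annotated = set()
    for ann in coco["annotations"]:
        # Every annotation must point at an image that exists.
        if ann["image_id"] not in image_ids:
            errors.append(
                f"Annotation {ann['id']} references non-existent "
                f"image ID: {ann['image_id']}"
            )
        else:
            annotated.add(ann["image_id"])
        # A box with zero width or height has zero area.
        _, _, w, h = ann["bbox"]
        if w * h == 0:
            warnings.append(f"Annotation {ann['id']} has zero area")

    # Images without annotations are suspicious but not invalid.
    for image_id in sorted(image_ids - annotated):
        warnings.append(f"Image {image_id} has no annotations")
    return errors, warnings


example = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [
        {"id": 42, "image_id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 42, "image_id": 1, "bbox": [5, 5, 0, 10]},
        {"id": 123, "image_id": 999, "bbox": [0, 0, 4, 4]},
    ],
}
errors, warnings = find_issues(example)
print(errors)
print(warnings)
```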

Get Dataset Statistics

Generate a statistical overview

Understand your dataset composition:

panlabel stats --format coco annotations.json

Output:

Dataset Statistics
==================

Images: 1,000
Annotations: 5,000
Categories: 10

Annotations per image:
  Mean: 5.0
  Median: 4.0
  Min: 0
  Max: 25

Category distribution:
  person: 2,000 (40%)
  car: 1,500 (30%)
  dog: 800 (16%)
  cat: 700 (14%)
  ...

Bounding box sizes:
  Mean area: 12,500 px²
  Median area: 8,000 px²
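
The per-image and per-category figures above are simple aggregates. As a rough sketch of how they can be derived from a COCO-style dictionary (the function name annotation_stats and the returned keys are hypothetical, and this is not how Panlabel necessarily computes them):

```python
from collections import Counter
from statistics import mean, median


def annotation_stats(coco):
    """Summarize annotations per image and per category.

    Hypothetical sketch; not Panlabel's stats implementation.
    """
    # Start every image at zero so unannotated images are counted too.
    per_image = Counter({img["id"]: 0 for img in coco["images"]})
    per_image.update(ann["image_id"] for ann in coco["annotations"])
    counts = list(per_image.values())

    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    per_category = Counter(names[ann["category_id"]] for ann in coco["annotations"])

    return {
        "mean": mean(counts),
        "median": median(counts),
        "min": min(counts),
        "max": max(counts),
        "categories": dict(per_category),
    }


example = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "car"}],
    "annotations": (
        [{"image_id": 1, "category_id": 1}] * 5
        + [{"image_id": 2, "category_id": 2}] * 2
        + [{"image_id": 3, "category_id": 1}]
    ),
}
print(annotation_stats(example))
```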

Export as JSON or HTML

Get machine-readable stats or a visual report. For example, to export the statistics as JSON:

panlabel stats --format coco annotations.json --output json > stats.json

Compare Datasets

Find differences between datasets

Compare two versions of your dataset:

panlabel diff --format-a auto --format-b auto old.json new.json

Output:

Dataset Comparison
==================

Images:
  Added: 50
  Removed: 10
  Modified: 5

Annotations:
  Added: 200
  Removed: 30
  Modified: 15

Categories:
  Added: person, car
  Removed: bicycle
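
Conceptually, the "Added" and "Removed" lines are set differences between the two datasets. The Python sketch below shows that idea for images and categories in two COCO-style dictionaries. It assumes images are matched by file_name and categories by name, which is an assumption for illustration; this guide does not state how Panlabel matches items, and the function name diff_datasets is hypothetical.

```python
def diff_datasets(old, new):
    """Report added and removed images and categories.

    Hypothetical sketch that matches images by file_name and
    categories by name; not Panlabel's diff implementation.
    """
    old_images = {img["file_name"] for img in old["images"]}
    new_images = {img["file_name"] for img in new["images"]}
    old_cats = {cat["name"] for cat in old["categories"]}
    new_cats = {cat["name"] for cat in new["categories"]}
    return {
        "images_added": sorted(new_images - old_images),
        "images_removed": sorted(old_images - new_images),
        "categories_added": sorted(new_cats - old_cats),
        "categories_removed": sorted(old_cats - new_cats),
    }


old = {
    "images": [{"file_name": "a.jpg"}, {"file_name": "b.jpg"}],
    "categories": [{"name": "person"}, {"name": "bicycle"}],
}
new = {
    "images": [{"file_name": "b.jpg"}, {"file_name": "c.jpg"}],
    "categories": [{"name": "person"}, {"name": "car"}],
}
print(diff_datasets(old, new))
```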

Sample a Subset

Create a smaller dataset for testing

Extract a random subset of your data:

panlabel sample -i annotations.json -o sample.ir.json --from auto --to ir-json -n 100 --seed 42

This creates a dataset with exactly 100 images, randomly selected with a fixed seed for reproducibility.
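
To see why a fixed seed makes the subset reproducible, consider this minimal Python sketch of seeded sampling on a COCO-style dictionary. It is an illustration under assumptions, not Panlabel's sampler: the function name sample_images is hypothetical, and Panlabel may use a different random number generator, so the same seed will not select the same images as the CLI.

```python
import random


def sample_images(coco, n, seed):
    """Pick n images at random and keep only their annotations.

    Hypothetical sketch; not Panlabel's sampling implementation.
    """
    rng = random.Random(seed)  # private generator, so the seed fully decides the result
    n = min(n, len(coco["images"]))
    images = rng.sample(coco["images"], n)
    kept = {img["id"] for img in images}
    annotations = [a for a in coco["annotations"] if a["image_id"] in kept]
    return {**coco, "images": images, "annotations": annotations}


dataset = {
    "images": [{"id": i} for i in range(10)],
    "annotations": [{"id": i, "image_id": i} for i in range(10)],
}
first = sample_images(dataset, n=3, seed=42)
second = sample_images(dataset, n=3, seed=42)
print(first["images"] == second["images"])  # prints True
```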

Use stratified sampling

Maintain category distribution in your sample:

panlabel sample -i annotations.json -o sample.json --from auto --to coco -n 100 --strategy stratified

Stratified sampling ensures that rare categories are represented proportionally in your subset.
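
One way to picture stratified sampling is to group images by category and draw from each group in proportion to its size. The Python sketch below does this using each image's most frequent category as its group, which is a simplification chosen for illustration. The function name stratified_sample is hypothetical, and Panlabel's actual strategy may differ.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(coco, n, seed):
    """Return about n image IDs, drawn proportionally per category group.

    Hypothetical sketch; not Panlabel's stratified strategy.
    """
    rng = random.Random(seed)

    # Count categories per image so each image can be assigned to one group.
    cats_per_image = defaultdict(Counter)
    for ann in coco["annotations"]:
        cats_per_image[ann["image_id"]][ann["category_id"]] += 1

    groups = defaultdict(list)
    for img in coco["images"]:
        counts = cats_per_image.get(img["id"])
        # Unannotated images go into their own group, keyed by None.
        key = counts.most_common(1)[0][0] if counts else None
        groups[key].append(img["id"])

    total = len(coco["images"])
    picked = []
    for ids in groups.values():
        # Rounding means the overall total can differ slightly from n.
        quota = round(n * len(ids) / total)
        picked.extend(rng.sample(ids, min(quota, len(ids))))
    return picked


dataset = {
    "images": [{"id": i} for i in range(100)],
    "annotations": [
        {"image_id": i, "category_id": 1 if i < 80 else 2} for i in range(100)
    ],
}
picked = stratified_sample(dataset, n=10, seed=42)
print(sum(1 for i in picked if i < 80), sum(1 for i in picked if i >= 80))
# prints 8 2
```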

List Supported Formats

View format capabilities

See which formats Panlabel supports:

panlabel list-formats

Output:

Supported Formats
=================

ir-json
  Read: ✓  Write: ✓  Lossless: ✓
  Description: Panlabel's intermediate representation

coco
  Read: ✓  Write: ✓  Lossless: Conditional
  Description: COCO object detection format

yolo
  Read: ✓  Write: ✓  Lossless: ✗
  Description: Ultralytics YOLO format

...

Common Workflows

Quality Control Pipeline:

1. Validate your dataset: panlabel validate --format coco data.json
2. Check statistics: panlabel stats --format coco data.json
3. Create a test subset: panlabel sample -i data.json -o test.json -n 100
4. Convert to training format: panlabel convert -f coco -t yolo -i data.json -o ./yolo_train --allow-lossy
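
If you run this pipeline often, you could script it. The Python sketch below builds the four commands as argument lists and runs them in order with subprocess. The function names are hypothetical, the flags are copied from this guide (with --allow-lossy on the COCO-to-YOLO step, as in the Basic Conversion example), and stopping at the first failure relies on the assumption that panlabel exits with a non-zero status when a step fails.

```python
import subprocess


def qc_pipeline_commands(dataset, sample_out, yolo_dir, n=100):
    """Build the quality-control pipeline as a list of argument lists.

    Hypothetical helper; the flags are copied from this guide.
    """
    return [
        ["panlabel", "validate", "--format", "coco", dataset],
        ["panlabel", "stats", "--format", "coco", dataset],
        ["panlabel", "sample", "-i", dataset, "-o", sample_out, "-n", str(n)],
        ["panlabel", "convert", "-f", "coco", "-t", "yolo",
         "-i", dataset, "-o", yolo_dir, "--allow-lossy"],
    ]


def run_pipeline(commands):
    """Run each command in order.

    check=True raises CalledProcessError on a non-zero exit status,
    so the pipeline stops at the first step that fails (assuming
    panlabel signals failure through its exit status).
    """
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; call run_pipeline(commands)
    # once panlabel is installed and data.json exists.
    commands = qc_pipeline_commands("data.json", "test.json", "./yolo_train")
    for argv in commands:
        print(" ".join(argv))
```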

Experiment with Different Formats:

1. Convert to IR JSON for lossless storage: panlabel convert -f coco -t ir-json -i data.json -o data.ir.json
2. Generate multiple format versions from IR:
   - panlabel convert -f ir-json -t yolo -i data.ir.json -o ./yolo --allow-lossy
   - panlabel convert -f ir-json -t voc -i data.ir.json -o ./voc --allow-lossy

Version Control Your Datasets:

1. Export baseline: panlabel convert -f auto -t ir-json -i baseline.json -o v1.ir.json
2. Make changes and export: panlabel convert -f auto -t ir-json -i updated.json -o v2.ir.json
3. Compare versions: panlabel diff --format-a ir-json --format-b ir-json v1.ir.json v2.ir.json

Next Steps

- CLI Reference: Explore all available commands and options
- Format Reference: Learn about format-specific details and limitations
- Conversion & Lossiness: Understand what data gets lost in conversions
- Contributing: Learn how to contribute to Panlabel
