Pascal VOC Format

Pascal VOC (Visual Object Classes) is a classic XML-based format for object detection. Panlabel supports reading and writing VOC datasets.

Overview

Path type: Directory with Annotations/ and optional JPEGImages/
Lossiness: Lossy (see below)
Bbox format: Pixel-space XYXY [xmin, ymin, xmax, ymax]
Use case: Legacy datasets, academic benchmarks

Directory Structure

dataset/
├── Annotations/
│   ├── img1.xml
│   ├── img2.xml
│   └── train/
│       └── img3.xml
└── JPEGImages/
    ├── img1.jpg
    ├── img2.jpg
    └── train/
        └── img3.jpg

Key Components

Annotations/: XML files, one per image (required)
JPEGImages/: Image files (optional, not read by Panlabel)

XML Structure

<?xml version="1.0" encoding="utf-8"?>
<annotation>
  <folder>JPEGImages</folder>
  <filename>img1.jpg</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <occluded>0</occluded>
    <bndbox>
      <xmin>100</xmin>
      <ymin>150</ymin>
      <xmax>300</xmax>
      <ymax>400</ymax>
    </bndbox>
  </object>
  <object>
    <name>car</name>
    <bndbox>
      <xmin>350</xmin>
      <ymin>200</ymin>
      <xmax>600</xmax>
      <ymax>450</ymax>
    </bndbox>
  </object>
</annotation>
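
To make the structure concrete, here is a minimal sketch that reads one such document with Python's standard library. It is not Panlabel's parser; the helper name `parse_voc` and the shape of the returned dict are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the sample annotation (XML declaration omitted).
VOC_XML = """<annotation>
  <filename>img1.jpg</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <object>
    <name>person</name>
    <truncated>0</truncated>
    <bndbox>
      <xmin>100</xmin>
      <ymin>150</ymin>
      <xmax>300</xmax>
      <ymax>400</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Parse one VOC annotation document into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.findtext("name"),
            # Coordinates are kept as written, in XYXY order.
            "bbox": [float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")],
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }
```

Calling `parse_voc(VOC_XML)` returns the filename, the image dimensions, and one entry per `<object>` element.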

Bounding Box Format

VOC uses pixel-space XYXY coordinates (same as IR):

<bndbox>
  <xmin>100</xmin>
  <ymin>150</ymin>
  <xmax>300</xmax>
  <ymax>400</ymax>
</bndbox>

xmin: Left edge in pixels
ymin: Top edge in pixels
xmax: Right edge in pixels
ymax: Bottom edge in pixels

No coordinate conversion needed (already XYXY).
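
Tools that expect XYWH boxes (as COCO does) need the width and height derived from the corners. The following sketch shows that arithmetic; the function name `xyxy_to_xywh` is hypothetical and not part of Panlabel.

```python
def xyxy_to_xywh(xmin, ymin, xmax, ymax):
    """Convert a pixel-space XYXY box to [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# The sample box above: 300 - 100 = 200 wide, 400 - 150 = 250 tall.
print(xyxy_to_xywh(100, 150, 300, 400))  # [100, 150, 200, 250]
```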

Object Attributes

VOC supports several object-level attributes:

pose: Object pose (e.g., “Frontal”, “Left”, “Unspecified”)
truncated: 1 if object is cut off at image boundary, 0 otherwise
difficult: 1 if object is hard to recognize, 0 otherwise
occluded: 1 if object is occluded, 0 otherwise (non-standard but supported)

Panlabel stores these as annotation attributes.

Attribute Mapping

Reading:

<truncated>1</truncated>  →  IR attribute: {"truncated": "1"}
<difficult>0</difficult>  →  IR attribute: {"difficult": "0"}
<occluded>yes</occluded>  →  IR attribute: {"occluded": "yes"}
<pose>Left</pose>         →  IR attribute: {"pose": "Left"}

Writing:

Retrieves attributes from IR annotation
Normalizes boolean values:
- true/yes/1 → 1
- false/no/0 → 0
- Other values → omitted
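
The normalization rule can be sketched as a small function. This is an illustration rather than Panlabel's code: the name `normalize_voc_bool` is hypothetical, and the case-insensitive, whitespace-trimming comparison is an assumption that the rules above do not state.

```python
def normalize_voc_bool(value):
    """Return "1", "0", or None, where None means the element is omitted."""
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    text = str(value).strip().lower()
    if text in ("true", "yes", "1"):
        return "1"
    if text in ("false", "no", "0"):
        return "0"
    return None
```

Under this sketch an IR value of "yes" for occluded would be written as `<occluded>1</occluded>`, while a value such as "maybe" would produce no element.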

Reader Behavior

Input Path

Accepts:

Dataset root containing Annotations/
Annotations/ directory directly

Reading Process

Discover layout (find Annotations/ directory)
Scan Annotations/ flat only (non-recursive)
Parse each XML file:
- Extract <filename>, <width>, <height>
- Extract <depth> (stored as image attribute)
- Parse all <object> elements
Assign deterministic IDs:
- Image IDs: by <filename> (lexicographic)
- Category IDs: by class name (lexicographic)
- Annotation IDs: by XML file order, then <object> order
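
The ID assignment can be sketched as follows. The ordering rules come from the list above; starting IDs at 1, reading "XML file order" as the sorted XML file names, and the function name `assign_ids` are all assumptions made for this example.

```python
def assign_ids(parsed_files):
    """parsed_files: list of (xml_name, filename, class names in <object> order)."""
    # Image IDs follow the lexicographic order of <filename>.
    filenames = sorted({filename for _, filename, _ in parsed_files})
    image_ids = {name: i for i, name in enumerate(filenames, start=1)}

    # Category IDs follow the lexicographic order of class names.
    classes = sorted({cls for _, _, objs in parsed_files for cls in objs})
    category_ids = {cls: i for i, cls in enumerate(classes, start=1)}

    # Annotation IDs follow XML file order, then <object> order within a file.
    annotations = []
    for _, filename, objs in sorted(parsed_files, key=lambda item: item[0]):
        for cls in objs:
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_ids[filename],
                "category_id": category_ids[cls],
            })
    return image_ids, category_ids, annotations
```

Because every ordering depends only on names and document order, the same input directory yields the same IDs on every run.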

Coordinate Policy

The reader uses xmin/ymin/xmax/ymax exactly as provided; it does not shift coordinates to compensate for 0-based versus 1-based pixel indexing.

Nested XML Warning

Nested XML files (e.g., Annotations/train/img.xml) are skipped with a warning:

Warning: VOC reader scans Annotations/ flat (non-recursive); skipping 2 nested .xml file(s), e.g. train/img3.xml
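
A flat scan that also counts the nested files it skips might look like the sketch below. It only illustrates the behavior described here and is not Panlabel's implementation; `scan_annotations` is a hypothetical name.

```python
from pathlib import Path


def scan_annotations(annotations_dir):
    """Return (top-level XML paths, relative paths of nested XML files)."""
    root = Path(annotations_dir)
    # Only files directly inside Annotations/ are read.
    top_level = sorted(p for p in root.glob("*.xml") if p.is_file())
    # Anything deeper is reported so the caller can warn about it.
    nested = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.xml")
        if p.is_file() and p.parent != root
    )
    return top_level, nested
```

The second return value provides both the count and the example path needed for a warning like the one above.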

Writer Behavior

Output Structure

output/
├── Annotations/
│   ├── img1.xml
│   └── train/
│       └── img3.xml
└── JPEGImages/
    └── README.txt

Writing Process

Create Annotations/ and JPEGImages/ directories
Write JPEGImages/README.txt placeholder
For each image:
- Create XML file at Annotations/<stem>.xml
- Preserve subdirectory structure from file_name
- Write all annotations sorted by annotation ID
Does not copy image binaries
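
As a rough sketch of the per-image step, the following builds one annotation document with the standard library and emits objects in annotation-ID order. The input dict shapes and the name `build_voc_xml` are assumptions for illustration; Panlabel's actual writer may differ, for example in how it fills `<filename>` for nested paths.

```python
import xml.etree.ElementTree as ET


def build_voc_xml(image, annotations):
    """Serialize one image and its annotations to a VOC XML string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = "JPEGImages"
    ET.SubElement(root, "filename").text = image["file_name"]
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(image["width"])
    ET.SubElement(size, "height").text = str(image["height"])
    # Objects are written sorted by annotation ID.
    for ann in sorted(annotations, key=lambda a: a["id"]):
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = ann["name"]
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), ann["bbox"]):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Passing an empty annotation list yields a document with `<size>` but no `<object>` elements.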

Depth Attribute

Writes the <depth> element from the image attribute "depth", if present:

<size>
  <width>640</width>
  <height>480</height>
  <depth>3</depth>
</size>

Boolean Normalization

Writes normalized boolean attributes:

true/yes/1  →  1
false/no/0  →  0
other       →  omitted

Empty Images

Writes XML files for images without annotations:

<?xml version="1.0" encoding="utf-8"?>
<annotation>
  <folder>JPEGImages</folder>
  <filename>img2.jpg</filename>
  <size>
    <width>800</width>
    <height>600</height>
  </size>
</annotation>

Lossiness

VOC format is lossy. Not preserved:

Dataset-level metadata/licenses
Image-level license/date metadata
Annotation confidence
Category supercategory
Custom attributes (except pose, truncated, difficult, occluded)

Preserved:

Image filenames and dimensions
Image depth (as attribute)
Category names
Bounding box coordinates (XYXY)
Standard object attributes (pose, truncated, difficult, occluded)

Panlabel does not copy image binaries during conversion. After writing, copy the images into the JPEGImages/ directory yourself.

Usage

Read VOC

panlabel convert dataset/ output.json --input-format voc --output-format ir-json

or from Annotations/ directly:

panlabel convert dataset/Annotations/ output.json --input-format voc --output-format ir-json

Write VOC

panlabel convert input.json voc-output/ --input-format ir-json --output-format voc

Then manually copy images:

cp -r original/JPEGImages/* voc-output/JPEGImages/

Subdirectory Structure

The VOC writer preserves the subdirectory structure of file_name in its output.

Input IR:

{"file_name": "train/img1.jpg", ...}

Output:

Annotations/train/img1.xml
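
The mapping from an IR file_name to an annotation path can be sketched with `pathlib`. The helper name `annotation_path` is hypothetical, and the sketch assumes that file_name is a relative path using forward slashes.

```python
from pathlib import PurePosixPath


def annotation_path(file_name):
    """Map an image file_name to its XML path under Annotations/."""
    # Keep any subdirectories and swap the image extension for .xml.
    return (PurePosixPath("Annotations") / file_name).with_suffix(".xml").as_posix()


print(annotation_path("train/img1.jpg"))  # Annotations/train/img1.xml
```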

See Also

YOLO Format: Another directory-based format
Format Overview: Compare all supported formats
