TFOD (TensorFlow Object Detection) CSV is a simple format commonly used with TensorFlow’s Object Detection API. Panlabel supports reading and writing TFOD CSV files.

Overview

  • Path type: CSV file (.csv)
  • Lossiness: Lossy (see below)
  • Bbox format: Normalized XYXY [xmin, ymin, xmax, ymax] (0..1)
  • Use case: TensorFlow training, simple CSV-based workflows

CSV Structure

TFOD CSV uses a simple tabular format with 8 columns:
filename,width,height,class,xmin,ymin,xmax,ymax
image001.jpg,640,480,person,0.1562,0.3125,0.4687,0.8333
image001.jpg,640,480,car,0.4687,0.2083,0.9375,0.8333
image002.jpg,800,600,dog,0.2000,0.3000,0.6000,0.9000

Columns

Column    Type     Description
filename  string   Image filename
width     integer  Image width in pixels
height    integer  Image height in pixels
class     string   Category/class name
xmin      float    Left edge (normalized 0..1)
ymin      float    Top edge (normalized 0..1)
xmax      float    Right edge (normalized 0..1)
ymax      float    Bottom edge (normalized 0..1)

Bounding Box Format

TFOD uses normalized XYXY coordinates (0.0 to 1.0):
xmin,ymin,xmax,ymax
0.15625,0.3125,0.46875,0.8333
Panlabel converts to/from pixel-space XYXY.

Conversion Example

TFOD CSV row:
image001.jpg,640,480,person,0.1,0.2,0.5,0.8
Conversion to pixel XYXY:
  1. xmin = 0.1 × 640 = 64.0
  2. ymin = 0.2 × 480 = 96.0
  3. xmax = 0.5 × 640 = 320.0
  4. ymax = 0.8 × 480 = 384.0
IR bbox: [64.0, 96.0, 320.0, 384.0]
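
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the scaling step, not Panlabel's own code; the function name `tfod_to_pixel_xyxy` is hypothetical.

```python
def tfod_to_pixel_xyxy(width, height, xmin, ymin, xmax, ymax):
    """Scale normalized XYXY (0..1) to pixel-space XYXY.

    Hypothetical helper for illustration; not part of Panlabel.
    """
    # x coordinates scale by image width, y coordinates by image height.
    return [xmin * width, ymin * height, xmax * width, ymax * height]


# Values taken from the example row: image001.jpg,640,480,person,0.1,0.2,0.5,0.8
bbox = tfod_to_pixel_xyxy(640, 480, 0.1, 0.2, 0.5, 0.8)
print(bbox)
```

The reverse direction divides each pixel coordinate by the same width or height.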

Reader Behavior

Reading Process

  1. Parse CSV file with header row
  2. Validate consistent dimensions per filename
  3. Build image map (filename → width, height)
  4. Build category map (class name → CategoryId)
  5. Assign deterministic IDs:
    • Image IDs: by filename (lexicographic)
    • Category IDs: by class name (lexicographic)
    • Annotation IDs: by CSV row order
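
The ID assignment described above could look roughly like the following Python sketch. It is not Panlabel's implementation: the function name `read_tfod_csv`, the dictionary keys, and the choice to start image and category IDs at 1 are all assumptions made for illustration.

```python
import csv
import io

SAMPLE = """filename,width,height,class,xmin,ymin,xmax,ymax
img2.jpg,800,600,dog,0.2,0.3,0.6,0.9
img1.jpg,640,480,person,0.1,0.2,0.5,0.8
img1.jpg,640,480,car,0.3,0.1,0.7,0.4
"""


def read_tfod_csv(text):
    """Hypothetical reader sketch: parse rows and assign deterministic IDs."""
    rows = list(csv.DictReader(io.StringIO(text)))

    # Image and category IDs follow lexicographic order of their names,
    # so they do not depend on the order of rows in the file.
    # Starting at 1 is an assumption.
    filenames = sorted({r["filename"] for r in rows})
    classes = sorted({r["class"] for r in rows})
    image_ids = {name: i for i, name in enumerate(filenames, start=1)}
    category_ids = {name: i for i, name in enumerate(classes, start=1)}

    # Annotation IDs follow CSV row order.
    annotations = []
    for ann_id, r in enumerate(rows, start=1):
        w, h = int(r["width"]), int(r["height"])
        annotations.append({
            "id": ann_id,
            "image_id": image_ids[r["filename"]],
            "category_id": category_ids[r["class"]],
            # Normalized XYXY scaled to pixel-space XYXY.
            "bbox": [float(r["xmin"]) * w, float(r["ymin"]) * h,
                     float(r["xmax"]) * w, float(r["ymax"]) * h],
        })
    return image_ids, category_ids, annotations


image_ids, category_ids, annotations = read_tfod_csv(SAMPLE)
print(image_ids)
print(category_ids)
```

Because image and category IDs come from sorted names, shuffling the rows would change only the annotation IDs.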

Dimension Validation

Each filename must have consistent dimensions across all rows:
image.jpg,640,480,cat,0.1,0.1,0.5,0.5   ✓
image.jpg,640,480,dog,0.2,0.2,0.6,0.6   ✓
image.jpg,800,600,bird,0.3,0.3,0.7,0.7  ✗ ERROR
Error:
Inconsistent dimensions for 'image.jpg': (640, 480) vs (800, 600)
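
A check of this kind can be sketched in Python as follows. The function name `check_dimensions` and the use of `ValueError` are assumptions for illustration; only the message text is taken from the error shown above.

```python
import csv
import io

HEADER = "filename,width,height,class,xmin,ymin,xmax,ymax\n"
GOOD = HEADER + (
    "image.jpg,640,480,cat,0.1,0.1,0.5,0.5\n"
    "image.jpg,640,480,dog,0.2,0.2,0.6,0.6\n"
)
BAD = GOOD + "image.jpg,800,600,bird,0.3,0.3,0.7,0.7\n"


def check_dimensions(text):
    """Hypothetical validator: map each filename to its (width, height)."""
    seen = {}
    for row in csv.DictReader(io.StringIO(text)):
        dims = (int(row["width"]), int(row["height"]))
        # Remember the first dimensions seen for a filename and
        # compare every later row for that filename against them.
        first = seen.setdefault(row["filename"], dims)
        if first != dims:
            raise ValueError(
                f"Inconsistent dimensions for '{row['filename']}': "
                f"{first} vs {dims}"
            )
    return seen


print(check_dimensions(GOOD))
try:
    check_dimensions(BAD)
except ValueError as err:
    print(err)
```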

Writer Behavior

Writing Process

  1. Validate all annotation references (images, categories)
  2. Build lookup maps
  3. Convert pixel bboxes to normalized coordinates
  4. Sort rows by annotation ID (deterministic output)
  5. Write CSV with header
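
The steps above might be sketched in Python like this. It is a minimal illustration, not Panlabel's writer: the function name `write_tfod_csv` is hypothetical, and the IR field names `width`, `height`, `name`, `image_id`, `category_id`, and `bbox` are assumed (only `id` and `file_name` appear in this document).

```python
import csv
import io


def write_tfod_csv(images, categories, annotations):
    """Hypothetical writer sketch following the five steps above."""
    # Step 2: lookup maps (built first so step 1 can use them).
    images_by_id = {img["id"]: img for img in images}
    names_by_id = {cat["id"]: cat["name"] for cat in categories}

    # Step 1: every annotation must point at a known image and category.
    for ann in annotations:
        if ann["image_id"] not in images_by_id:
            raise ValueError(f"Unknown image_id {ann['image_id']}")
        if ann["category_id"] not in names_by_id:
            raise ValueError(f"Unknown category_id {ann['category_id']}")

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Step 5: header row.
    writer.writerow(["filename", "width", "height", "class",
                     "xmin", "ymin", "xmax", "ymax"])
    # Step 4: sort by annotation ID for deterministic output.
    for ann in sorted(annotations, key=lambda a: a["id"]):
        img = images_by_id[ann["image_id"]]
        w, h = img["width"], img["height"]
        x1, y1, x2, y2 = ann["bbox"]
        # Step 3: pixel XYXY divided by image size gives normalized XYXY.
        writer.writerow([img["file_name"], w, h,
                         names_by_id[ann["category_id"]],
                         x1 / w, y1 / h, x2 / w, y2 / h])
    return out.getvalue()


images = [{"id": 1, "file_name": "img1.jpg", "width": 640, "height": 480}]
categories = [{"id": 1, "name": "car"}, {"id": 2, "name": "person"}]
annotations = [
    {"id": 2, "image_id": 1, "category_id": 1, "bbox": [192.0, 48.0, 448.0, 192.0]},
    {"id": 1, "image_id": 1, "category_id": 2, "bbox": [64.0, 96.0, 320.0, 384.0]},
]
csv_text = write_tfod_csv(images, categories, annotations)
print(csv_text)
```

The annotations are passed in reverse ID order on purpose, to show that the output order depends on the ID and not on input order.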

Coordinate Precision

Normalized floats are written with full precision:
image001.jpg,640,480,person,0.15625,0.3125,0.46875,0.833333

Lossiness

TFOD CSV is lossy: only the fields stored in its eight columns are preserved.

Preserved ✓

  • Image filenames and dimensions
  • Category names
  • Bounding box coordinates (normalized ↔ pixel conversion)

Not Preserved ✗

  • Dataset-level metadata/licenses
  • Image-level license/date metadata
  • Annotation confidence/attributes
  • Category supercategory
  • Custom attributes
  • Images without annotations
Images without annotations are not represented in TFOD CSV output. This is a fundamental limitation of the format.

Limitations

No Images Without Annotations

TFOD CSV cannot represent images without bounding boxes.
Input IR:
{
  "images": [{"id": 1, "file_name": "empty.jpg", ...}],
  "annotations": []
}
Output CSV:
filename,width,height,class,xmin,ymin,xmax,ymax
(Empty except header - image is lost)
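
Before converting, it can help to list the images that would be dropped. The Python sketch below is one way to do that; the function name `images_without_annotations` is hypothetical, and the `image_id` field on annotations is an assumed IR field name.

```python
def images_without_annotations(images, annotations):
    """Return file names of images that no annotation refers to.

    Hypothetical pre-flight check; not a Panlabel API.
    """
    annotated = {ann["image_id"] for ann in annotations}
    return [img["file_name"] for img in images if img["id"] not in annotated]


images = [
    {"id": 1, "file_name": "empty.jpg"},
    {"id": 2, "file_name": "img1.jpg"},
]
annotations = [{"id": 1, "image_id": 2}]

dropped = images_without_annotations(images, annotations)
print(dropped)
```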

No Metadata

No support for:
  • Dataset info
  • Licenses
  • Image dates
  • Annotation confidence
  • Custom attributes

Row-Based Only

Each row represents one annotation. Cannot store image-level or dataset-level data.

Usage

Read TFOD CSV

panlabel convert annotations.csv output.json --input-format tfod --output-format ir-json
Aliases: tfod-csv

Write TFOD CSV

panlabel convert input.json output.csv --input-format ir-json --output-format tfod

Deterministic Output

Rows are sorted by annotation ID for stable diffs:
filename,width,height,class,xmin,ymin,xmax,ymax
img1.jpg,640,480,person,0.1,0.2,0.5,0.8    # annotation ID 1
img1.jpg,640,480,car,0.3,0.1,0.7,0.4       # annotation ID 2
img2.jpg,800,600,dog,0.2,0.3,0.6,0.9       # annotation ID 3

Example Dataset

Complete TFOD CSV example:
filename,width,height,class,xmin,ymin,xmax,ymax
img1.jpg,640,480,person,0.15625,0.3125,0.46875,0.8333
img1.jpg,640,480,person,0.53125,0.2083,0.78125,0.7083
img1.jpg,640,480,car,0.46875,0.2083,0.9375,0.8333
img2.jpg,800,600,dog,0.2,0.3,0.6,0.9
img2.jpg,800,600,cat,0.1,0.1,0.3,0.4
img3.jpg,1024,768,bicycle,0.4,0.5,0.7,0.8
Dataset summary:
  • 3 images (img1.jpg, img2.jpg, img3.jpg)
  • 5 categories (person, car, dog, cat, bicycle)
  • 6 annotations total
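
The summary can be recomputed from the CSV itself. The Python sketch below uses only the standard `csv` module and the example rows above; the variable names are illustrative.

```python
import csv
import io

EXAMPLE = """filename,width,height,class,xmin,ymin,xmax,ymax
img1.jpg,640,480,person,0.15625,0.3125,0.46875,0.8333
img1.jpg,640,480,person,0.53125,0.2083,0.78125,0.7083
img1.jpg,640,480,car,0.46875,0.2083,0.9375,0.8333
img2.jpg,800,600,dog,0.2,0.3,0.6,0.9
img2.jpg,800,600,cat,0.1,0.1,0.3,0.4
img3.jpg,1024,768,bicycle,0.4,0.5,0.7,0.8
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE)))

# Distinct filenames are images, distinct class names are categories,
# and every data row is one annotation.
image_names = sorted({r["filename"] for r in rows})
category_names = sorted({r["class"] for r in rows})

print("images:", len(image_names), image_names)
print("categories:", len(category_names), category_names)
print("annotations:", len(rows))
```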

See Also

COCO Format

More feature-rich JSON format

Format Overview

Compare all supported formats
