The stats module provides comprehensive dataset analysis: it computes statistics about images, annotations, bounding boxes, and label distributions.
Main Function
stats_dataset
Compute a full statistics report for a dataset. The report covers:
- Summary counts (images, categories, annotations)
- Label distribution and histogram
- Bounding box statistics (size, aspect ratio, area)
- Image resolution distribution
- Annotation density metrics
- Category co-occurrence patterns
Arguments:
- dataset - The dataset to analyze
- opts - Statistics options (top labels, tolerance, bar width)
Returns:
- A StatsReport containing all computed statistics
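To make the report contents more concrete, here is a minimal, language-neutral sketch in Python of how summary counts, a label distribution, annotation density, and co-occurrence pairs could be derived from a toy dataset. It does not call this module: the `summarize` function, the `(image_id, label)` annotation layout, and the dictionary keys are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy dataset: each annotation is (image_id, label).
annotations = [
    (1, "cat"), (1, "dog"), (1, "cat"),
    (2, "dog"),
    (3, "cat"), (3, "dog"), (3, "bird"),
]
image_ids = [1, 2, 3, 4]  # image 4 has no annotations


def summarize(image_ids, annotations, top_labels=10, top_pairs=10):
    """Illustrative only: not the module's actual implementation."""
    # Label distribution: how many annotations carry each label.
    label_counts = Counter(label for _, label in annotations)

    # Co-occurrence: count each unordered label pair once per image.
    labels_per_image = {}
    for image_id, label in annotations:
        labels_per_image.setdefault(image_id, set()).add(label)
    pair_counts = Counter()
    for labels in labels_per_image.values():
        pair_counts.update(combinations(sorted(labels), 2))

    # Annotation density: mean annotations per image (0.0 for no images).
    density = len(annotations) / len(image_ids) if image_ids else 0.0

    return {
        "images": len(image_ids),
        "categories": len(label_counts),
        "annotations": len(annotations),
        "annotations_per_image": density,
        "top_labels": label_counts.most_common(top_labels),
        "top_pairs": pair_counts.most_common(top_pairs),
    }


report = summarize(image_ids, annotations)
```

In this sketch, images without annotations still count toward the image total, so they lower the density figure. Whether the real report treats them the same way is not stated here.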
Types
StatsOptions
Configuration for statistics computation.
- top_labels - Number of top labels to show in the histogram. Default: 10
- top_pairs - Number of top co-occurrence pairs to show. Default: 10
- oob_tolerance_px - Tolerance in pixels for out-of-bounds checks. Default: 0.5
- bar_width - Width of histogram bars in characters (for text output). Default: 20
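The following Python sketch mirrors the documented defaults in a small dataclass and shows one plausible way `top_labels` and `bar_width` might shape a text histogram. The dataclass is a stand-in, not the module's own type, and `render_histogram` is a hypothetical helper. Scaling each bar relative to the most frequent label is an assumption, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class StatsOptions:
    """Stand-in mirroring the documented fields and defaults."""
    top_labels: int = 10
    top_pairs: int = 10
    oob_tolerance_px: float = 0.5
    bar_width: int = 20


def render_histogram(label_counts, opts):
    """Hypothetical helper: one text line per label, most frequent first."""
    # Sort by descending count, then by label name for stable ties.
    rows = sorted(label_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = rows[: opts.top_labels]
    if not rows:
        return []
    max_count = rows[0][1]
    lines = []
    for label, count in rows:
        # Assumption: the top label fills the full bar_width.
        filled = round(opts.bar_width * count / max_count) if max_count else 0
        lines.append(f"{label:<8} {'#' * filled} {count}")
    return lines


opts = StatsOptions(top_labels=2, bar_width=10)
lines = render_histogram({"cat": 4, "dog": 2, "bird": 1}, opts)
for line in lines:
    print(line)
```

With `top_labels=2`, the least frequent label is dropped from the output, and the remaining bars are 10 and 5 characters wide.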
StatsReport
Comprehensive dataset statistics.
SummarySection
High-level dataset counts.
LabelsSection
Label distribution information.
BBoxStats
Bounding box quality metrics.
Example
Custom Options Example
HTML Report Generation
Related
- stats command - CLI interface for statistics
- Dataset type - IR Dataset structure