Overview
The Trash Classification AI System is built on a modular architecture with three core components that work together to detect, classify, and visualize waste materials in real-time video streams.

Architecture Components
The system follows a pipeline architecture where each module has a specific responsibility:

1. Segmentation Module
The segmentation module is responsible for detecting and tracking trash objects in video frames. Location: trash_classificator/segmentation/main.py
- YOLO-based segmentation model
- Hardware-accelerated device management (CUDA, MPS, or CPU)
- Object tracking with persistence
- Confidence-based filtering (threshold: 0.55)
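The confidence filter described above can be sketched as a simple threshold pass over detections. This is an illustrative sketch, not the actual implementation; the detection structure and function name are assumptions, while the 0.55 threshold comes from the description above.

```python
# Hypothetical sketch of confidence-based filtering. The dict shape and
# function name are assumptions; the 0.55 threshold matches the docs.
CONFIDENCE_THRESHOLD = 0.55

def filter_detections(detections, threshold=CONFIDENCE_THRESHOLD):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d["conf"] >= threshold]

detections = [
    {"cls": "plastic", "conf": 0.91},
    {"cls": "metal", "conf": 0.40},   # below threshold, dropped
    {"cls": "paper", "conf": 0.55},   # exactly at threshold, kept
]
kept = filter_detections(detections)
```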
2. Drawing Module
The drawing module visualizes detected trash objects with masks, bounding boxes, and tracking trails. Location: trash_classificator/drawing/main.py
- Three-layer visualization: masks, bounding boxes, and tracking trails
- Class-specific color coding
- 50-point tracking history for movement visualization
- Semi-transparent overlay (50% alpha blending)
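The 50% alpha blending for the mask layer can be sketched with plain NumPy. This is a minimal illustration of the technique, assuming uint8 color frames and a boolean mask; the function name and signature are hypothetical.

```python
import numpy as np

# Sketch of 50% alpha blending for the mask layer. Assumes uint8 frames;
# names are illustrative, only the 0.5 alpha comes from the docs.
ALPHA = 0.5

def overlay_mask(frame, mask, color):
    """Blend `color` into `frame` where `mask` is True, at ALPHA opacity."""
    out = frame.copy()
    colored = np.empty_like(frame)
    colored[:] = color
    blend = (frame * (1 - ALPHA) + colored * ALPHA).astype(frame.dtype)
    out[mask] = blend[mask]
    return out

frame = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True                       # top-left quadrant is "trash"
result = overlay_mask(frame, mask, color=(0, 255, 0))
```

Pixels outside the mask are left untouched, which is what makes the overlay read as a translucent highlight rather than a solid repaint.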
3. Processing Module
The main processor orchestrates the entire pipeline, coordinating between segmentation and drawing. Location: trash_classificator/processor.py
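The orchestration can be pictured as a short per-frame method: copy, segment, draw. The class name `TrashClassificator` appears elsewhere in these docs, but the method names and stubs below are assumptions made for illustration.

```python
# Illustrative pipeline sketch; method names and stub components are
# assumptions, not the actual processor code.
class TrashClassificator:
    def __init__(self, segmentation_model, drawer):
        self.segmentation_model = segmentation_model
        self.drawer = drawer

    def process_frame(self, frame):
        canvas = frame.copy()  # preserve the original frame data
        results = self.segmentation_model.segment(frame)
        return self.drawer.draw(canvas, results)

class _StubModel:
    def segment(self, frame):
        return ["det"]

class _StubDrawer:
    def draw(self, canvas, results):
        return (canvas, results)

processor = TrashClassificator(_StubModel(), _StubDrawer())
output = processor.process_frame([1, 2])
```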
Data Flow
The system processes each video frame through the following pipeline:

Segmentation
YOLO model detects and segments trash objects with tracking IDs
- Returns: Results object with masks, boxes, classes, and tracking IDs
- Confidence threshold: 0.55
- Image size: 640x640
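The parameters above map onto a single tracking call. The sketch below uses the Ultralytics `model.track` API with the values from this section; the function name and the model file in the comment are assumptions for illustration.

```python
# Sketch of the inference call using the parameters listed above.
# Requires the `ultralytics` package at runtime; names are illustrative.
def run_segmentation(model, source):
    """Yield per-frame Results with masks, boxes, classes, and track IDs."""
    return model.track(
        source,
        conf=0.55,     # confidence threshold
        imgsz=640,     # 640x640 inference resolution
        persist=True,  # keep tracking IDs stable across frames
        stream=True,   # generator of Results for memory efficiency
    )

# With a real model this would look like:
#   run_segmentation(YOLO("yolov8n-seg.pt"), "video.mp4")
```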
Visualization
Drawing module creates multi-layer visualization:
- Mask layer: Semi-transparent colored regions
- Bounding box layer: Labeled boxes with class names
- Tracking layer: Movement trails showing object paths
Module Interactions
Device Management
The system automatically selects the best available hardware: it prioritizes GPU acceleration (MPS for Apple Silicon, CUDA for NVIDIA) and falls back to CPU if no GPU is available.
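The fallback order can be sketched as a small selection function. To keep the logic visible on its own, availability is passed in as flags; the real code would typically query `torch.cuda.is_available()` and `torch.backends.mps.is_available()`. The function name is an assumption.

```python
# Sketch of the device fallback order described above. The real code
# queries PyTorch; here availability is passed in explicitly.
def select_device(cuda_available, mps_available):
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```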
Model Loading
The YOLO model is loaded once during initialization and reused across frames.

Performance Considerations
Image Copying
The system creates copies of frames before processing to preserve original data
Streaming Results
YOLO inference uses stream=True for memory-efficient batch processing.

GPU Acceleration
Automatic device selection ensures optimal performance on available hardware
Persistent Tracking
Object IDs persist across frames for consistent tracking visualization
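Combined with the 50-point history mentioned in the drawing module, persistent IDs allow a bounded trail buffer per object. A `collections.deque` with `maxlen` is a natural fit; the names below are illustrative, only the 50-point cap comes from the docs.

```python
from collections import deque

# Sketch of a per-object trail buffer. The 50-point cap comes from the
# drawing module description; other names are illustrative.
HISTORY_LENGTH = 50
track_history = {}  # track_id -> deque of (x, y) centers

def update_track(track_id, center):
    trail = track_history.setdefault(track_id, deque(maxlen=HISTORY_LENGTH))
    trail.append(center)  # oldest point is dropped automatically at the cap
    return trail

for x in range(60):  # 60 updates: only the most recent 50 are kept
    update_track(7, (x, x))
```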
Design Patterns
The codebase follows several software engineering best practices:

Interface Segregation
Each module defines an abstract interface (ABC) that concrete implementations must follow:
- SegmentationModelInterface
- DrawingInterface
- MaskDrawerInterface
- BoundingBoxDrawerInterface
- TrackDrawerInterface
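The pattern looks like the sketch below. `SegmentationModelInterface` is named in these docs, but the method signature and the concrete class are assumptions for illustration.

```python
from abc import ABC, abstractmethod

# Sketch of the interface pattern. The interface name appears in the docs;
# the method signature and concrete class are assumptions.
class SegmentationModelInterface(ABC):
    @abstractmethod
    def segment(self, frame):
        """Return segmentation results for one frame."""

class YoloSegmentationModel(SegmentationModelInterface):
    def segment(self, frame):
        return f"results for {frame}"
```

Instantiating the ABC directly raises `TypeError`, so every concrete model is forced to provide `segment`.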
Dependency Injection
The TrashClassificator receives initialized components, making it easy to swap implementations or mock components for testing.
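Injection is what makes testing cheap: a test can hand the classificator mocks instead of a real YOLO model. The constructor shape and method names below are assumptions based on the description.

```python
from unittest.mock import Mock

# Minimal sketch of constructor injection; the constructor shape and
# method names are assumptions, not the actual code.
class TrashClassificator:
    def __init__(self, segmentation_model, drawer):
        self._model = segmentation_model
        self._drawer = drawer

    def process(self, frame):
        return self._drawer.draw(frame, self._model.segment(frame))

# In a test, mocks stand in for the real model and drawer:
model = Mock()
model.segment.return_value = "results"
drawer = Mock()
drawer.draw.return_value = "annotated"

annotated = TrashClassificator(model, drawer).process("frame")
```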
Single Responsibility
Each class has one clear purpose:
- DeviceManager: Hardware selection
- ModelLoader: Model initialization
- SegmentationModel: Inference
- MaskDrawer, BoundingBoxDrawer, TrackDrawer: Specific visualization tasks
This modular architecture makes it easy to extend the system with new visualization methods, different segmentation models, or additional processing steps.