High-Level Architecture
CVAT follows a microservices architecture built from the components described below.
Core Components
Frontend (cvat-ui)
Technology Stack:
- React 18
- TypeScript
- Redux for state management
- Ant Design for UI components
- Webpack for bundling
cvat-ui
Main UI application providing:
- Task and project management interfaces
- Annotation workspace
- User management
- Analytics dashboards
- Settings and configuration
cvat-ui/src/
cvat-core
Core business logic and API client:
- REST API communication
- Data models (Task, Job, Project, Annotation)
- State management
- Authentication handling
cvat-core/src/
cvat-canvas
2D annotation canvas:
- Drawing tools (rectangle, polygon, polyline, points, ellipse, mask)
- Interaction handling (drag, resize, rotate)
- Zoom and pan
- SVG-based rendering
cvat-canvas/src/
cvat-canvas3d
3D annotation canvas:
- Point cloud rendering
- 3D cuboid annotations
- Camera views (perspective, top, side, front)
- Three.js-based rendering
cvat-canvas3d/src/
cvat-data
Data handling utilities:
- Frame providers
- Data chunking
- Video/image processing
- Compression utilities
cvat-data/src/
Backend (cvat)
Technology Stack:
- Python 3.10+
- Django 4.2
- Django REST Framework
- Uvicorn (ASGI server)
- RQ (Redis Queue) for background tasks
Django apps under cvat/apps/:
engine
Core business logic:
- Models: Task, Job, Project, Label, Annotation
- API endpoints for CRUD operations
- Annotation management
- Frame caching and serving
cvat/apps/engine/
dataset_manager
Import/export functionality:
- Format converters (COCO, YOLO, Pascal VOC, etc.)
- Annotation transformations
- Dataset validation
cvat/apps/dataset_manager/
organizations
Multi-tenancy support:
- Organization management
- Membership handling
- Resource isolation
cvat/apps/organizations/
iam (Identity and Access Management)
Authentication and authorization:
- User management
- Permission system
- Role-based access control
- Integration with OPA
cvat/apps/iam/
quality_control
Annotation quality features:
- Quality reports
- Conflict detection
- Inter-annotator agreement
- Honeypot frames
cvat/apps/quality_control/
consensus
Consensus annotations:
- Replica job management
- Annotation merging
- Agreement calculation
cvat/apps/consensus/
lambda_manager
Serverless function integration:
- Auto-annotation with AI models
- Function management
- Request handling
cvat/apps/lambda_manager/
webhooks
Webhook system:
- Event notifications
- Webhook configuration
- Delivery management
cvat/apps/webhooks/
events
Event tracking and analytics:
- User action logging
- Event aggregation
- Analytics data export
cvat/apps/events/
Background Workers (RQ)
CVAT uses Redis Queue for asynchronous task processing:
cvat_worker_import
Handles annotation imports:
- Parses uploaded files
- Validates annotations
- Inserts data into database
cvat_worker_export
Handles dataset exports:
- Converts annotations to target format
- Packages data
- Generates download archives
cvat_worker_annotation
Handles auto-annotation:
- Calls serverless functions
- Processes model predictions
- Creates annotations from results
cvat_worker_quality_reports
Processes quality reports:
- Compares annotations
- Calculates metrics
- Generates reports
cvat_worker_consensus
Handles consensus merging:
- Aggregates replica annotations
- Calculates agreement scores
- Merges annotations
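The split of background work across dedicated workers can be sketched as a simple routing table. This is a minimal illustration, not CVAT's actual dispatch code: the `JobKind` enum and `pick_worker` function are hypothetical names, and only the worker service names come from the list above.

```python
from enum import Enum


class JobKind(Enum):
    """Hypothetical labels for the kinds of background work listed above."""
    IMPORT = "import"
    EXPORT = "export"
    AUTO_ANNOTATION = "auto_annotation"
    QUALITY_REPORT = "quality_report"
    CONSENSUS = "consensus"


# Worker service names are taken from this section; the mapping is illustrative.
WORKER_FOR_JOB = {
    JobKind.IMPORT: "cvat_worker_import",
    JobKind.EXPORT: "cvat_worker_export",
    JobKind.AUTO_ANNOTATION: "cvat_worker_annotation",
    JobKind.QUALITY_REPORT: "cvat_worker_quality_reports",
    JobKind.CONSENSUS: "cvat_worker_consensus",
}


def pick_worker(kind: JobKind) -> str:
    """Return the worker service responsible for a given kind of job."""
    return WORKER_FOR_JOB[kind]
```

Keeping one worker per job kind means a backlog of slow exports does not delay imports or auto-annotation requests.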
Data Storage
PostgreSQL
Purpose: Primary relational database
Stores:
- User accounts and permissions
- Tasks, jobs, and projects
- Labels and annotations
- Organizations and memberships
- Audit logs
docker-compose.yml
Redis (In-Memory)
Purpose: Caching and RQ job queue
Stores:
- Session data
- Cached API responses
- RQ job queue
- Temporary data
cvat_redis_inmem service
Kvrocks (On-Disk)
Purpose: Persistent cache for media data
Stores:
- Compressed image chunks
- Video frames
- Cached media files
cvat_redis_ondisk service
ClickHouse
Purpose: Analytics and event storage
Stores:
- User events
- Action logs
- Performance metrics
- Analytics aggregations
cvat_clickhouse service
Supporting Services
Traefik
Purpose: Reverse proxy and load balancer
Features:
- HTTPS termination
- Request routing
- Access logging
- Rate limiting
OPA (Open Policy Agent)
Purpose: Policy-based authorization
Features:
- Fine-grained access control
- Policy evaluation
- Rule-based permissions
cvat/apps/iam/rules/
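To make policy evaluation concrete, here is a minimal sketch of how a service might talk to OPA. OPA's Data API accepts a JSON body of the form `{"input": ...}` and answers with `{"result": ...}`, omitting `result` when the rule is undefined. The input fields (`user`, `role`, `scope`, `resource_owner`) and the assumption that the queried rule is a boolean `allow` rule are hypothetical and do not describe CVAT's actual policy inputs.

```python
import json


def build_opa_query(user: str, role: str, scope: str, resource_owner: str) -> str:
    """Build the JSON body for an OPA Data API query.

    The field names inside "input" are illustrative assumptions.
    """
    return json.dumps({
        "input": {
            "user": user,
            "role": role,
            "scope": scope,
            "resource_owner": resource_owner,
        }
    })


def is_allowed(response_body: str) -> bool:
    """Interpret an OPA response, assuming the queried rule is a boolean.

    A missing "result" key means the rule was undefined, which is
    treated as a denial here.
    """
    return json.loads(response_body).get("result") is True
```

Treating an undefined result as a denial keeps the check fail-closed: a missing or misnamed policy never grants access by accident.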
Vector
Purpose: Log aggregation and forwarding
Features:
- Collects container logs
- Transforms log data
- Forwards to analytics systems
Data Flow
Annotation Creation Flow
- User draws annotation in UI (cvat-canvas)
- Canvas dispatches Redux action
- cvat-core serializes annotation
- API request sent to backend
- Django view validates data
- OPA checks permissions
- Data saved to PostgreSQL
- Event logged to ClickHouse
- Response returned to frontend
- Redux state updated
- UI re-renders
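The server-side half of this flow (validate, check permissions, save, log) can be sketched with in-memory stand-ins. Everything here is hypothetical: the `Backend` class, its fields, and the status codes are illustrative. In the real system these steps are handled by Django views, OPA, PostgreSQL, and ClickHouse respectively.

```python
from dataclasses import dataclass, field

# Shape types taken from the cvat-canvas drawing tools listed earlier.
KNOWN_SHAPES = {"rectangle", "polygon", "polyline", "points", "ellipse", "mask"}


@dataclass
class Backend:
    """In-memory stand-in for the backend steps of the flow (hypothetical)."""
    allowed_users: set
    database: list = field(default_factory=list)   # stands in for PostgreSQL
    event_log: list = field(default_factory=list)  # stands in for ClickHouse

    def create_annotation(self, user: str, payload: dict) -> dict:
        # Step: validate the incoming data.
        if payload.get("type") not in KNOWN_SHAPES:
            return {"status": 400, "error": "unknown shape type"}
        if not payload.get("points"):
            return {"status": 400, "error": "points are required"}
        # Step: permission check (delegated to OPA in the real system).
        if user not in self.allowed_users:
            return {"status": 403, "error": "permission denied"}
        # Step: persist the annotation.
        record = {"id": len(self.database) + 1, **payload}
        self.database.append(record)
        # Step: log the event for analytics.
        self.event_log.append(
            {"user": user, "event": "create", "annotation_id": record["id"]}
        )
        # Step: return the response to the frontend.
        return {"status": 201, "annotation": record}
```

Note that the event is logged only after the save succeeds, so rejected requests leave neither a record nor an analytics entry in this sketch.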
Task Creation Flow
- User submits task form
- Files uploaded via TUS protocol
- Backend creates Task record
- Job creation queued in RQ
- Worker processes media:
  - Extracts frames from video
  - Generates thumbnails
  - Creates chunks
  - Compresses and caches
- Jobs created and assigned
- Task marked as ready
- User notified
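The "creates chunks" step can be illustrated with a small helper that groups consecutive frames into fixed-size chunks. This is a sketch under stated assumptions: `split_into_chunks` and `chunk_index` are hypothetical names, and the chunk size used in the test values is arbitrary, not a CVAT default.

```python
def split_into_chunks(frame_count: int, chunk_size: int) -> list[range]:
    """Group frame indices 0..frame_count-1 into consecutive chunks.

    The last chunk may be shorter than chunk_size.
    """
    if frame_count < 0:
        raise ValueError("frame_count must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        range(start, min(start + chunk_size, frame_count))
        for start in range(0, frame_count, chunk_size)
    ]


def chunk_index(frame: int, chunk_size: int) -> int:
    """Return the index of the chunk that contains the given frame."""
    return frame // chunk_size
```

Serving media in chunks lets the UI fetch and cache a block of nearby frames at once instead of requesting each frame separately.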
Export Flow
- User requests dataset export
- Backend queues export job
- Worker fetches annotations
- Format converter transforms data
- Files packaged into archive
- Archive stored temporarily
- Download link provided
- User downloads file
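The convert-and-package steps of the export flow can be sketched with the standard library alone. The converter below writes a made-up JSON layout rather than a real format such as COCO or YOLO. The function names, the `annotations.json` file name, and the annotation fields (`frame`, `label`, `points`) are all assumptions for illustration.

```python
import io
import json
import zipfile


def to_simple_json(annotations: list[dict]) -> str:
    """Hypothetical converter: serialize annotations grouped by frame."""
    by_frame: dict[int, list[dict]] = {}
    for ann in annotations:
        by_frame.setdefault(ann["frame"], []).append(
            {"label": ann["label"], "points": ann["points"]}
        )
    # JSON object keys are strings, so frame numbers are written as "0", "1", ...
    return json.dumps({"frames": by_frame}, indent=2, sort_keys=True)


def build_export_archive(annotations: list[dict]) -> bytes:
    """Package the converted annotations into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("annotations.json", to_simple_json(annotations))
    return buffer.getvalue()
```

Building the archive in memory keeps the sketch short. A worker handling large datasets would more likely write to temporary storage, as the flow above suggests.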
API Design
CVAT uses RESTful API design.
Endpoints Structure
Authentication
- Session authentication: For browser-based access
- Token authentication: For API/SDK access (being deprecated)
- Access tokens: New token system with expiration
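For token authentication, a client attaches the token to each request. The sketch below builds (but does not send) such a request with the standard library. It assumes the `Authorization: Token <key>` header scheme that Django REST Framework's token authentication uses by default. The helper name, base URL, and path are illustrative assumptions.

```python
import urllib.request


def build_token_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying a token in the Authorization header.

    Assumes the "Token <key>" scheme; nothing is sent over the network.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Browser sessions rely on the session cookie instead and do not need this header.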
Frontend Architecture
State Management
CVAT uses Redux for state management.
Component Structure
Security
Authentication Flow
- User logs in with credentials
- Django creates session
- Session ID stored in cookie
- Cookie sent with each request
- Backend validates session
- OPA evaluates permissions
- Request allowed or denied
Authorization (OPA)
OPA policies define permissions.
Scalability
CVAT can scale horizontally:
- Frontend: Served statically, can use CDN
- Backend: Multiple server instances behind load balancer
- Workers: Scale worker count based on queue depth
- Database: PostgreSQL replication and read replicas
- Cache: Redis Cluster or Kvrocks sharding
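Scaling workers on queue depth comes down to a small calculation. The following sketch is one possible rule, not a CVAT feature: the function name, the `jobs_per_worker` target, and the default bounds are all hypothetical.

```python
import math


def desired_worker_count(
    queue_depth: int,
    jobs_per_worker: int,
    min_workers: int = 1,
    max_workers: int = 10,
) -> int:
    """Return how many workers to run for the current queue depth.

    Aims for at most jobs_per_worker pending jobs per worker, clamped
    to the [min_workers, max_workers] range.
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    needed = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, needed))
```

The lower bound keeps at least one worker ready for new jobs, and the upper bound caps resource use during bursts.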
Development Workflow
- Clone repository
- Start development environment
- Make changes to code
  - Frontend: Hot-reloads automatically
  - Backend: Restart container or use debugger
- Run tests
- Submit pull request
Deployment
CVAT supports multiple deployment options:
- Docker Compose: Simple single-server deployment
- Kubernetes/Helm: Production-grade orchestration
- Cloud: AWS, GCP, Azure with managed services
Next Steps
- Review the code style guidelines
- Learn about testing
- Read the pull request guide
- Explore the API documentation