What is SLAM?
Simultaneous Localization and Mapping (SLAM) is a computational problem where a robot or autonomous system must build a map of an unknown environment while simultaneously keeping track of its location within that environment. ORB-SLAM3 is a real-time SLAM library that performs Visual, Visual-Inertial, and Multi-Map SLAM with monocular, stereo, and RGB-D cameras.
Core Architecture
ORB-SLAM3 consists of four main parallel threads that work together to achieve robust tracking and mapping:
Tracking
Processes every frame to compute the camera pose by matching features with the local map. Decides when to insert keyframes.
Local Mapping
Manages the local map and performs local bundle adjustment to refine the camera poses and 3D point positions.
Loop Closing
Detects loops with every new keyframe and performs pose graph optimization when a loop is found.
Map Management
Handles multiple maps, map merging, and atlas management for seamless multi-session SLAM.
System Components
ORB Feature Extraction
ORB-SLAM3 uses ORB (Oriented FAST and Rotated BRIEF) features for visual tracking:
- Scale invariance: Multi-scale pyramid for robust feature detection
- Rotation invariance: Oriented features handle camera rotation
- Computational efficiency: Real-time performance on standard CPUs
Place Recognition
The system uses DBoW2 (Bag of Binary Words) for:
- Loop closure detection
- Relocalization after tracking loss
- Map matching for multi-map scenarios
Graph Optimization
ORB-SLAM3 leverages g2o for non-linear optimization:
- Local bundle adjustment in the mapping thread
- Pose graph optimization after loop closure
- Full bundle adjustment for map refinement
Thread Architecture
The Tracking thread runs in the main execution thread, while Local Mapping, Loop Closing, and Viewer run in separate threads for parallel processing.
Key Features
Visual-Only Tracking
Robust monocular, stereo, and RGB-D SLAM without inertial sensors. Achieves accurate 6-DOF camera pose estimation using visual features alone.
Visual-Inertial Fusion
Integrates IMU measurements at frame rate for improved robustness, especially during fast motion, occlusions, or texture-poor environments.
Multi-Map System
Creates and manages multiple maps in the same session. Automatically merges maps when common areas are detected.
Map Reuse
Supports loading previously created maps and continuing SLAM in known environments.
Tracking States
The system can be in one of several tracking states, managed through the System class:
- OK: Successfully tracking the current map
- LOST: Tracking lost, attempting relocalization
- RECENTLY_LOST: Tracking recently lost, using motion model
- NOT_INITIALIZED: System not yet initialized
Operating Modes
SLAM Mode (Default)
Full SLAM: the system tracks the camera while building and refining the map, inserting keyframes and closing loops.
Localization-Only Mode
Tracking runs against a previously built map, which is left unchanged; useful for navigating an already-mapped environment.
System Initialization
The System class is the main entry point, defined in include/System.h:83-265. Its constructor takes:
- strVocFile: Path to the ORB vocabulary for place recognition
- strSettingsFile: YAML configuration file with camera and sensor parameters
- sensor: Sensor mode (see Sensor Modes)
- bUseViewer: Enable/disable visualization
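For reference, a minimal fragment of the kind of YAML settings file passed as strSettingsFile. Key names follow the EuRoC examples shipped with ORB-SLAM3, but exact keys vary between versions, so treat the values below as placeholders and consult the example configs for your release:

```yaml
%YAML:1.0
# Camera intrinsics (pinhole model) -- placeholder values
Camera.type: "PinHole"
Camera.fx: 458.654
Camera.fy: 457.296
Camera.cx: 367.215
Camera.cy: 248.375
Camera.fps: 20

# ORB extractor settings
ORBextractor.nFeatures: 1000
ORBextractor.scaleFactor: 1.2
ORBextractor.nLevels: 8
```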
Performance Characteristics
Accuracy
Significantly more accurate than other open-source SLAM systems across all sensor configurations.
Robustness
As robust as the best systems available in the literature; handles challenging scenarios such as fast motion and low-texture scenes.
Real-time
Runs in real-time on modern CPUs (e.g., Intel i7) for live applications.
Related Concepts
Sensor Modes
Learn about different sensor configurations supported by ORB-SLAM3
Multi-Map System
Understand the Atlas and multi-map capabilities
Camera Models
Explore pinhole and fisheye camera models
IMU Integration
Deep dive into visual-inertial fusion
Next Steps
Choose Your Sensor Mode
Review the Sensor Modes page to understand which configuration matches your hardware.
Configure Your Camera
Set up your camera calibration file following the Camera Models guide.
Build the System
Follow the Building from Source guide to compile ORB-SLAM3.