Dataset Overview
The KITTI dataset provides:
- Stereo camera setup (rectified grayscale, 10 Hz)
- Large-scale outdoor environments (urban, residential, highway)
- Ground truth from GPS/IMU (sequences 00-10)
- 22 stereo sequences for the odometry benchmark
- Challenging conditions (moving objects, lighting changes)
Key Characteristics
Autonomous Driving
Real-world driving scenarios with moving vehicles and pedestrians.
Large Scale
Long sequences covering several kilometers.
Rectified Stereo
Pre-rectified images ready for stereo processing.
GPS Ground Truth
Accurate position data for evaluation.
Download Instructions
Visit KITTI website
Download sequences
You’ll need:
- Left camera images (grayscale)
- Right camera images (grayscale) - for stereo
- Ground truth poses - for evaluation
- Calibration files
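Once the files above are unpacked, frames are usually addressed by a zero-padded index. The sketch below assumes the standard KITTI odometry layout, with left images in `image_0/` and right images in `image_1/` inside each sequence folder. `FramePath` and `ListStereoPairs` are hypothetical helper names for illustration, not functions from ORB-SLAM3.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: path of frame `index` for one camera.
// Assumed layout: <sequence_dir>/image_0/000000.png (left),
//                 <sequence_dir>/image_1/000000.png (right).
std::string FramePath(const std::string& sequence_dir, int camera, std::size_t index) {
    std::ostringstream ss;
    ss << sequence_dir << "/image_" << camera << "/"
       << std::setfill('0') << std::setw(6) << index << ".png";
    return ss.str();
}

struct StereoPair {
    std::string left;
    std::string right;
};

// Lists the first `frame_count` left/right pairs of a sequence, in order.
std::vector<StereoPair> ListStereoPairs(const std::string& sequence_dir,
                                        std::size_t frame_count) {
    std::vector<StereoPair> pairs;
    pairs.reserve(frame_count);
    for (std::size_t i = 0; i < frame_count; ++i) {
        pairs.push_back({FramePath(sequence_dir, 0, i), FramePath(sequence_dir, 1, i)});
    }
    return pairs;
}
```

For monocular runs only the `left` member of each pair is needed.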
Running Monocular Examples
Process KITTI sequences using only the left camera.
Code Structure
The monocular KITTI example (mono_kitti.cc:108) reads images sequentially.
Sequence-Specific Configurations
The sequences fall into three calibration groups:
- Sequences 00-02
- Sequence 03
- Sequences 04-12
Different KITTI sequences use different camera calibrations. Always use the appropriate YAML file.
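Picking the wrong calibration is an easy mistake when scripting runs over many sequences. The sketch below maps a sequence number to a settings file by following the three groups listed above. The file names are an assumption about how the YAML files are named in your checkout, and `SettingsFileForSequence` is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

// Maps a KITTI odometry sequence number to a settings file name.
// File names are assumed; adjust them to match your configuration folder.
std::string SettingsFileForSequence(int sequence) {
    if (sequence >= 0 && sequence <= 2) return "KITTI00-02.yaml";
    if (sequence == 3) return "KITTI03.yaml";
    if (sequence >= 4 && sequence <= 12) return "KITTI04-12.yaml";
    // Sequences outside the listed groups have no known calibration here.
    throw std::out_of_range("no calibration group listed for this sequence");
}
```

Throwing for unlisted sequences is a deliberate choice in this sketch: failing early is safer than silently running with the wrong intrinsics.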
Running Stereo Examples
Process both left and right cameras for accurate depth estimation.
Why Stereo for KITTI?
Accurate Scale
Stereo provides metric scale without scale drift, which is critical for driving applications.
Better Initialization
Instant depth from first frame enables faster startup.
Robust Tracking
Depth constraints improve tracking in feature-poor areas (sky, roads).
Direct Comparison
Easier evaluation against GPS/IMU ground truth.
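The metric-scale argument rests on one relation for rectified stereo pairs: depth equals focal length times baseline, divided by disparity. The sketch below illustrates that relation. `DepthFromDisparity` is a hypothetical helper, and the numbers in the usage note are illustrative rather than taken from a KITTI calibration file.

```cpp
#include <stdexcept>

// Depth in metres of a point with horizontal disparity `disparity_px`
// in a rectified stereo pair: Z = (fx * baseline) / disparity.
// `fx_times_baseline` is the focal length (pixels) times the baseline (metres).
double DepthFromDisparity(double fx_times_baseline, double disparity_px) {
    if (disparity_px <= 0.0) {
        throw std::invalid_argument("disparity must be positive");
    }
    return fx_times_baseline / disparity_px;
}
```

With an illustrative product of 400 pixel-metres, a 20-pixel disparity corresponds to a depth of 20 m. Halving the disparity doubles the depth, which is why distant points carry less precise depth.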
Image Loading
The stereo example loads pre-rectified image pairs.
Typical Use Cases
KITTI sequences represent different driving scenarios:
Urban Sequences
Representative sequences: 00 and 05. The figures below are for Sequence 00.
- Environment: Urban neighborhood
- Length: 4541 frames (about 7.5 minutes at 10 Hz)
- Features: Buildings, parked cars, trees
- Difficulty: Moderate
Highway Sequences
Representative sequences: 01 and 02. The figures below are for Sequence 01.
- Environment: Highway
- Length: 1101 frames
- Features: Guard rails, distant background
- Difficulty: Moderate (high speed)
Challenging Sequences
Representative sequences: 08 and 10. The figures below are for Sequence 08.
- Environment: Residential
- Length: 4071 frames
- Features: Complex loops, similar structures
- Difficulty: High (requires loop closure)
Output and Evaluation
Trajectory Format
The monocular KITTI example saves the keyframe trajectory in TUM format by default (mono_kitti.cc:151).
Converting to KITTI Format
For submission to the KITTI benchmark, convert to KITTI pose format.
Evaluation with Ground Truth
- Translation error (%)
- Rotation error (deg/m)
- Success rate
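The TUM-to-KITTI conversion mentioned under Converting to KITTI Format is a change of pose representation. A TUM line stores `timestamp tx ty tz qx qy qz qw`, while a KITTI pose line stores the 12 entries of the 3x4 matrix [R | t] in row-major order. The sketch below converts a single line. `TumLineToKittiPose` is a hypothetical helper, not part of ORB-SLAM3 or the KITTI devkit.

```cpp
#include <array>
#include <cmath>
#include <sstream>
#include <stdexcept>
#include <string>

// Converts one TUM line ("timestamp tx ty tz qx qy qz qw") into the 12 values
// of a KITTI pose line: the 3x4 matrix [R | t] in row-major order.
std::array<double, 12> TumLineToKittiPose(const std::string& tum_line) {
    std::istringstream in(tum_line);
    double stamp, tx, ty, tz, qx, qy, qz, qw;
    if (!(in >> stamp >> tx >> ty >> tz >> qx >> qy >> qz >> qw)) {
        throw std::invalid_argument("expected 8 numeric fields");
    }
    // Normalize so the rotation stays orthonormal despite rounding in the file.
    const double norm = std::sqrt(qx * qx + qy * qy + qz * qz + qw * qw);
    if (norm == 0.0) throw std::invalid_argument("zero-length quaternion");
    qx /= norm; qy /= norm; qz /= norm; qw /= norm;

    // Standard unit-quaternion to rotation-matrix formula; the timestamp is dropped.
    return {{
        1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw), tx,
        2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw), ty,
        2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy), tz
    }};
}
```

KITTI pose files carry no timestamps and contain one line per frame, so a trajectory that only holds keyframes does not map one-to-one onto the benchmark format.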
Performance Considerations
Speed vs Accuracy
- Real-time Processing
- High Accuracy
For 10 Hz real-time processing:
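At 10 Hz each frame has a budget of 100 ms. The sketch below shows the pacing arithmetic: how long to wait before feeding the next frame, and what fraction of frames stayed within budget. Both function names are hypothetical, and the tracking times are assumed to be measured by the caller.

```cpp
#include <algorithm>
#include <vector>

// Seconds to wait before feeding the next frame so playback matches the
// recording rate. Returns 0 when tracking already exceeded the frame period.
double SecondsToWait(double frame_period, double tracking_time) {
    return std::max(0.0, frame_period - tracking_time);
}

// Fraction of frames whose tracking time fits within the frame period.
// Returns 0 for an empty list.
double FractionWithinBudget(const std::vector<double>& tracking_times,
                            double frame_period) {
    if (tracking_times.empty()) return 0.0;
    const auto within = std::count_if(
        tracking_times.begin(), tracking_times.end(),
        [frame_period](double t) { return t <= frame_period; });
    return static_cast<double>(within) / static_cast<double>(tracking_times.size());
}
```

A fraction well below 1 suggests the configuration is too heavy for real-time playback on the machine in question.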
Hardware Requirements
KITTI sequences are long and computationally demanding. Recommended hardware:
- CPU: Intel i7 or better
- RAM: 8GB minimum
- Storage: 100GB for full dataset
Configuration Files
The KITTI configuration files use a pinhole camera model.
Troubleshooting
Scale Drift in Monocular
Monocular SLAM has no absolute scale:
- Scale drift is expected over long sequences
- Use stereo mode for metric scale
- Loop closures help reduce drift
Tracking Lost on Highway
Highways have fewer features:
- Increase ORBextractor.nFeatures
- Use stereo mode for depth constraints
- Guard rails and lane markings provide trackable features
Moving Object Interference
Cars and pedestrians can affect SLAM:
- ORB-SLAM3 is designed to handle outliers
- Most moving objects are automatically rejected
- Some sequences may have temporary tracking issues
Loop Closure Not Working
For sequences with loops (e.g., 08):
- Ensure vocabulary loaded correctly
- Allow sufficient processing time
- Check that place recognition is enabled
Comparing Results
Expected Performance
Typical results on KITTI sequences:
| Sequence | Length | Translation Error | Rotation Error |
|---|---|---|---|
| 00 | 3.7 km | 0.75% | 0.003 deg/m |
| 01 | 2.5 km | 1.2% | 0.004 deg/m |
| 02 | 5.1 km | 0.9% | 0.003 deg/m |
| 05 | 2.2 km | 0.8% | 0.003 deg/m |
Results shown are for stereo mode. Monocular mode typically has higher drift.
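To make the translation error percentage concrete, the sketch below computes a simplified version: the error in end-to-end displacement divided by the ground-truth path length. This is not the official KITTI metric, which averages relative errors over subsequences of several lengths. The sketch also assumes both trajectories are expressed in the same frame and have one position per frame. All names are hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Vec3 = std::array<double, 3>;

// Euclidean distance between two positions (metres).
double Distance(const Vec3& a, const Vec3& b) {
    const double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Total distance travelled along a sequence of positions (metres).
double PathLength(const std::vector<Vec3>& positions) {
    double length = 0.0;
    for (std::size_t i = 1; i < positions.size(); ++i) {
        length += Distance(positions[i - 1], positions[i]);
    }
    return length;
}

// Simplified proxy: end-to-end displacement error as a percentage of the
// ground-truth path length. Not the official KITTI evaluation.
double EndpointDriftPercent(const std::vector<Vec3>& ground_truth,
                            const std::vector<Vec3>& estimate) {
    if (ground_truth.size() < 2 || ground_truth.size() != estimate.size()) {
        throw std::invalid_argument("need two equally sized trajectories");
    }
    const double length = PathLength(ground_truth);
    if (length <= 0.0) throw std::invalid_argument("ground truth does not move");
    Vec3 gt_disp{}, est_disp{};
    for (std::size_t k = 0; k < 3; ++k) {
        gt_disp[k] = ground_truth.back()[k] - ground_truth.front()[k];
        est_disp[k] = estimate.back()[k] - estimate.front()[k];
    }
    return 100.0 * Distance(est_disp, gt_disp) / length;
}
```

For example, ending 0.2 m away from the true end point after a straight 20 m drive gives 1%.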
Advanced: Custom KITTI Data
To use your own KITTI-format data:
Calibrate camera
Create a custom YAML file with your camera parameters.
See Camera Configuration.
Next Steps
EuRoC Dataset
Try indoor sequences with IMU
Camera Calibration
Calibrate your own stereo rig
Loop Closure
Understanding place recognition
Custom Datasets
Process your own sequences