How recognition works
The recognition pipeline processes each camera frame in the following steps:
- Face detection: Detect faces in the frame using ML Kit or MediaPipe
- Embedding generation: Convert each detected face into a vector embedding using FaceNet
- Vector search: Find the nearest neighbor in the database using cosine similarity
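- Threshold check: If the cosine distance to the nearest neighbor is below a threshold (that is, the embeddings are similar enough), display the person’s name; otherwise the face is labeled “NOT RECOGNIZED”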
- Spoof detection (optional): Check if the face is real or a spoof (photo, video, etc.)
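The steps above can be sketched as a chain of small functions. This is a minimal illustration, not the app's code: `DetectedFace`, `Match`, `RecognitionResult`, `RecognitionPipeline`, and the `maxCosineDistance` default are all hypothetical names and values, and each stage is passed in as a plain function so that any detector, embedding model, or vector store could be plugged in.

```kotlin
// Hypothetical types for illustration only; the app's real classes may differ.
data class DetectedFace(val id: Int)
data class Match(val name: String, val cosineDistance: Float)
data class RecognitionResult(val label: String, val isSpoof: Boolean?)

class RecognitionPipeline(
    private val detectFaces: (ByteArray) -> List<DetectedFace>, // e.g. ML Kit or MediaPipe
    private val embed: (DetectedFace) -> FloatArray,            // e.g. FaceNet
    private val nearestNeighbor: (FloatArray) -> Match?,        // vector database lookup
    private val isSpoof: ((DetectedFace) -> Boolean)? = null,   // optional stage
    private val maxCosineDistance: Float = 0.6f                 // placeholder threshold, an assumption
) {
    fun process(frame: ByteArray): List<RecognitionResult> =
        detectFaces(frame).map { face ->
            val match = nearestNeighbor(embed(face))
            // Accept the nearest neighbor only if it is close enough.
            val label = if (match != null && match.cosineDistance < maxCosineDistance) {
                match.name
            } else {
                "NOT RECOGNIZED"
            }
            // isSpoof stays null when the optional spoof stage is not configured.
            RecognitionResult(label, isSpoof?.invoke(face))
        }
}
```

Passing the stages as functions keeps the sketch independent of any particular library; in practice the threshold would be tuned against real enrollment data.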
Using the camera screen
The main camera screen is your primary interface for face recognition.
Camera controls
- Switch camera: Tap the camera switch icon to toggle between front and rear cameras
- Face list: Tap the face icon to view and manage people in your database
Recognition display
When faces are detected:
- Bounding boxes: Blue rounded rectangles appear around detected faces
- Name labels: The person’s name is displayed in the center of each bounding box
- “NOT RECOGNIZED”: Shown when a face doesn’t match anyone in the database
Performance metrics
At the bottom of the screen, you’ll see real-time performance metrics.
Recognition pipeline implementation
The FaceDetectionOverlay component handles the camera feed and recognition:
FaceDetectionOverlay.kt
Face embedding generation
The FaceNet model converts face images into embeddings:
FaceNet.kt
Vector search with ObjectBox
FaceNet Android uses ObjectBox for efficient nearest-neighbor search:
- HNSW indexing: Hierarchical Navigable Small World graphs enable fast approximate nearest-neighbor search
- Cosine similarity: Measures the similarity between face embeddings
- Flat search option: For precise (non-approximate) search, you can enable flat search in the configuration
The app re-computes cosine similarity between the query vector and the nearest neighbor to account for lossy compression in the vector database.
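To make the re-scoring step concrete, here is a minimal sketch of cosine similarity between two embeddings, plus an exact (flat) search that scores every stored vector. The function names `cosineSimilarity` and `flatSearch` and the in-memory `Map` are assumptions for illustration; they do not use the ObjectBox API and are not taken from the app's source.

```kotlin
import kotlin.math.sqrt

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero-length vector.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Embeddings must have the same dimension" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    val denominator = sqrt(normA) * sqrt(normB)
    return if (denominator == 0f) 0f else dot / denominator
}

// Exact search: compare the query against every stored embedding
// and return the best-scoring name with its similarity.
fun flatSearch(query: FloatArray, stored: Map<String, FloatArray>): Pair<String, Float>? =
    stored.entries
        .map { (name, vector) -> name to cosineSimilarity(query, vector) }
        .maxByOrNull { it.second }
```

Re-scoring an approximate result amounts to calling `cosineSimilarity(query, candidateVector)` on the uncompressed candidate returned by the index, so the threshold check runs on an exact score.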
Recognition metrics
The RecognitionMetrics data class tracks performance:
DataModels.kt
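As a rough illustration of how per-stage timings could be collected, the sketch below wraps each stage in a small timing helper. `StageTimings`, its field names, and `timed` are hypothetical and are not the contents of DataModels.kt; the actual fields of RecognitionMetrics may differ.

```kotlin
// Hypothetical timing record; field names are assumptions for illustration.
data class StageTimings(
    val faceDetectionMs: Long,
    val embeddingMs: Long,
    val vectorSearchMs: Long,
    val spoofDetectionMs: Long
) {
    // Total time spent across all measured stages.
    val totalMs: Long
        get() = faceDetectionMs + embeddingMs + vectorSearchMs + spoofDetectionMs
}

// Runs a block and returns its result together with the elapsed time in milliseconds.
inline fun <T> timed(block: () -> T): Pair<T, Long> {
    val start = System.nanoTime()
    val result = block()
    return result to (System.nanoTime() - start) / 1_000_000
}
```

A stage would then be measured as `val (faces, detectionMs) = timed { detector(frame) }`, where `detector` stands in for whichever face detector is configured.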
Spoof detection
The app includes anti-spoofing detection to identify whether a detected face is real or a spoof (photo, video, 3D model):
- Two models operate on different scales of the same image
- If a face is detected as a spoof, the label shows:
Name (Spoof: 0.95)
- This helps prevent photo-based attacks
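The label format shown above could be produced by a small helper like the one below. This is only a sketch: `labelWithSpoof` and its parameters are hypothetical, and how the two models' outputs are combined into a single score is not covered here.

```kotlin
import java.util.Locale

// Appends the spoof score to the name only when the face was flagged as a spoof.
// Locale.US keeps the decimal separator a dot regardless of device language.
fun labelWithSpoof(name: String, isSpoof: Boolean, spoofScore: Float): String =
    if (isSpoof) {
        String.format(Locale.US, "%s (Spoof: %.2f)", name, spoofScore)
    } else {
        name
    }
```

For example, `labelWithSpoof("Name", true, 0.95f)` yields the label `Name (Spoof: 0.95)`, while a face judged real keeps its plain name.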
Troubleshooting
No faces detected
- Ensure adequate lighting
- Position your face clearly in the camera view
- Check that camera permissions are granted
Poor recognition accuracy
- Add more photos of the person from different angles
- Ensure photos are well-lit and clear
- Consider using flat search instead of HNSW for smaller datasets
- Try the 512-dimensional FaceNet model instead of the 128-dimensional version
Slow performance
- HNSW search is faster than flat search for larger datasets
- GPU acceleration is enabled by default for FaceNet inference
- Reduce the number of embeddings in the database if performance is critical