FaceNet Android uses the ObjectBox vector database to store face embeddings and perform nearest-neighbor search. You can choose between approximate (ANN) and precise (flat) search modes.

Search methods

HNSW approximate search (default)

ObjectBox uses Hierarchical Navigable Small World (HNSW) graphs for approximate nearest neighbor (ANN) search. This method:
  • Searches quickly by traversing graph connections
  • Returns approximate nearest neighbors
  • Scales well with large datasets (1000+ faces)
  • Uses cosine distance for similarity comparison

Flat search (precise)

Flat search performs a linear scan of all embeddings to find the exact nearest neighbor. This method:
  • Computes cosine similarity with every record
  • Returns the true nearest neighbor
  • Parallelizes across 4 threads for better performance
  • Becomes slower as dataset size grows
For datasets under 500 faces, flat search provides better accuracy with acceptable performance. For larger datasets, HNSW search is recommended.

Configuring search mode

The search mode is configured in FaceDetectionOverlay.kt. To enable flat search, modify the flatSearch parameter:
FaceDetectionOverlay.kt
@SuppressLint("ViewConstructor")
@ExperimentalGetImage
class FaceDetectionOverlay(
    private val lifecycleOwner: LifecycleOwner,
    private val context: Context,
    private val viewModel: DetectScreenViewModel,
) : FrameLayout(context) {
    
    // Set to true for precise search, false for HNSW ANN
    private val flatSearch: Boolean = false
    
    // ...
}
Changing the search mode requires recompiling the app. This is a compile-time constant, not a runtime setting.

Implementation details

The search logic is implemented in ImagesVectorDB.kt:17-61:
ImagesVectorDB.kt
fun getNearestEmbeddingPersonName(
    embedding: FloatArray,
    flatSearch: Boolean,
): FaceImageRecord? {
    if (flatSearch) {
        // Flat search: parallel linear scan
        val allRecords = imagesBox.all
        val numThreads = 4
        // Guard against a zero batch size when there are fewer records than threads
        val batchSize = (allRecords.size / numThreads).coerceAtLeast(1)
        val batches = allRecords.chunked(batchSize)
        val results = runBlocking {
            batches.map { batch ->
                async(Dispatchers.Default) {
                    var bestMatch: FaceImageRecord? = null
                    var bestDistance = Float.NEGATIVE_INFINITY
                    for (record in batch) {
                        val distance = cosineDistance(embedding, record.faceEmbedding)
                        if (distance > bestDistance) {
                            bestDistance = distance
                            bestMatch = record
                        }
                    }
                    Pair(bestMatch, bestDistance)
                }
            }.awaitAll()
        }
        return results.maxByOrNull { it.second }?.first
    }
    
    // HNSW ANN search
    return imagesBox
        .query(FaceImageRecord_.faceEmbedding.nearestNeighbors(embedding, 10))
        .build()
        .findWithScores()
        .map { it.get() }
        .firstOrNull()
}

Flat search parallelization

The flat search implementation divides the records into batches and processes them concurrently:
  1. Splits all records into four roughly equal chunks (a smaller remainder chunk may be left over)
  2. Launches coroutines on Dispatchers.Default
  3. Each coroutine finds the best match in its batch
  4. Returns the overall best match across all batches
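The same divide-and-reduce pattern can be sketched outside the app with plain executor threads instead of coroutines (the app itself uses async on Dispatchers.Default; the Record type and helper names below are hypothetical stand-ins, not the app's classes):

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for a stored FaceImageRecord.
data class Record(val name: String, val vec: FloatArray)

fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    var dot = 0f; var magA = 0f; var magB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        magA += a[i] * a[i]
        magB += b[i] * b[i]
    }
    return dot / (kotlin.math.sqrt(magA) * kotlin.math.sqrt(magB))
}

// Divide records into batches, find the best match per batch in parallel,
// then reduce the per-batch winners to the overall best match.
fun findBestMatchFlat(query: FloatArray, records: List<Record>, numThreads: Int = 4): Record? {
    if (records.isEmpty()) return null
    // Guard against a zero batch size when there are fewer records than threads.
    val batchSize = (records.size / numThreads).coerceAtLeast(1)
    val pool = Executors.newFixedThreadPool(numThreads)
    try {
        val futures = records.chunked(batchSize).map { batch ->
            pool.submit(Callable {
                // Each task scans only its own batch.
                batch.map { it to cosineSimilarity(query, it.vec) }.maxByOrNull { it.second }
            })
        }
        return futures.mapNotNull { it.get() }.maxByOrNull { it.second }?.first
    } finally {
        pool.shutdown()
    }
}

fun main() {
    val records = listOf(
        Record("alice", floatArrayOf(1f, 0f)),
        Record("bob", floatArrayOf(0f, 1f)),
        Record("carol", floatArrayOf(0.9f, 0.1f)),
    )
    println(findBestMatchFlat(floatArrayOf(1f, 0f), records)?.name)  // alice
}
```

Swapping the executor for async(Dispatchers.Default) plus awaitAll, as the app does, produces the same result; the executor variant merely avoids the kotlinx.coroutines dependency.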

HNSW search configuration

The HNSW index is configured in DataModels.kt:18-21:
DataModels.kt
@HnswIndex(
    dimensions = 512,  // Must match FaceNet model output
    distanceType = VectorDistanceType.COSINE,
) var faceEmbedding: FloatArray = floatArrayOf()
The search requests 10 nearest neighbors rather than just one:
FaceImageRecord_.faceEmbedding.nearestNeighbors(embedding, 10)
Requesting more neighbors than strictly needed and taking the first result improves result quality: the requested neighbor count plays a role similar to HNSW's "ef" search parameter, trading a little performance for better recall.

Cosine distance calculation

Both search modes use cosine distance for similarity comparison in ImagesVectorDB.kt:63-78:
ImagesVectorDB.kt
private fun cosineDistance(
    x1: FloatArray,
    x2: FloatArray,
): Float {
    var mag1 = 0.0f
    var mag2 = 0.0f
    var product = 0.0f
    for (i in x1.indices) {
        mag1 += x1[i] * x1[i]
        mag2 += x2[i] * x2[i]
        product += x1[i] * x2[i]
    }
    mag1 = kotlin.math.sqrt(mag1)
    mag2 = kotlin.math.sqrt(mag2)
    return product / (mag1 * mag2)
}
Despite its name, this function computes cosine similarity, which ranges from -1 to 1; a value of 1 indicates embeddings pointing in the same direction (an exact match).
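The range is easy to verify with hand-checkable 2-dimensional vectors; the standalone cosine helper below mirrors the loop in cosineDistance above:

```kotlin
// Mirrors the cosineDistance implementation: dot product over the
// product of magnitudes, i.e. cosine similarity.
fun cosine(x1: FloatArray, x2: FloatArray): Float {
    var mag1 = 0f; var mag2 = 0f; var product = 0f
    for (i in x1.indices) {
        mag1 += x1[i] * x1[i]
        mag2 += x2[i] * x2[i]
        product += x1[i] * x2[i]
    }
    return product / (kotlin.math.sqrt(mag1) * kotlin.math.sqrt(mag2))
}

fun main() {
    println(cosine(floatArrayOf(1f, 0f), floatArrayOf(1f, 0f)))   // 1.0  (identical direction)
    println(cosine(floatArrayOf(1f, 0f), floatArrayOf(0f, 1f)))   // 0.0  (orthogonal)
    println(cosine(floatArrayOf(1f, 0f), floatArrayOf(-1f, 0f)))  // -1.0 (opposite direction)
}
```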

Performance metrics

The app displays vector search timing on the main screen. Typical performance:
Search mode         100 faces    500 faces    1000 faces
HNSW ANN            ~2-5ms       ~3-7ms       ~5-10ms
Flat (4 threads)    ~8-15ms      ~35-60ms     ~80-150ms
Flat search performance degrades linearly with dataset size. With 5000+ faces, flat search may cause noticeable lag in real-time recognition.
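These numbers vary by device. A rough standalone sketch for measuring single-threaded flat-scan latency on synthetic 512-dimensional embeddings (timeFlatScan is a hypothetical helper, not part of the app):

```kotlin
import kotlin.random.Random
import kotlin.system.measureTimeMillis

// Time one linear scan over `count` random embeddings, using the same
// cosine computation as cosineDistance(): dot product over magnitudes.
fun timeFlatScan(count: Int, dims: Int = 512): Long {
    val rng = Random(42)
    // Synthetic stand-ins for stored face embeddings.
    val db = List(count) { FloatArray(dims) { rng.nextFloat() } }
    val query = FloatArray(dims) { rng.nextFloat() }
    return measureTimeMillis {
        db.maxByOrNull { rec ->
            var dot = 0f; var m1 = 0f; var m2 = 0f
            for (i in 0 until dims) {
                dot += query[i] * rec[i]
                m1 += query[i] * query[i]
                m2 += rec[i] * rec[i]
            }
            dot / (kotlin.math.sqrt(m1) * kotlin.math.sqrt(m2))
        }
    }
}

fun main() {
    for (count in listOf(100, 500, 1000)) {
        println("$count records: ${timeFlatScan(count)}ms")
    }
}
```

On a desktop JVM these scans are much faster than on a phone; the point of the sketch is the linear growth with record count, not the absolute numbers.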

Choosing the right method

Use HNSW (flatSearch = false) when:
  • You have 500+ faces in the database
  • You need real-time recognition performance
  • Slight accuracy tradeoffs are acceptable
Use flat search (flatSearch = true) when:
  • You need maximum accuracy
  • Your dataset is under 500 faces
  • You can tolerate 50-100ms search latency
