The ImagesVectorDB class provides vector database operations for storing and searching face embeddings using ObjectBox with HNSW (Hierarchical Navigable Small World) indexing.

Constructor

The class is annotated with @Single and managed by Koin dependency injection.
ImagesVectorDB()
Uses the ObjectBox store to manage FaceImageRecord entities with vector search capabilities.

Methods

addFaceImageRecord

Adds a new face embedding record to the database.
fun addFaceImageRecord(record: FaceImageRecord)
record
FaceImageRecord
required
Face image record containing person ID, name, and 512-dimensional embedding vector

getNearestEmbeddingPersonName

Finds the most similar face embedding in the database using either ANN (Approximate Nearest Neighbor) or linear search.
fun getNearestEmbeddingPersonName(
    embedding: FloatArray,
    flatSearch: Boolean
): FaceImageRecord?
embedding
FloatArray
required
Query embedding vector (512 dimensions) to search for
flatSearch
Boolean
required
Selects the search strategy:
  • true: Use linear search for precise results (slower, exhaustive search)
  • false: Use the ObjectBox HNSW index for approximate results (faster)
return
FaceImageRecord?
The face record most similar to the query by cosine similarity (an approximate best match when flatSearch = false), or null if no records exist

removeFaceRecordsWithPersonID

Deletes all face embeddings associated with a specific person.
fun removeFaceRecordsWithPersonID(personID: Long)
personID
Long
required
Person ID to remove all face records for
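The contract of the three methods above can be illustrated with a minimal in-memory sketch. Everything in it is hypothetical: FakeFaceRecord and InMemoryImagesDB are illustrative names, not part of the library, and the lookup is always exhaustive because there is no ObjectBox index behind it. It only shows the behaviors the reference describes: several records may share a person ID, removal deletes all of them, and the lookup returns null when the store is empty.

```kotlin
import kotlin.math.sqrt

// Hypothetical simplified record; the real FaceImageRecord is an ObjectBox entity.
class FakeFaceRecord(val personID: Long, val personName: String, val faceEmbedding: FloatArray)

// Hypothetical in-memory stand-in mirroring the three documented method shapes.
class InMemoryImagesDB {
    private val records = mutableListOf<FakeFaceRecord>()

    fun addFaceImageRecord(record: FakeFaceRecord) {
        records.add(record)
    }

    // Always exhaustive: returns the record with the highest cosine similarity, or null if empty.
    fun getNearestEmbeddingPersonName(embedding: FloatArray): FakeFaceRecord? =
        records.maxByOrNull { cosine(embedding, it.faceEmbedding) }

    // Removes every record stored under the given person ID.
    fun removeFaceRecordsWithPersonID(personID: Long) {
        records.removeAll { it.personID == personID }
    }

    private fun cosine(a: FloatArray, b: FloatArray): Float {
        var magA = 0.0f
        var magB = 0.0f
        var dot = 0.0f
        for (i in a.indices) {
            magA += a[i] * a[i]
            magB += b[i] * b[i]
            dot += a[i] * b[i]
        }
        return dot / (sqrt(magA) * sqrt(magB))
    }
}

fun main() {
    val db = InMemoryImagesDB()
    println(db.getNearestEmbeddingPersonName(floatArrayOf(1f, 0f)))  // null: no records yet

    db.addFaceImageRecord(FakeFaceRecord(1L, "Alice", floatArrayOf(1f, 0f)))
    db.addFaceImageRecord(FakeFaceRecord(1L, "Alice", floatArrayOf(0.9f, 0.1f)))
    db.addFaceImageRecord(FakeFaceRecord(2L, "Bob", floatArrayOf(0f, 1f)))
    println(db.getNearestEmbeddingPersonName(floatArrayOf(1f, 0.2f))?.personName)  // Alice

    db.removeFaceRecordsWithPersonID(1L)  // drops both of Alice's records
    println(db.getNearestEmbeddingPersonName(floatArrayOf(1f, 0.2f))?.personName)  // Bob
}
```

A stand-in like this may be handy in unit tests that cannot load the ObjectBox native library, but it says nothing about the real class's performance.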

Search algorithms

When flatSearch = false, the search uses ObjectBox’s HNSW index for fast approximate nearest-neighbor search:
imagesBox
    .query(FaceImageRecord_.faceEmbedding.nearestNeighbors(embedding, 10))
    .build()
    .findWithScores()
    .map { it.get() }
    .firstOrNull()
maxResultCount
Int
default:"10"
Maximum number of candidates to retrieve. It also serves as the HNSW “ef” parameter that controls search quality: higher values improve quality at the cost of performance.
distanceType
VectorDistanceType
default:"COSINE"
Distance metric used by the index. Cosine distance is used here, as configured in FaceImageRecord.
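One point that is easy to miss: the score attached to each result of findWithScores() is a distance (lower is better), while the linear search path ranks by cosine similarity (higher is better). The sketch below shows one way to convert between the two. It rests on the assumption that ObjectBox's COSINE distance equals 1 minus cosine similarity (range 0.0 to 2.0), as its vector search documentation describes. Both function names are hypothetical, and the 0.3 default is simply the threshold used in the usage example on this page.

```kotlin
// Assumption: an ObjectBox COSINE score is a distance equal to 1 - cosine similarity,
// so 0.0 means same direction, 1.0 orthogonal, 2.0 opposite.
fun cosineDistanceToSimilarity(distance: Double): Double = 1.0 - distance

// Hypothetical helper: applies a similarity threshold to a distance score.
fun isMatch(distance: Double, minSimilarity: Double = 0.3): Boolean =
    cosineDistanceToSimilarity(distance) > minSimilarity

fun main() {
    println(cosineDistanceToSimilarity(0.0))  // 1.0  (same direction)
    println(cosineDistanceToSimilarity(1.0))  // 0.0  (orthogonal)
    println(cosineDistanceToSimilarity(2.0))  // -1.0 (opposite)
    println(isMatch(0.5))                     // true: similarity 0.5 exceeds 0.3
}
```

If this assumption holds for the ObjectBox version in use, the score from the query result could replace a second similarity computation on the caller's side.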
When flatSearch = true, the method performs an exhaustive linear search across all records, processing them in parallel.
numThreads
Int
default:"4"
Number of parallel threads for batch processing
algorithm
String
Splits all records into batches, processes the batches in parallel using coroutines on Dispatchers.Default, then returns the record with the maximum cosine similarity.
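The batch-and-reduce idea described above can be sketched as follows. This is not the class's actual code: to stay dependency-free it swaps the coroutines on Dispatchers.Default for a java.util.concurrent thread pool, and Record, cosineSimilarity, and nearestByLinearSearch are hypothetical names introduced for illustration.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import kotlin.math.sqrt

// Hypothetical stand-in for FaceImageRecord, holding only the fields this sketch needs.
class Record(val personName: String, val faceEmbedding: FloatArray)

fun cosineSimilarity(x1: FloatArray, x2: FloatArray): Float {
    var mag1 = 0.0f
    var mag2 = 0.0f
    var product = 0.0f
    for (i in x1.indices) {
        mag1 += x1[i] * x1[i]
        mag2 += x2[i] * x2[i]
        product += x1[i] * x2[i]
    }
    return product / (sqrt(mag1) * sqrt(mag2))
}

// Splits the records into up to numThreads batches, finds the best match in each batch
// on a worker thread, then keeps the best (record, similarity) pair across all batches.
fun nearestByLinearSearch(records: List<Record>, query: FloatArray, numThreads: Int = 4): Record? {
    require(numThreads > 0) { "numThreads must be positive" }
    if (records.isEmpty()) return null
    val batchSize = (records.size + numThreads - 1) / numThreads  // ceiling division
    val pool = Executors.newFixedThreadPool(numThreads)
    try {
        val tasks = records.chunked(batchSize).map { batch ->
            Callable<Pair<Record, Float>?> {
                batch.map { it to cosineSimilarity(query, it.faceEmbedding) }
                    .maxByOrNull { it.second }
            }
        }
        return pool.invokeAll(tasks)      // blocks until every batch has finished
            .mapNotNull { it.get() }
            .maxByOrNull { it.second }
            ?.first
    } finally {
        pool.shutdown()
    }
}

fun main() {
    val records = listOf(
        Record("A", floatArrayOf(1f, 0f)),
        Record("B", floatArrayOf(0f, 1f)),
        Record("C", floatArrayOf(0.6f, 0.8f))
    )
    println(nearestByLinearSearch(records, floatArrayOf(0.1f, 1f))?.personName)  // B
}
```

Because each batch reduces to a single best candidate before the final comparison, the merge step only has to compare as many pairs as there are batches.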

Cosine distance calculation

Despite its name, cosineDistance computes the cosine similarity between two embedding vectors, so higher values mean a closer match:
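import kotlin.math.sqrt

private fun cosineDistance(x1: FloatArray, x2: FloatArray): Float {
    var mag1 = 0.0f     // running sum of squares of x1
    var mag2 = 0.0f     // running sum of squares of x2
    var product = 0.0f  // running dot product
    for (i in x1.indices) {
        mag1 += x1[i] * x1[i]
        mag2 += x2[i] * x2[i]
        product += x1[i] * x2[i]
    }
    mag1 = sqrt(mag1)
    mag2 = sqrt(mag2)
    // Dot product over the product of magnitudes; NaN if either vector is all zeros
    return product / (mag1 * mag2)
}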
Returns a value between -1 and 1, where:
  • 1 = identical vectors
  • 0 = orthogonal vectors
  • -1 = opposite vectors
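The three reference values listed above can be checked with a small standalone sketch. It repeats the arithmetic of cosineDistance under the hypothetical name cosineSimilarity and adds two guards that are assumptions of this sketch rather than part of the original function: a length check, and a zero-magnitude check that returns 0 instead of NaN.

```kotlin
import kotlin.math.sqrt

// Same arithmetic as cosineDistance, with hypothetical guards added for
// mismatched lengths and all-zero vectors.
fun cosineSimilarity(x1: FloatArray, x2: FloatArray): Float {
    require(x1.size == x2.size) { "Vectors must have the same length" }
    var mag1 = 0.0f
    var mag2 = 0.0f
    var product = 0.0f
    for (i in x1.indices) {
        mag1 += x1[i] * x1[i]
        mag2 += x2[i] * x2[i]
        product += x1[i] * x2[i]
    }
    val denominator = sqrt(mag1) * sqrt(mag2)
    return if (denominator == 0.0f) 0.0f else product / denominator
}

fun main() {
    val a = floatArrayOf(1f, 0f)
    println(cosineSimilarity(a, floatArrayOf(2f, 0f)))   // 1.0  (same direction, different length)
    println(cosineSimilarity(a, floatArrayOf(0f, 3f)))   // 0.0  (orthogonal)
    println(cosineSimilarity(a, floatArrayOf(-1f, 0f)))  // -1.0 (opposite)
}
```

The first case also shows that magnitude does not matter: only the angle between the vectors affects the result.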

Usage example

// Add a new face to the database
val embedding = faceNet.getFaceEmbedding(croppedFaceBitmap)
imagesVectorDB.addFaceImageRecord(
    FaceImageRecord(
        personID = personID,
        personName = "John Doe",
        faceEmbedding = embedding
    )
)

// Search for similar face using HNSW (fast)
val queryEmbedding = faceNet.getFaceEmbedding(unknownFace)
val result = imagesVectorDB.getNearestEmbeddingPersonName(
    embedding = queryEmbedding,
    flatSearch = false
)

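if (result != null) {
    // cosineDistance is private to ImagesVectorDB, so this assumes an equivalent
    // cosine-similarity helper is in scope; higher values mean a closer match
    val similarity = cosineDistance(queryEmbedding, result.faceEmbedding)
    if (similarity > 0.3) {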
        println("Recognized: ${result.personName}")
    } else {
        println("Not recognized")
    }
}

// Search using linear search (precise)
val preciseResult = imagesVectorDB.getNearestEmbeddingPersonName(
    embedding = queryEmbedding,
    flatSearch = true
)

// Delete all faces for a person
imagesVectorDB.removeFaceRecordsWithPersonID(personID)

Performance considerations

HNSW search (flatSearch = false)

  • Faster: roughly O(log n) per query on average
  • Approximate results
  • Suitable for real-time recognition
  • Quality can be tuned with the maxResultCount parameter

Linear search (flatSearch = true)

  • Slower: O(n) complexity
  • Exact results
  • Uses parallel processing with 4 threads
  • Better for smaller databases or when precision is critical
Source: data/ImagesVectorDB.kt:10
