Overview

MerkleTree builds a SHA-256 Merkle tree over the vector IDs of a collection. Only the 32-byte root hash is posted to Solana. Anyone holding that root and an inclusion proof can verify that a specific vector ID is part of the collection without seeing the rest of the collection.

Use Case

Merkle trees enable:
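  • Proof of inclusion: Verify that a vector ID is in a collection
  • Compact verification: Only the 32-byte root hash needs to be stored on-chain
  • Privacy: A proof exposes sibling hashes, not the other vector IDs, so collection contents remain private
  • Integrity: Any change to a vector ID changes the root, so tampering is detectable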

Type Signatures

pub struct MerkleTree {
    leaves: Vec<[u8; 32]>,
    tree: Vec<Vec<[u8; 32]>>,
    original_ids: Vec<String>,
}

pub struct MerkleProof {
    pub vector_id: String,
    pub leaf_hash: [u8; 32],
    pub proof_nodes: Vec<ProofNode>,
    pub root: [u8; 32],
}

pub struct ProofNode {
    pub hash: [u8; 32],
    pub position: NodePosition,
}

pub enum NodePosition {
    Left,
    Right,
}

Constructor

new

Build a Merkle tree from a list of vector IDs.
pub fn new(vector_ids: &[String]) -> Self
Parameters:
  • vector_ids (&[String], required): Array of vector IDs to include in the tree. Order matters for proof generation.
Returns:
  • MerkleTree: A new Merkle tree with computed root hash.
Example:
use solvec_core::merkle::MerkleTree;

let vector_ids = vec![
    "vec_0".to_string(),
    "vec_1".to_string(),
    "vec_2".to_string(),
    "vec_3".to_string(),
];

let tree = MerkleTree::new(&vector_ids);
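
How the levels of such a tree might be assembled can be sketched as follows. This is an illustration, not the library's code: build_levels is a hypothetical helper, the pair-hashing function is supplied by the caller, and pairing an unpaired last node with itself is an assumption, since the documentation does not say how odd-sized levels are handled.

```rust
/// Build every level of a binary Merkle tree, leaf hashes first, root last.
/// `hash_pair` is supplied by the caller (the docs describe SHA-256 over
/// "node:" || left || right, which is not reproduced here).
/// ASSUMPTION: an unpaired node at the end of a level is paired with itself.
fn build_levels<F>(leaves: Vec<[u8; 32]>, hash_pair: F) -> Vec<Vec<[u8; 32]>>
where
    F: Fn(&[u8; 32], &[u8; 32]) -> [u8; 32],
{
    // An empty input produces no levels at all in this sketch.
    if leaves.is_empty() {
        return Vec::new();
    }
    let mut levels = vec![leaves];
    // Keep hashing pairs until the newest level holds a single hash.
    while levels.last().unwrap().len() > 1 {
        let prev = levels.last().unwrap();
        let next: Vec<[u8; 32]> = prev
            .chunks(2)
            .map(|pair| hash_pair(&pair[0], pair.get(1).unwrap_or(&pair[0])))
            .collect();
        levels.push(next);
    }
    levels
}
```

With four leaves this yields three levels holding 4, 2, and 1 hashes; the single hash in the last level plays the role of the root.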

Core Methods

root

Get the Merkle root — this 32-byte value goes on Solana.
pub fn root(&self) -> [u8; 32]
Returns:
  • [u8; 32]: The 32-byte SHA-256 root hash of the tree. This is what gets posted to Solana.
Example:
let root = tree.root();
println!("Root hash: {:?}", root);

root_hex

Get root as hex string (for display and logging).
pub fn root_hex(&self) -> String
Returns:
  • String: Hexadecimal string representation of the root hash.
Example:
let root_hex = tree.root_hex();
println!("Root hash: {}", root_hex);
// Output (illustrative): Root hash: a3f5b8c2d1e4f6a7... (64 hex characters in total for 32 bytes)
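
For readers who receive a root as a hex string and need the raw bytes back (for example to pass to verify), the conversion can be sketched with the standard library alone. The helpers to_hex and from_hex are hypothetical names, and lowercase two-digit encoding is an assumption about what root_hex emits.

```rust
/// Hypothetical helper: render a 32-byte hash as 64 lowercase hex characters.
fn to_hex(hash: &[u8; 32]) -> String {
    hash.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Hypothetical helper: parse a 64-character hex string back into 32 bytes.
/// Returns None if the length is wrong or any character is not a hex digit.
fn from_hex(s: &str) -> Option<[u8; 32]> {
    if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        // Each byte is encoded by two consecutive hex characters.
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}
```

A round trip through both helpers returns the original 32 bytes, and malformed input is rejected instead of being silently truncated.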

generate_proof

Generate a Merkle proof that a given vector ID is in this collection. The proof can be verified by anyone with just the root.
pub fn generate_proof(&self, vector_id: &str) -> Option<MerkleProof>
Parameters:
  • vector_id (&str, required): The vector ID to generate a proof for.
Returns:
  • Option<MerkleProof>: Some(MerkleProof) if the vector ID exists in the tree, None otherwise.
Example:
let proof = tree.generate_proof("vec_2");
if let Some(proof) = proof {
    println!("Generated proof for vec_2 with {} nodes", proof.proof_nodes.len());
} else {
    println!("vec_2 not found in tree");
}
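
One way a proof path could be collected from stored tree levels is sketched below. It is not the library's implementation: collect_siblings is a hypothetical helper, NodePosition is redefined locally, position is assumed to describe which side the sibling sits on, and a node without a sibling is assumed to be paired with itself.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodePosition {
    Left,
    Right,
}

/// Walk from the leaf at `index` up to the root, recording the sibling hash
/// at each level. `levels[0]` holds the leaf hashes and the last level holds
/// only the root; the levels are assumed to be well formed.
/// ASSUMPTION: a node without a sibling is paired with itself.
fn collect_siblings(
    levels: &[Vec<[u8; 32]>],
    mut index: usize,
) -> Option<Vec<([u8; 32], NodePosition)>> {
    // No levels, or an index past the last leaf, means there is nothing to prove.
    if index >= levels.first()?.len() {
        return None;
    }
    let mut path = Vec::new();
    // The root level contributes no sibling, so stop one level short of it.
    for level in &levels[..levels.len() - 1] {
        let (sibling_index, position) = if index % 2 == 0 {
            (index + 1, NodePosition::Right)
        } else {
            (index - 1, NodePosition::Left)
        };
        let sibling = *level.get(sibling_index).unwrap_or(&level[index]);
        path.push((sibling, position));
        index /= 2; // move to the parent's position in the next level
    }
    Some(path)
}
```

The path length equals the number of levels below the root, which is where the O(log N) proof size comes from.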

vector_count

Number of vectors in this tree.
pub fn vector_count(&self) -> usize
Returns:
  • usize: Total number of vector IDs in the tree.
Example:
println!("Tree contains {} vectors", tree.vector_count());

MerkleProof Methods

verify

Verify this proof against a given root. Returns true if the vector_id is provably in the collection with that root.
impl MerkleProof {
    pub fn verify(&self, expected_root: &[u8; 32]) -> bool
}
Parameters:
  • expected_root (&[u8; 32], required): The root hash to verify against (typically retrieved from Solana).
Returns:
  • bool: true if the proof is valid and the vector ID is in the collection, false otherwise.
Example:
let root = tree.root();
let proof = tree.generate_proof("vec_2").unwrap();

assert!(proof.verify(&root)); // Valid proof

let wrong_root = [1u8; 32];
assert!(!proof.verify(&wrong_root)); // Invalid proof
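
Conceptually, verification folds the leaf hash with each proof node in turn and compares the result to the expected root. The sketch below illustrates that fold under stated assumptions: toy_hash_pair is a non-cryptographic stand-in built on the standard library's DefaultHasher (the real tree uses SHA-256), fold_proof is a hypothetical name, and position is assumed to say on which side the sibling hash goes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

#[derive(Clone, Copy)]
enum NodePosition {
    Left,
    Right,
}

struct ProofNode {
    hash: [u8; 32],
    position: NodePosition,
}

/// Stand-in for SHA256("node:" || left || right). NOT cryptographically
/// secure; it only keeps this sketch free of external crates.
fn toy_hash_pair(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    // Fill the 32 output bytes with four 8-byte digests, one per chunk.
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut h = DefaultHasher::new();
        h.write(b"node:");
        h.write(&[i as u8]);
        h.write(left);
        h.write(right);
        chunk.copy_from_slice(&h.finish().to_be_bytes());
    }
    out
}

/// Recompute the root from a leaf hash and its proof nodes, then compare.
/// ASSUMPTION: `position` is the side on which the sibling hash is placed.
fn fold_proof(leaf_hash: [u8; 32], nodes: &[ProofNode], expected_root: &[u8; 32]) -> bool {
    let mut current = leaf_hash;
    for node in nodes {
        current = match node.position {
            NodePosition::Left => toy_hash_pair(&node.hash, &current),
            NodePosition::Right => toy_hash_pair(&current, &node.hash),
        };
    }
    current == *expected_root
}
```

The comparison is made against the root passed in by the caller, which is why a verifier should supply the on-chain root and not rely on the root field carried inside the proof.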

root_hex

Get root as hex string.
impl MerkleProof {
    pub fn root_hex(&self) -> String
}
Returns:
  • String: Hexadecimal string representation of the root hash stored in the proof.

Complete Example: VecLabs Integration

use solvec_core::merkle::MerkleTree;
use solvec_core::hnsw::HNSWIndex;
use solvec_core::types::{DistanceMetric, Vector};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Step 1: Build HNSW index
    let mut index = HNSWIndex::new(16, 200, DistanceMetric::Cosine);

    let vector_ids = vec![
        "user_alex_intro",
        "user_alex_startup",
        "user_bob_intro",
        "session_summary",
    ];

    let vectors = vec![
        vec![0.9, 0.1, 0.0, 0.0],
        vec![0.8, 0.2, 0.1, 0.0],
        vec![0.0, 0.0, 0.9, 0.1],
        vec![0.5, 0.5, 0.0, 0.0],
    ];

    for (id, values) in vector_ids.iter().zip(vectors.iter()) {
        index.insert(Vector::new(*id, values.clone()))?;
    }

    // Step 2: Build Merkle tree from vector IDs
    let ids: Vec<String> = vector_ids.iter().map(|s| s.to_string()).collect();
    let tree = MerkleTree::new(&ids);

    // Step 3: Get root hash (this goes to Solana)
    let root = tree.root();
    let root_hex = tree.root_hex();
    println!("Merkle root: {} (→ Solana)", &root_hex[..16]);

    // Step 4: Generate proof for a specific vector
    let proof = tree.generate_proof("user_alex_intro").unwrap();
    println!("Generated proof with {} nodes", proof.proof_nodes.len());

    // Step 5: Verify proof (anyone can do this with just the root)
    assert!(proof.verify(&root));
    println!("✅ Proof verified: user_alex_intro is in collection");

    // Step 6: Try invalid proof
    let wrong_root = [1u8; 32];
    assert!(!proof.verify(&wrong_root));
    println!("✅ Invalid root rejected");

    // Step 7: Verify all vectors
    for id in &ids {
        let proof = tree.generate_proof(id).unwrap();
        assert!(proof.verify(&root), "Proof failed for {}", id);
    }
    println!("✅ All proofs verified");

    Ok(())
}

Integration Test Example

From the VecLabs full pipeline integration test:
// Build Merkle tree
let ids: Vec<String> = test_vectors.iter().map(|(id, _)| id.to_string()).collect();
let tree = MerkleTree::new(&ids);
let root = tree.root();
let root_hex = tree.root_hex();

println!("✅ Step 3: Merkle root computed: {}", &root_hex[..16]);
assert_ne!(root, [0u8; 32]);

// Generate and verify proof
let proof = tree.generate_proof("user_alex_intro").unwrap();
assert!(proof.verify(&root));
println!("✅ Step 3: Merkle proof verified for 'user_alex_intro'");

Serialization

Both MerkleTree and MerkleProof implement Serialize and Deserialize from serde:
// Assumes serde_json is a dependency and that `proof` and `root` come from the earlier examples.

// Serialize proof to JSON
let proof_json = serde_json::to_string(&proof)?;

// Send proof to client or store in database
let stored_proof: MerkleProof = serde_json::from_str(&proof_json)?;

// Verify after deserialization
assert!(stored_proof.verify(&root));

Internal Implementation

The Merkle tree uses SHA-256 for hashing:
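// Leaf hashing (the "leaf:" prefix separates leaves from internal nodes, which blocks
// second-preimage attacks that pass an internal node off as a leaf)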
hash_leaf(data) = SHA256("leaf:" || data)

// Node hashing
hash_pair(left, right) = SHA256("node:" || left || right)
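
The effect of the two prefixes can be seen by building the byte strings that would be fed to SHA-256. The sketch stops short of hashing so that it needs no external crate; leaf_preimage and node_preimage are hypothetical helper names, not part of the library.

```rust
/// Bytes hashed for a leaf: "leaf:" || data.
fn leaf_preimage(data: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(5 + data.len());
    buf.extend_from_slice(b"leaf:");
    buf.extend_from_slice(data);
    buf
}

/// Bytes hashed for an internal node: "node:" || left || right.
fn node_preimage(left: &[u8; 32], right: &[u8; 32]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(5 + 64);
    buf.extend_from_slice(b"node:");
    buf.extend_from_slice(left);
    buf.extend_from_slice(right);
    buf
}
```

Even if someone submits the 64 bytes left || right as leaf data, the leaf preimage starts with "leaf:" and the node preimage with "node:", so the two inputs to SHA-256 differ and the forged leaf does not hash to the internal node's value.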

Edge Cases

Empty Tree

let tree = MerkleTree::new(&[]);
assert_eq!(tree.root(), [0u8; 32]); // All zeros
assert_eq!(tree.vector_count(), 0);

Single Element

let tree = MerkleTree::new(&["single".to_string()]);
let root = tree.root();
assert_ne!(root, [0u8; 32]); // Valid hash

let proof = tree.generate_proof("single").unwrap();
assert!(proof.verify(&root));

Nonexistent Vector

let tree = MerkleTree::new(&["vec_0".to_string()]);
let proof = tree.generate_proof("nonexistent");
assert!(proof.is_none());

Performance Characteristics

  • Tree construction: O(N) hash operations, where N is the number of vectors
  • Proof generation: O(log N)
  • Proof verification: O(log N)
  • Proof size: O(log N) hashes (32 bytes each)
  • Space: O(N) for storing tree
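
To make the O(log N) proof size concrete, the arithmetic can be sketched as below. proof_len and proof_bytes are hypothetical helpers, and the figure assumes a binary tree in which every leaf sits at full depth, so treat it as an upper bound if the implementation shortens some paths.

```rust
/// Number of sibling hashes in an inclusion proof for `n` leaves,
/// assuming full depth for every leaf: ceil(log2(n)).
fn proof_len(n: usize) -> u32 {
    if n <= 1 {
        0
    } else {
        // Bit length of (n - 1) equals ceil(log2(n)) for n >= 2.
        usize::BITS - (n - 1).leading_zeros()
    }
}

/// Proof payload in bytes, counting 32 bytes per sibling hash.
fn proof_bytes(n: usize) -> usize {
    proof_len(n) as usize * 32
}
```

Under these assumptions a collection of one million vectors needs 20 sibling hashes, or 640 bytes of hash data per proof, while the on-chain footprint stays at 32 bytes.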

Security Properties

Collision Resistance

Uses SHA-256, for which finding two inputs with the same hash is computationally infeasible

Tamper Evidence

Any modification to a vector ID changes the root hash

Compact Proofs

Only O(log N) hashes needed to verify inclusion

Public Verification

Anyone with the root can verify proofs

Use in VecLabs Pipeline

  1. Index Creation: Build HNSW index with vector IDs
  2. Tree Construction: Create Merkle tree from vector IDs
  3. Root Commitment: Post root hash to Solana (32 bytes on-chain)
  4. Proof Generation: Create proofs for query results
  5. Client Verification: Users verify results against on-chain root
