VecLabs is built differently from traditional vector databases. Instead of storing everything on centralized servers, we use a three-layer architecture in which each layer does only what it is best at.

The Three Layers

SolVec SDK (TypeScript / Python)
.upsert()  .query()  .delete()  .verify()
      |           |           |
      v           v           v
 Rust HNSW    Shadow Drive   Solana
 (in memory)  (encrypted     (32-byte
 sub-5ms p99   vectors)       Merkle root)
 Speed Layer  Storage Layer   Trust Layer

Speed Layer

The Rust HNSW index runs in memory with no garbage collector. It delivers consistent sub-5ms p99 latency that Python- and Go-based engines cannot match under concurrent load.

Storage Layer

Shadow Drive stores encrypted vectors on decentralized infrastructure. AES-256-GCM encryption with wallet-derived keys means VecLabs cannot read your data.

Trust Layer

Solana stores a 32-byte Merkle root after every write. The result is an immutable, timestamped, and publicly verifiable proof of your collection’s state.

Why This Design Wins

Rust HNSW Core

The query engine runs entirely in Rust with no garbage collector. This is a deliberate technical decision, not a trendy language choice.
Why Rust matters:
  • Python and Go have garbage collectors that can introduce unpredictable latency spikes under load
  • Rust has no GC, meaning consistent, predictable low-millisecond latency at scale
  • At 100K vectors (384 dimensions), VecLabs delivers 1.9ms p50, 4.3ms p99
  • These numbers don’t degrade under concurrent query load
// Core HNSW parameters from hnsw.rs:42-50
pub struct HNSWIndex {
    m: usize,              // max connections per node (default: 16)
    m_max_0: usize,        // max at layer 0 (default: 32)
    ef_construction: usize, // build beam width (default: 200)
    ef_search: usize,       // query beam width (default: 50)
    ml: f64,               // level multiplier
    // ...
}
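The struct above does not say how ml is chosen. In the original HNSW paper the level multiplier is usually set to 1/ln(M), and a new node's top layer is drawn as floor(-ln(u) * ml) for a uniform sample u in (0, 1]. The following is a minimal sketch under that assumption; the function names are hypothetical and are not taken from hnsw.rs.

```rust
/// Level multiplier as recommended in the HNSW paper: 1 / ln(M).
/// Whether hnsw.rs uses exactly this value is an assumption.
/// `m` must be greater than 1.
pub fn level_multiplier(m: usize) -> f64 {
    1.0 / (m as f64).ln()
}

/// Top layer for a new node, given a uniform sample `u` in (0, 1].
/// Smaller `u` gives a higher layer, so layers thin out exponentially.
pub fn assign_level(u: f64, ml: f64) -> usize {
    (-u.ln() * ml).floor() as usize
}
```

With M = 16 this rule leaves about 15 of every 16 nodes on layer 0, since a node reaches layer 1 or higher only when u is at most 1/16.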
The HNSW implementation in solvec-core/src/hnsw.rs includes:
  • 31 unit tests covering insert, delete, update, query operations
  • Full serialization support for persistence
  • Three distance metrics: cosine, euclidean, dot product
  • Bidirectional graph connections with automatic pruning
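For reference, the three distance metrics listed above can be written in a few lines each. This is an illustrative sketch using only the standard library, not the code in hnsw.rs; the function names are hypothetical, and both slices are assumed to have the same length.

```rust
/// Dot product of two equal-length vectors.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance between two equal-length vectors.
pub fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine similarity; returns 0.0 if either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm = dot(a, a).sqrt() * dot(b, b).sqrt();
    if norm == 0.0 {
        0.0
    } else {
        dot(a, b) / norm
    }
}
```

Note that cosine and dot product are similarities (higher is closer) while Euclidean is a distance (lower is closer), so an index has to rank them in opposite directions.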
Performance Benchmark: On Apple M2 with 100K vectors (384 dims), VecLabs achieves p50=1.9ms, p95=2.8ms, p99=4.3ms. Full methodology in /benchmarks/COMPARISON.md.
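To make the p50/p95/p99 figures concrete, here is one common way to compute percentiles from raw latency samples, the nearest-rank method. Whether the benchmark in /benchmarks/COMPARISON.md uses this exact definition is an assumption, and the function name is hypothetical.

```rust
/// Nearest-rank percentile: the smallest sample such that at least
/// `p` percent of all samples are less than or equal to it.
/// Returns None for an empty slice or a `p` outside (0, 100].
pub fn percentile(samples_ms: &[f64], p: f64) -> Option<f64> {
    if samples_ms.is_empty() || !(p > 0.0 && p <= 100.0) {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // 1-based rank into the sorted samples, rounded up.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

A p99 of 4.3ms then means that 99% of the measured queries completed in 4.3ms or less.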

Shadow Drive Storage

Vectors are encrypted client-side before leaving the SDK. The encryption key is derived from your Solana wallet, meaning:
  1. VecLabs cannot decrypt your data — we never see the plaintext vectors
  2. Storage cost is ~$0.000039 per MB per epoch — approximately 88% cheaper than Pinecone
  3. No cloud markup — you’re paying for decentralized storage, not our infrastructure overhead
From encryption.rs:12-35, vectors are encrypted using AES-256-GCM:
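// Imports assume the aes-gcm 0.10 crate.
use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm, Key,
};

// Assumes SolVecError implements From<aes_gcm::Error> so that `?` compiles.
pub fn encrypt_vectors(vectors: &[Vec<f32>], key: &[u8; 32]) -> Result<Vec<u8>, SolVecError> {
    let cipher = Aes256Gcm::new(Key::<Aes256Gcm>::from_slice(key));
    // Fresh random 96-bit nonce per call; never reuse a nonce with the same key.
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng);

    // Header values (assumed here to be u64; all vectors share one dimension).
    let num_vectors = vectors.len() as u64;
    let dim = vectors.first().map_or(0, |v| v.len()) as u64;

    // Serialize: num_vectors (8 bytes) + dimension (8 bytes) + vector data
    let mut plaintext = Vec::new();
    plaintext.extend_from_slice(&num_vectors.to_le_bytes());
    plaintext.extend_from_slice(&dim.to_le_bytes());
    for v in vectors {
        for &f in v {
            plaintext.extend_from_slice(&f.to_le_bytes());
        }
    }

    let ciphertext = cipher.encrypt(&nonce, plaintext.as_ref())?;
    // Return: nonce (12 bytes) + ciphertext
    let mut out = nonce.to_vec();
    out.extend_from_slice(&ciphertext);
    Ok(out)
}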
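The byte layout described in the comments above (an 8-byte count, an 8-byte dimension, then little-endian f32 values, with a 12-byte nonce in front of the ciphertext) can be exercised without any cryptography. The sketch below is a standard-library-only round trip for that layout; the function names are hypothetical, and the choice of u64 for the two header fields is an assumption based on the "8 bytes" comments.

```rust
/// Plaintext layout: num_vectors (u64 LE) + dimension (u64 LE) + f32 values (LE).
/// Assumes every vector has the same dimension as the first one.
pub fn serialize_vectors(vectors: &[Vec<f32>]) -> Vec<u8> {
    let dim = vectors.first().map_or(0, |v| v.len());
    let mut out = Vec::with_capacity(16 + vectors.len() * dim * 4);
    out.extend_from_slice(&(vectors.len() as u64).to_le_bytes());
    out.extend_from_slice(&(dim as u64).to_le_bytes());
    for v in vectors {
        for &f in v {
            out.extend_from_slice(&f.to_le_bytes());
        }
    }
    out
}

/// Reads a little-endian u64 at byte offset `at`, if present.
fn read_u64(bytes: &[u8], at: usize) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes.get(at..at + 8)?);
    Some(u64::from_le_bytes(buf))
}

/// Inverse of `serialize_vectors`. Returns None on a truncated or
/// inconsistent buffer; zero-dimension vectors are rejected.
pub fn deserialize_vectors(bytes: &[u8]) -> Option<Vec<Vec<f32>>> {
    let n = read_u64(bytes, 0)? as usize;
    let dim = read_u64(bytes, 8)? as usize;
    let body = bytes.get(16..)?;
    if body.len() != n.checked_mul(dim)?.checked_mul(4)? {
        return None;
    }
    if dim == 0 {
        return if n == 0 { Some(Vec::new()) } else { None };
    }
    Some(
        body.chunks_exact(dim * 4)
            .map(|row| {
                row.chunks_exact(4)
                    .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
                    .collect()
            })
            .collect(),
    )
}

/// Splits an encrypted blob into (nonce, ciphertext), following the
/// "nonce (12 bytes) + ciphertext" layout.
pub fn split_nonce(blob: &[u8]) -> Option<(&[u8], &[u8])> {
    if blob.len() < 12 {
        None
    } else {
        Some(blob.split_at(12))
    }
}
```

A decrypt path would call split_nonce first, pass both halves to AES-256-GCM, and hand the recovered plaintext to deserialize_vectors.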

Solana Trust Layer

Blockchain is not a storage layer; it’s a trust layer. You don’t put 1,536 float32 values on Solana. You put a 32-byte Merkle root on Solana.
After every write operation:
  1. The SDK builds a Merkle tree from all vector IDs in the collection
  2. It computes the SHA-256 Merkle root (32 bytes)
  3. It posts the root to Solana via an Anchor program
  4. The transaction finalizes in ~400ms at a cost of ~$0.00025
The result: immutable, timestamped, publicly verifiable proof that a specific set of data existed at a specific point in time, owned by a specific wallet.
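The root computation in steps 1 and 2 can be sketched as follows. To keep the example free of dependencies, toy_hash is a stand-in based on FNV-1a and is NOT SHA-256; a real implementation would pass a SHA-256 function instead. The pairing rule for an odd node (duplicate it) and the requirement that IDs arrive in a canonical order are assumptions, since the SDK's exact conventions are not documented here. All names are hypothetical.

```rust
type Digest = [u8; 32];

/// NOT SHA-256. A dependency-free FNV-1a stand-in (four seeded lanes)
/// so the sketch compiles on its own; use real SHA-256 in practice.
fn toy_hash(data: &[u8]) -> Digest {
    let mut out = [0u8; 32];
    for (lane, chunk) in out.chunks_mut(8).enumerate() {
        let mut h: u64 = 0xcbf29ce484222325 ^ (lane as u64);
        for &b in data {
            h = (h ^ b as u64).wrapping_mul(0x100000001b3);
        }
        chunk.copy_from_slice(&h.to_le_bytes());
    }
    out
}

/// Merkle root over vector IDs. Leaves are hashed IDs and each parent
/// is hash(left || right). An unpaired node is hashed with itself.
/// Callers should pass IDs in a canonical order (for example sorted),
/// because the root depends on leaf order.
pub fn merkle_root(ids: &[&str], hash: fn(&[u8]) -> Digest) -> Option<Digest> {
    if ids.is_empty() {
        return None;
    }
    let mut level: Vec<Digest> = ids.iter().map(|id| hash(id.as_bytes())).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let right = pair.get(1).unwrap_or(&pair[0]);
                let mut buf = [0u8; 64];
                buf[..32].copy_from_slice(&pair[0]);
                buf[32..].copy_from_slice(right);
                hash(&buf)
            })
            .collect();
    }
    Some(level[0])
}
```

However many IDs go in, the output is always 32 bytes, which is why only the root needs to be posted on-chain.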
Shadow Drive persistence is currently in progress. Vectors are stored in-memory during alpha. The Solana program is live on devnet at 8xjQ2XrdhR4JkGAdTEB7i34DBkbrLRkcgchKjN1Vn5nP.

Data Flow

Upsert Operation

Query Operation

Queries run entirely against the in-memory Rust HNSW index. No network calls. No decryption overhead. This is why latency stays sub-5ms even at 100K vectors.

Verification

Anyone can verify the collection state without trusting VecLabs. The proof is cryptographic and the root is permanently on-chain.
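One way such a check can work is a Merkle inclusion proof: a verifier hashes the vector ID, folds in the sibling digests along the path, and compares the result with the root read from the chain. The sketch below illustrates that idea only. The proof format and all names are hypothetical, and toy_hash is an FNV-1a stand-in that is NOT SHA-256, used so the example has no dependencies.

```rust
type Digest = [u8; 32];

/// NOT SHA-256. Dependency-free FNV-1a stand-in; use real SHA-256 in practice.
fn toy_hash(data: &[u8]) -> Digest {
    let mut out = [0u8; 32];
    for (lane, chunk) in out.chunks_mut(8).enumerate() {
        let mut h: u64 = 0xcbf29ce484222325 ^ (lane as u64);
        for &b in data {
            h = (h ^ b as u64).wrapping_mul(0x100000001b3);
        }
        chunk.copy_from_slice(&h.to_le_bytes());
    }
    out
}

/// Parent digest: hash(left || right).
fn hash_pair(left: &Digest, right: &Digest, hash: fn(&[u8]) -> Digest) -> Digest {
    let mut buf = [0u8; 64];
    buf[..32].copy_from_slice(left);
    buf[32..].copy_from_slice(right);
    hash(&buf)
}

/// One step of an inclusion proof: the sibling digest and whether it
/// sits to the left of the running hash.
pub struct ProofStep {
    pub sibling: Digest,
    pub sibling_is_left: bool,
}

/// Recomputes the root from a vector ID and its proof path, then
/// compares it with the root read from the chain.
pub fn verify_inclusion(
    id: &str,
    proof: &[ProofStep],
    onchain_root: &Digest,
    hash: fn(&[u8]) -> Digest,
) -> bool {
    let mut acc = hash(id.as_bytes());
    for step in proof {
        acc = if step.sibling_is_left {
            hash_pair(&step.sibling, &acc, hash)
        } else {
            hash_pair(&acc, &step.sibling, hash)
        };
    }
    &acc == onchain_root
}
```

The verifier needs only the ID, a proof whose length grows with the logarithm of the collection size, and the public root, so no trust in the party serving the proof is required.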

Cost Comparison

| Cost Component | VecLabs | Pinecone s1 |
| --- | --- | --- |
| 1M vectors storage | ~$0.04/month (Shadow Drive) | $70/month |
| Merkle root updates | ~$0.00025/tx (Solana) | Included in pod cost |
| Query compute | Rust binary on your infra | Pinecone cloud |
| Total (1M vectors) | ~$8-20/month | $70/month |
By the totals above (~$8-20 versus $70 per month), VecLabs is roughly 71-88% cheaper because we don’t mark up cloud compute. You’re paying for decentralized storage and on-chain proofs, not our server racks.
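The savings implied by the table totals can be checked with a one-line formula. The helper name is hypothetical; the inputs are the monthly totals from the table.

```rust
/// Percentage saved relative to a baseline monthly cost.
pub fn savings_pct(cost: f64, baseline: f64) -> f64 {
    (1.0 - cost / baseline) * 100.0
}
```

Calling savings_pct(8.0, 70.0) and savings_pct(20.0, 70.0) gives the upper and lower ends of the range for the $8-20 total against Pinecone's $70.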

Component Status

Production Ready

  • Rust HNSW core (31 tests passing)
  • AES-256-GCM encryption
  • Merkle tree + proof generation
  • Solana Anchor program (devnet)
  • TypeScript SDK (alpha)
  • Python SDK (alpha)

In Progress

  • Shadow Drive persistence
  • WASM Rust bridge
  • Agent memory demo
  • LangChain integration
  • Mainnet deployment

Why This Architecture Matters

  • For AI engineers: you get a Pinecone-compatible API with better performance and lower cost.
  • For enterprise teams: you get cryptographic proof of data provenance, which is critical for healthcare, legal, and financial AI where “what did the agent know and when” is a compliance requirement.
  • For decentralization: there is no single point of failure. Your data isn’t on our servers, and the history is on-chain. If VecLabs disappeared tomorrow, your collections would remain verifiable and accessible.

Next: Dive into HNSW

Learn how the Hierarchical Navigable Small World algorithm delivers sub-5ms queries.
