InterviewGuide uses pgvector (a PostgreSQL extension) for vector storage and similarity search. This enables RAG (Retrieval-Augmented Generation) for the knowledge base feature.
When schema initialization is enabled (`spring.ai.vectorstore.pgvector.initialize-schema`), Spring AI creates the `vector_store` table automatically:
```sql
CREATE TABLE vector_store (
    id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content   TEXT,          -- original text chunk
    metadata  JSON,          -- metadata (kb_id, filename, etc.)
    embedding vector(1024)   -- 1024-dim embedding vector
);

-- HNSW index for fast similarity search
CREATE INDEX vector_store_embedding_idx
    ON vector_store USING hnsw (embedding vector_cosine_ops);
```
In production, set `initialize-schema: false` and manage schema migrations manually to avoid accidental data loss.
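A production `application.yml` might look like the sketch below. The property names follow Spring AI's pgvector starter; the exact values (dimensions, index type) are assumptions based on the schema above:

```yaml
spring:
  ai:
    vectorstore:
      pgvector:
        initialize-schema: false      # schema is managed by migration tooling instead
        index-type: HNSW
        distance-type: COSINE_DISTANCE
        dimensions: 1024              # must match the embedding model's output size
```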
Large documents are split into smaller chunks for better retrieval:
```java
// modules/knowledgebase/service/KnowledgeBaseVectorService.java:23
@Service
public class KnowledgeBaseVectorService {

    private static final int MAX_BATCH_SIZE = 10; // Alibaba Cloud limit

    private final VectorStore vectorStore;
    private final TextSplitter textSplitter;
    private final VectorRepository vectorRepository;

    public KnowledgeBaseVectorService(VectorStore vectorStore,
                                      VectorRepository vectorRepository) {
        this.vectorStore = vectorStore;
        this.vectorRepository = vectorRepository;
        // TokenTextSplitter: ~500 tokens per chunk, 50-token overlap
        this.textSplitter = new TokenTextSplitter();
    }
}
```
Why chunk overlap? Overlapping chunks ensure context isn’t lost at chunk boundaries. A 50-token overlap means the last 50 tokens of chunk N appear at the start of chunk N+1.
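The overlap mechanics can be sketched independently of Spring AI's `TokenTextSplitter` (the class, chunk size, and overlap below are illustrative, not the framework's implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: split a token list into overlapping chunks.
// With chunkSize = 5 and overlap = 2, the last 2 tokens of chunk N
// reappear at the start of chunk N+1.
public class OverlapChunker {
    public static List<List<String>> chunk(List<String> tokens, int chunkSize, int overlap) {
        List<List<String>> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // advance by chunkSize minus the overlap
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + chunkSize, tokens.size());
            chunks.add(new ArrayList<>(tokens.subList(start, end)));
            if (end == tokens.size()) break; // last chunk reached
        }
        return chunks;
    }
}
```

A boundary sentence that would otherwise be cut in half at a chunk edge is thus fully contained in at least one chunk, which keeps its embedding retrievable.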
```java
// modules/knowledgebase/repository/VectorRepository.java:16
@Slf4j
@Repository
public class VectorRepository {

    private final JdbcTemplate jdbcTemplate;

    public VectorRepository(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Transactional(rollbackFor = Exception.class)
    public int deleteByKnowledgeBaseId(Long knowledgeBaseId) {
        log.info("Deleting vector data for kbId={}", knowledgeBaseId);
        /*
         * PostgreSQL JSON query:
         * - metadata->>'kb_id' extracts kb_id as text
         * - supports both string and numeric kb_id storage
         */
        String sql = """
                DELETE FROM vector_store
                WHERE metadata->>'kb_id' = ?
                   OR (metadata->>'kb_id_long' IS NOT NULL
                       AND (metadata->>'kb_id_long')::bigint = ?)
                """;
        try {
            int deletedRows = jdbcTemplate.update(sql,
                    knowledgeBaseId.toString(), // string match
                    knowledgeBaseId);           // numeric match
            if (deletedRows > 0) {
                log.info("Deleted {} vector rows for kbId={}", deletedRows, knowledgeBaseId);
            } else {
                log.info("No vector data found for kbId={}", knowledgeBaseId);
            }
            return deletedRows;
        } catch (Exception e) {
            log.error("Failed to delete vectors: kbId={}, error={}", knowledgeBaseId, e.getMessage());
            throw new RuntimeException("Failed to delete vector data", e);
        }
    }
}
```
Metadata Type Handling: The query checks both `kb_id` (string) and `kb_id_long` (numeric) to handle different storage formats. This ensures compatibility across schema versions.
The HNSW (Hierarchical Navigable Small World) index balances speed and accuracy:
```sql
-- Default HNSW parameters (configured by Spring AI)
CREATE INDEX vector_store_embedding_idx
    ON vector_store USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
```
- `m`: number of connections per layer (higher = more accurate, slower)
- `ef_construction`: search width during index build (higher = better quality, slower build)
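Search-time accuracy has its own knob in pgvector: the `hnsw.ef_search` session parameter (default 40) controls how many candidates are examined per query. This is set on the PostgreSQL connection, not by Spring AI; the value below is just an illustration:

```sql
-- Raise the HNSW search width for better recall at some latency cost
SET hnsw.ef_search = 100;
```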
Always batch embedding API calls to reduce latency. Alibaba Cloud text-embedding-v3 supports up to 10 texts per request.
```java
// Good: batch of 10
vectorStore.add(chunks.subList(0, 10));

// Bad: one at a time
for (Document chunk : chunks) {
    vectorStore.add(List.of(chunk)); // 10x slower!
}
```
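Since a document can yield more than `MAX_BATCH_SIZE` chunks, the full list has to be partitioned before calling the vector store. A minimal, framework-free sketch of that partitioning (the `Batching` helper is hypothetical, not part of the project):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split a list into batches of at most batchSize elements,
// so each batch fits within the embedding API's per-request limit.
public class Batching {
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }
}
```

Each resulting batch would then be passed to `vectorStore.add(batch)`, giving ceil(n / 10) API calls instead of n.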