Overview
The database serves two critical functions:
- Relational Storage - User data, resumes, interview sessions, and metadata
- Vector Storage - Document embeddings for RAG (Retrieval-Augmented Generation) knowledge base
Quick Setup
Using Docker (Recommended)
The Docker Compose setup handles everything automatically:
- PostgreSQL 16 with pgvector extension
- Database named interview_guide
- Automatic extension initialization
- Data persistence in postgres_data volume
Manual Setup
If you’re running PostgreSQL locally:
- Install the pgvector extension. Follow the pgvector installation guide for your platform.
Connection Configuration
Spring Boot Properties
Database connection is configured in application.yml using environment variable substitution:
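As a minimal sketch of what such a configuration might look like, the fragment below builds the datasource URL from environment variables. Only POSTGRES_HOST is named elsewhere in this guide; the other variable names (POSTGRES_PORT, POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD) and the fallback values are assumptions for illustration, so check them against your own application.yml.

```yaml
spring:
  datasource:
    # ${NAME:default} resolves to the default when the variable is unset
    url: jdbc:postgresql://${POSTGRES_HOST:localhost}:${POSTGRES_PORT:5432}/${POSTGRES_DB:interview_guide}
    username: ${POSTGRES_USER:postgres}   # assumed variable name and default
    password: ${POSTGRES_PASSWORD}        # assumed variable name; no default on purpose
    driver-class-name: org.postgresql.Driver
```

Leaving the password without a default makes a missing variable fail at startup instead of silently falling back to a known value.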
Environment Variables
PostgreSQL server hostname or IP address.
- Docker: Use the service name postgres for container-to-container communication.
- Local: Use localhost or 127.0.0.1.
PostgreSQL server port number.
Database name. Must exist before starting the application.
Database username for authentication.
Database password for authentication.
JPA and Hibernate Configuration
Schema Management
The ddl-auto setting controls how Hibernate manages the database schema:
Schema generation strategy. Options:
- create - Drop and recreate tables on startup. Use only for first run or development.
- update - Update schema to match entities without dropping data. Recommended for production.
- validate - Validate schema matches entities, don’t make changes. Safest for production.
- none - Do nothing. Requires manual schema management.
Log all SQL statements to the console. Useful for debugging but verbose in production. Set to true during development.
SQL dialect for PostgreSQL-specific features. This enables PostgreSQL optimizations and native data types.
Pretty-print SQL statements in logs when show-sql is enabled.
Development vs Production
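One way to keep the two environments apart is a multi-document application.yml with one block per Spring profile. The sketch below is illustrative only: the profile names dev and prod are assumptions, and the values simply follow the guidance above (verbose SQL logging in development, validate in production).

```yaml
# Development: evolve the schema automatically and log formatted SQL
spring:
  config:
    activate:
      on-profile: dev          # assumed profile name
  jpa:
    hibernate:
      ddl-auto: update
    show-sql: true
    properties:
      hibernate:
        format_sql: true
        dialect: org.hibernate.dialect.PostgreSQLDialect
---
# Production: only validate the schema and keep SQL out of the logs
spring:
  config:
    activate:
      on-profile: prod         # assumed profile name
  jpa:
    hibernate:
      ddl-auto: validate
    show-sql: false
    properties:
      hibernate:
        format_sql: false
        dialect: org.hibernate.dialect.PostgreSQLDialect
```

The active profile can then be selected with the standard SPRING_PROFILES_ACTIVE environment variable.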
Vector Store Configuration
Spring AI’s pgvector integration is configured for the RAG knowledge base:
Vector Store Properties
Vector index type for similarity search. Options:
- HNSW - Hierarchical Navigable Small World (recommended). Fast approximate nearest neighbor search.
- IVFFlat - Inverted File Flat index. Good for smaller datasets.
Distance metric for vector similarity. Options:
- COSINE_DISTANCE - Cosine similarity (recommended for text embeddings)
- EUCLIDEAN_DISTANCE - L2 distance
- NEGATIVE_INNER_PRODUCT - Dot product similarity (Spring AI names this option by pgvector's negative inner product operator)
Vector dimensionality. Must match the embedding model output. For Aliyun’s text-embedding-v3 model, this is 1024 dimensions.
Automatically create vector store tables on startup.
- Development: Set to true for automatic setup.
- Production: Set to false and manage schema manually to prevent unexpected changes.
Drop and recreate vector store table on startup.
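Taken together, the properties described above could be set as in the following sketch. The keys are Spring AI's pgvector properties under spring.ai.vectorstore.pgvector; the values are the recommended or development-oriented choices from this section, not necessarily what the project ships with.

```yaml
spring:
  ai:
    vectorstore:
      pgvector:
        index-type: HNSW                 # approximate nearest neighbor index
        distance-type: COSINE_DISTANCE   # recommended for text embeddings
        dimensions: 1024                 # must match the embedding model (text-embedding-v3)
        initialize-schema: true          # development convenience; set to false in production
        remove-existing-vector-store-table: false   # true would drop stored embeddings on startup
```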
Connection Pooling
Spring Boot uses HikariCP for connection pooling. The default configuration from application.yml uses Redisson settings, but you can also tune HikariCP:
Connection Pool Tuning
Maximum number of connections in the pool.
Formula: connections = ((core_count * 2) + effective_spindle_count)
For most applications, 10-20 connections is sufficient.
Minimum number of idle connections maintained. Set to same as maximum-pool-size for fixed-size pools.
Maximum milliseconds to wait for a connection from the pool.
Maximum milliseconds a connection can sit idle (5 minutes).
Maximum lifetime of a connection in the pool (20 minutes). Should be shorter than database connection timeout.
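The settings above map onto Spring Boot's spring.datasource.hikari properties. The following is a sketch under stated assumptions: the pool sizes and the 30-second connection timeout are illustrative values, while the idle timeout and maximum lifetime use the 5-minute and 20-minute figures given above.

```yaml
spring:
  datasource:
    hikari:
      # Formula example: 4 cores and 1 spindle give (4 * 2) + 1 = 9, rounded up to 10
      maximum-pool-size: 10
      minimum-idle: 5            # assumed value; below the maximum so idle connections can be retired
      connection-timeout: 30000  # assumed value: wait up to 30 seconds for a connection
      idle-timeout: 300000       # 5 minutes
      max-lifetime: 1200000      # 20 minutes
```

Note that HikariCP only applies idle-timeout when minimum-idle is lower than maximum-pool-size, so a fixed-size pool effectively ignores it.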
Database Initialization
The Docker setup includes an initialization script that runs on first startup. The script is mounted at /docker-entrypoint-initdb.d/init.sql in the container.
Docker Compose Configuration
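A Compose service consistent with the setup described in this guide might look like the sketch below. The image, database name, service name, volume name, and mount target come from this page; the credentials, the published port mapping, and the host-side path of init.sql are placeholders and assumptions.

```yaml
services:
  postgres:                              # service name doubles as the hostname for other containers
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: interview_guide
      POSTGRES_USER: postgres            # assumed credentials
      POSTGRES_PASSWORD: change-me       # placeholder; supply a real secret
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      # host path is an assumption; scripts here run only when the data directory is empty
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro

volumes:
  postgres_data:
```

Because the initialization script runs only against an empty data directory, changing it later requires removing the postgres_data volume or applying the SQL by hand.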
Manual SQL Setup
If you’re not using Docker, run these SQL commands:
Production Checklist
Troubleshooting
Error: relation 'vector_store' does not exist
The vector store table hasn’t been created. Solutions:
- Set spring.ai.vectorstore.pgvector.initialize-schema: true
- Or manually create the table using Spring AI’s schema
- Check that pgvector extension is installed: SELECT * FROM pg_extension WHERE extname = 'vector';
Error: extension 'vector' does not exist
The pgvector extension isn’t installed:
- Use the pgvector/pgvector:pg16 Docker image, OR
- Install pgvector manually following the official guide
- Run CREATE EXTENSION IF NOT EXISTS vector; in your database
Connection refused / timeout
PostgreSQL isn’t reachable:
- Verify PostgreSQL is running: docker ps or systemctl status postgresql
- Check firewall rules allow port 5432
- Verify POSTGRES_HOST matches your setup (use postgres in Docker, localhost locally)
- Check database logs for startup errors
All data deleted on restart
You’re using ddl-auto: create:
- Change to ddl-auto: update in application.yml
- Restart the application
- Consider using validate in production
Vector search returns no results
Possible causes:
- Documents haven’t been embedded yet - upload files to knowledge base
- Wrong embedding dimensions - verify dimensions: 1024 matches your model
- Distance threshold too strict - check min-score settings in RAG configuration
- Index not built - allow time for HNSW index creation on large datasets
See Also
- Environment Variables - Database connection environment variables
- pgvector Documentation - Official pgvector extension guide
- Spring Data JPA - JPA configuration reference
- Spring AI Vector Stores - Spring AI vector database integration
