Overview
Vector stores enable semantic search by storing text embeddings as VECTOR columns. The server supports:
- Document Loading: Ingest documents from local filesystem or OCI Object Storage
- Vector Embeddings: Automatically generate embeddings using ragify_column
- RAG Queries: Natural language queries that retrieve relevant context and generate answers
- Dual Storage: Default vector store or custom InnoDB tables
Loading Documents
From Local Filesystem (MySQL AI)
For MySQL AI connections, load documents from the secure_file_priv directory.
1. List Available Files
list_vector_store_files_local
2. Load Documents
load_vector_store_local
Parameters:
- connection_id: Database connection
- file_path: Path within the secure_file_priv directory
Supported formats:
- PDF documents
- Text files (.txt, .md)
- HTML files
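As a rough illustration of these constraints, the sketch below checks that a candidate path stays inside the secure_file_priv directory and has a supported extension before you pass it as file_path. The helper name, the directory value, and the exact extension set (.pdf, .txt, .md, .html, .htm) are assumptions made for this example; the server's own validation may differ.

```python
import posixpath

# Assumed extension set, derived from the formats listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md", ".html", ".htm"}


def is_loadable(secure_file_priv, file_path):
    """Return True if file_path stays inside secure_file_priv and has a supported extension."""
    base = posixpath.normpath(secure_file_priv)
    # Resolve relative paths against the base directory and collapse any ".." parts.
    candidate = posixpath.normpath(posixpath.join(base, file_path))
    inside = candidate.startswith(base.rstrip("/") + "/")
    extension = posixpath.splitext(candidate)[1].lower()
    return inside and extension in SUPPORTED_EXTENSIONS


# Hypothetical directory and file names.
print(is_loadable("/var/lib/mysql-files", "manuals/guide.pdf"))
print(is_loadable("/var/lib/mysql-files", "../etc/passwd"))
```

A check like this only catches obvious mistakes early. The database still enforces secure_file_priv itself.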
From OCI Object Storage (MySQL HeatWave)
For MySQL HeatWave connections, load documents from OCI Object Storage buckets.
1. List Buckets and Objects
object_storage_list_buckets, object_storage_list_objects
2. Load Documents with Prefix Filter
load_vector_store_oci
Parameters:
- connection_id: HeatWave database connection
- namespace: OCI Object Storage namespace
- bucket_name: Source bucket name
- document_prefix: Prefix to filter objects (e.g., “manuals/”, “docs/2024/”)
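To preview which objects a given document_prefix would select, the sketch below applies plain string-prefix matching to a list of object names. It assumes the tool matches prefixes the way Object Storage listings do. The object names and argument values are hypothetical.

```python
def filter_by_prefix(object_names, document_prefix):
    """Keep only the object names that start with document_prefix."""
    return [name for name in object_names if name.startswith(document_prefix)]


# Hypothetical object names, e.g. as reported by object_storage_list_objects.
objects = [
    "manuals/install.pdf",
    "manuals/upgrade.pdf",
    "docs/2024/notes.md",
    "logo.png",
]

# Hypothetical argument values for load_vector_store_oci.
arguments = {
    "connection_id": "heatwave-prod",
    "namespace": "my-namespace",
    "bucket_name": "knowledge-base",
    "document_prefix": "manuals/",
}

print(filter_by_prefix(objects, arguments["document_prefix"]))
```

Including the trailing slash in the prefix keeps similarly named objects such as "manuals-old/..." out of the selection.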
Requires a valid OCI configuration file (~/.oci/config). See Configuration for details.
Creating Vector Embeddings
Using ragify_column
Convert text columns into vector embeddings for semantic search.
ragify_column
Parameters:
- connection_id: Database connection
- table_name: Target table
- input_column_name: Source text column
- embedding_column_name: Target VECTOR column (created if it doesn’t exist)
- Creates the embedding column if it doesn’t exist (as VECTOR type)
- Generates embeddings for all rows in the text column
- Populates the VECTOR column with embeddings
- Returns the number of rows processed
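The four steps listed above can be mimicked on an in-memory table to make the data flow concrete. This is only a sketch: toy_embedding is a stand-in hash-based vector, not the embedding model the server uses, and the table contents and column names are hypothetical.

```python
import hashlib


def toy_embedding(text, dimensions=8):
    """Stand-in embedding: hash each word into a fixed-size count vector.

    This is NOT the model the server uses; it only makes the sketch runnable.
    """
    vector = [0.0] * dimensions
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dimensions] += 1.0
    return vector


def ragify_rows(rows, input_column_name, embedding_column_name):
    """Mimic ragify_column on a list of dicts and return the rows processed."""
    processed = 0
    for row in rows:
        # Assigning the key creates the embedding column if it is missing
        # and populates it from the source text column.
        row[embedding_column_name] = toy_embedding(row[input_column_name])
        processed += 1
    return processed


# Hypothetical table contents and column names.
articles = [
    {"id": 1, "body": "Reset your password from the account page"},
    {"id": 2, "body": "Invoices are emailed on the first of each month"},
]
count = ragify_rows(articles, "body", "body_embedding")
print(count, len(articles[0]["body_embedding"]))
```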
The ragify_column tool works on both MySQL AI and HeatWave connections with identical behavior.
Performing RAG Queries
Query Default Vector Store
Use the default vector store created by document loading tools.
ask_ml_rag_vector_store
Parameters:
- connection_id: Database connection
- question: Natural language question
- context_size (optional): Number of context chunks to retrieve
- Converts your question into a vector embedding
- Performs similarity search in the vector store
- Retrieves the top N most relevant text segments
- Generates an answer using retrieved context and a language model
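The retrieval part of this pipeline (embed the question, score segments by similarity, keep the top N) can be sketched in a few lines. The bag-of-words embedding below is a deliberately simple stand-in for the real embedding model, the segments are hypothetical, and the generation step is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: a bag-of-words count vector (not the server's model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, segments, context_size=3):
    """Embed the question, score every segment, and keep the top context_size."""
    question_vector = embed(question)
    ranked = sorted(
        segments,
        key=lambda segment: cosine_similarity(question_vector, embed(segment)),
        reverse=True,
    )
    return ranked[:context_size]


# Hypothetical stored segments.
segments = [
    "To reset your password open the account page",
    "Invoices are emailed on the first of each month",
    "Password rules require twelve characters",
]
for segment in retrieve("How do I reset my password?", segments, context_size=2):
    print(segment)
```

Here context_size plays the role of N: raising it returns more segments, at the cost of including weaker matches.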
Query Custom InnoDB Tables
Perform RAG queries on specific InnoDB tables with custom columns.
ask_ml_rag_innodb
Parameters:
- connection_id: Database connection
- question: Natural language question
- segment_col: Column containing text segments
- embedding_col: Column containing VECTOR embeddings
- context_size (optional): Number of context chunks
- You have multiple tables with embeddings
- You want to restrict search to a specific table
- You need fine-grained control over which embeddings to search
- MySQL AI: Vector search executes within the database service instance
- HeatWave: Table is loaded into the HeatWave cluster for distributed processing (much faster for large datasets)
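A small helper can assemble the ask_ml_rag_innodb arguments and leave out context_size when it is not set. The parameter names come from the list above. The helper name, the value types, the validation rules, and the example values are assumptions, and how the arguments are sent depends on your client.

```python
def build_innodb_rag_arguments(connection_id, question, segment_col,
                               embedding_col, context_size=None):
    """Collect the ask_ml_rag_innodb parameters into one dict."""
    if not question.strip():
        raise ValueError("question must not be empty")
    arguments = {
        "connection_id": connection_id,
        "question": question,
        "segment_col": segment_col,
        "embedding_col": embedding_col,
    }
    # context_size is optional, so it is only included when the caller sets it.
    if context_size is not None:
        if context_size < 1:
            raise ValueError("context_size must be a positive integer")
        arguments["context_size"] = context_size
    return arguments


# Hypothetical connection and column names.
args = build_innodb_rag_arguments(
    "local-ai", "How do I reset my password?", "body", "body_embedding",
    context_size=3,
)
print(args)
```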
Retrieve Segments Without Generation
Get raw text segments without LLM-generated answers.
ask_ml_rag_vector_store (with skip_generate parameter)
Use Cases:
- Debugging vector search results
- Building custom generation pipelines
- Analyzing retrieved context quality
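For the custom-pipeline use case, the sketch below turns retrieved segments into a prompt for a generator of your choice. It assumes the segments arrive as a list of plain strings, which this page does not specify. The helper name, prompt wording, and character budget are illustrative choices.

```python
def build_prompt(question, segments, max_chars=2000):
    """Assemble a prompt from retrieved segments, numbered so answers can cite them."""
    lines = ["Answer the question using only the context below.", "", "Context:"]
    used = 0
    for index, segment in enumerate(segments, start=1):
        text = segment.strip()
        if used + len(text) > max_chars:
            break  # stop before the context budget is exceeded
        lines.append(f"[{index}] {text}")
        used += len(text)
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


# Hypothetical segments, as if returned with generation skipped.
prompt = build_prompt(
    "How do I reset my password?",
    ["  Open the account page. ", "Choose Reset password."],
)
print(prompt)
```

Printing the prompt before sending it anywhere is also a quick way to inspect retrieved context quality.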
Vector Store Architecture
MySQL AI
MySQL HeatWave
Complete Workflow Example
MySQL AI Workflow
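A minimal sketch of the MySQL AI sequence, pieced together from the tools described on this page: list the available files, load one, then query the default vector store. All argument values are hypothetical placeholders, and the arguments of the listing tool are left empty because this page does not document them.

```python
# Each step is (tool name, arguments). Tool and parameter names come from
# this page; every value is a hypothetical placeholder.
mysql_ai_workflow = [
    # Arguments for the listing tool are not documented on this page.
    ("list_vector_store_files_local", {}),
    ("load_vector_store_local", {
        "connection_id": "local-ai",
        "file_path": "manuals/guide.pdf",
    }),
    ("ask_ml_rag_vector_store", {
        "connection_id": "local-ai",
        "question": "How do I reset my password?",
        "context_size": 3,
    }),
]

for step, (tool, arguments) in enumerate(mysql_ai_workflow, start=1):
    print(f"{step}. {tool} {arguments}")
```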
MySQL HeatWave Workflow
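A comparable sketch for HeatWave, again assembled from the tools described on this page: browse Object Storage, load documents by prefix, then query. It assumes ask_ml_rag_vector_store is available on HeatWave connections, as the RAG query section suggests. All values are hypothetical, and the arguments of the two listing tools are left empty because this page does not document them.

```python
# Each step is (tool name, arguments). Tool and parameter names come from
# this page; every value is a hypothetical placeholder.
heatwave_workflow = [
    # Arguments for the two listing tools are not documented on this page.
    ("object_storage_list_buckets", {}),
    ("object_storage_list_objects", {}),
    ("load_vector_store_oci", {
        "connection_id": "heatwave-prod",
        "namespace": "my-namespace",
        "bucket_name": "knowledge-base",
        "document_prefix": "manuals/",
    }),
    ("ask_ml_rag_vector_store", {
        "connection_id": "heatwave-prod",
        "question": "How do I upgrade the appliance?",
    }),
]

for step, (tool, arguments) in enumerate(heatwave_workflow, start=1):
    print(f"{step}. {tool} {arguments}")
```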
Best Practices
Document Preparation
- Structure: Use clear headings and sections
- Chunk Size: Keep paragraphs focused (200-500 words)
- Metadata: Include titles and categories for better retrieval
- Format: PDF and text formats work best
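If your source text needs pre-processing to match the chunk-size guidance, one possible approach is to group paragraphs until each chunk reaches the 200-500 word range. The function below is an illustrative helper, not part of the server, and it assumes paragraphs are separated by blank lines.

```python
def chunk_paragraphs(text, min_words=200, max_words=500):
    """Group blank-line-separated paragraphs into chunks of about min_words to max_words.

    A single paragraph longer than max_words is kept whole rather than split.
    """
    chunks, current, count = [], [], 0
    for paragraph in (part.strip() for part in text.split("\n\n")):
        if not paragraph:
            continue
        words = len(paragraph.split())
        # Close the current chunk if adding this paragraph would exceed the cap.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += words
        # Close the chunk once it is large enough to stand on its own.
        if count >= min_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Five synthetic paragraphs of 120 words each.
sample = "\n\n".join(" ".join(["word"] * 120) for _ in range(5))
print([len(chunk.split()) for chunk in chunk_paragraphs(sample)])
```

The final chunk can fall below min_words when the document runs out of text, so short trailing chunks are expected.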
Vector Search Optimization
- Context Size: Start with 3-5 chunks, adjust based on results
- Table Design: Create separate tables for different document types
- Embeddings: Re-run ragify_column when source text changes
- Indexing: Consider adding indexes on metadata columns
Performance Tips
- MySQL AI: Best for moderate data sizes (< 1M rows)
- HeatWave: Use for large datasets requiring high-performance search
- Batch Loading: Load multiple documents at once using prefix filters
- Regular Updates: Schedule periodic document refreshes
Common Questions
Can I use both local and OCI storage with the same connection?
No. MySQL AI connections use local filesystem loading, while HeatWave connections use OCI Object Storage. The server automatically detects your connection type and provides appropriate tools.
How do I update documents in the vector store?
For the default vector store, re-run the load command to refresh documents. For custom InnoDB tables, update the source text column and re-run ragify_column to regenerate embeddings.
What happens if I query an empty vector store?
The RAG tools will return an error or empty result indicating no documents are available. Load documents first using the appropriate loading tool for your connection type.
Can I have multiple vector stores?
Yes. The default vector store is managed automatically, but you can create multiple custom InnoDB tables with VECTOR columns using ragify_column for different document collections.
How do I improve RAG answer quality?
- Increase context_size to retrieve more segments
- Improve source document quality and structure
- Use more specific questions
- Create separate vector stores for different topics
- Regularly update embeddings when source content changes
Next Steps
ML & GenAI
Explore text generation and NL2SQL features
API Tools
View complete tool reference
