Overview

Weaviate is an open-source vector database that supports semantic search, hybrid search (combining keyword and vector search), and automatic schema inference. It is built for production ML applications.

Setup

Run Weaviate locally:
docker run -d \
  -p 8080:8080 \
  -e QUERY_DEFAULTS_LIMIT=25 \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  -e PERSISTENCE_DATA_PATH='/var/lib/weaviate' \
  semitechnologies/weaviate:latest
Configure in Flowise:
Weaviate Scheme: http
Weaviate Host: localhost:8080
Weaviate Index: MyCollection

Configuration

Required Parameters

weaviateScheme
string
required
Connection scheme:
  • https - For cloud or secured instances
  • http - For local development
weaviateHost
string
required
Weaviate server host (without scheme):
  • Local: localhost:8080
  • Cloud: your-cluster.weaviate.network
weaviateIndex
string
required
Class/collection name (capitalized, e.g., Documents, KnowledgeBase)
embeddings
Embeddings
required
Embedding model for vector generation
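
As a rough illustration of how the scheme, host, and index parameters fit together, the sketch below joins the scheme and host into a base URL and checks that an index name starts with a capital letter. The helper names `build_base_url` and `is_valid_index_name` are hypothetical (they are not part of Flowise or any Weaviate client), and the name pattern is an assumption about what Weaviate accepts.

```python
import re


def build_base_url(scheme: str, host: str) -> str:
    """Join weaviateScheme and weaviateHost into a base URL (hypothetical helper)."""
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if "://" in host:
        # weaviateHost is documented as "without scheme"
        raise ValueError("host must not include a scheme")
    return f"{scheme}://{host.strip('/')}"


# Assumed pattern: a capital letter followed by letters, digits, or underscores.
_INDEX_NAME = re.compile(r"[A-Z][A-Za-z0-9_]*")


def is_valid_index_name(name: str) -> bool:
    """Return True if the name looks like a capitalized class/collection name."""
    return _INDEX_NAME.fullmatch(name) is not None


print(build_base_url("http", "localhost:8080"))
print(is_valid_index_name("Documents"), is_valid_index_name("documents"))
```

Running a check like this before saving the node configuration can surface a scheme typo or a lowercase index name early.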

Optional Parameters

credential
credential
Weaviate API key credential (for cloud-hosted instances)
document
Document[]
Documents to upsert into the index
recordManager
RecordManager
Track indexed documents to prevent duplication
weaviateTextKey
string
default:"text"
Property name for storing document text
weaviateMetadataKeys
string
JSON array of metadata keys to store as properties, entered as a string:
["category", "author", "date"]
weaviateFilter
json
GraphQL-style filter for search:
{
  "path": ["category"],
  "operator": "Equal",
  "valueText": "documentation"
}
topK
number
default:4
Number of results to return

MMR (Maximal Marginal Relevance)

searchType
string
Search mode:
  • similarity - Standard vector similarity
  • mmr - Diverse results with MMR
fetchK
number
default:20
Documents to fetch before MMR reranking
lambda
number
MMR diversity factor (0-1): 0 gives maximum diversity, 1 gives minimum diversity (pure relevance)
alpha
number
default:1
Weighting for hybrid search:
  • 1 - Pure vector search
  • 0.5 - Balanced hybrid
  • 0 - Pure keyword (BM25) search
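
To make the roles of fetchK, topK, and lambda more concrete, here is a minimal sketch of the generic MMR algorithm in plain Python. It is not necessarily the exact code Flowise runs. Here `candidates` stands in for the fetchK vectors returned by the initial similarity search, `top_k` for topK, and `lambda_mult` for lambda; all function names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query, candidates, top_k=4, lambda_mult=0.5):
    """Return indices of candidates chosen by Maximal Marginal Relevance.

    lambda_mult=1 ranks by relevance only; lower values penalize
    candidates that are similar to ones already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < top_k:
        best, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]  # docs 0 and 1 are near-duplicates
print(mmr(query, docs, top_k=2, lambda_mult=1.0))
print(mmr(query, docs, top_k=2, lambda_mult=0.3))
```

With `lambda_mult=1.0` the two near-duplicate vectors are both returned; with `lambda_mult=0.3` the second pick moves to the less similar vector.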

Usage Examples

Basic Local Setup

# Start Weaviate
docker run -d -p 8080:8080 \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  semitechnologies/weaviate:latest
// In Flowise
Weaviate Scheme: http
Weaviate Host: localhost:8080
Weaviate Index: Documents
Embeddings: OpenAI Embeddings
Top K: 4

Weaviate Cloud

// Cloud configuration
Weaviate Scheme: https
Weaviate Host: my-cluster.weaviate.network
Credential: Weaviate API Key
Weaviate Index: ProductionDocs
Embeddings: OpenAI Embeddings

With Metadata Keys

// Store specific metadata as properties
Weaviate Index: Articles
Weaviate Text Key: content
Weaviate Metadata Keys: ["author", "category", "publishDate"]
Embeddings: OpenAI Embeddings
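
One plausible way to picture the effect of the text key and metadata keys is as the list of properties requested from Weaviate. The sketch below builds a Weaviate GraphQL `Get` query string from them. The helper `build_get_query` is hypothetical, and the assumption that the node requests exactly these properties is unverified.

```python
def build_get_query(index, text_key="text", metadata_keys=(), limit=4):
    """Build a GraphQL Get query requesting the text key and metadata keys.

    Hypothetical helper for illustration only.
    """
    fields = " ".join([text_key, *metadata_keys])
    return (
        f"{{ Get {{ {index}(limit: {limit}) "
        f"{{ {fields} _additional {{ id }} }} }} }}"
    )


query = build_get_query(
    "Articles",
    text_key="content",
    metadata_keys=["author", "category", "publishDate"],
)
print(query)
```

A property that is not listed in the query is not returned, which is one reason a metadata field can appear to be missing from results.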

Hybrid Search

// Combine keyword and semantic search
Weaviate Index: SearchableDocs
Alpha: 0.5  // 50% vector, 50% keyword
Top K: 10
Embeddings: OpenAI Embeddings
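
The effect of alpha can be sketched as a weighted blend of two scores per document. This is a simplified illustration, not Weaviate's actual fusion algorithm (Weaviate applies its own ranking-based fusion). It assumes both score sets are already normalized to the 0-1 range, and the function names are made up for this example.

```python
def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """Blend per-document scores: alpha weights vector, (1 - alpha) weights keyword."""
    doc_ids = set(vector_scores) | set(keyword_scores)
    return {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def rank(scores):
    """Return document ids ordered by descending score (ties broken by id)."""
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector = {"a": 0.9, "b": 0.2}   # "a" is semantically closest
keyword = {"a": 0.1, "b": 1.0}  # "b" has the best keyword match

for alpha in (1, 0.5, 0):
    print(alpha, rank(hybrid_scores(vector, keyword, alpha)))
```

Comparing the rankings at alpha=1 and alpha=0 on a few known queries is a quick way to confirm that both the vector and keyword sides contribute before tuning intermediate values.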

With Filters

// GraphQL-style filtering
{
  "weaviateFilter": {
    "path": ["category"],
    "operator": "Equal",
    "valueText": "tutorial"
  }
}

// Multiple conditions (AND)
{
  "weaviateFilter": {
    "operator": "And",
    "operands": [
      {
        "path": ["category"],
        "operator": "Equal",
        "valueText": "docs"
      },
      {
        "path": ["year"],
        "operator": "GreaterThanEqual",
        "valueInt": 2023
      }
    ]
  }
}

Weaviate Filter Syntax

Weaviate uses GraphQL-style filters:

Simple Filters

// Text equality
{
  "path": ["status"],
  "operator": "Equal",
  "valueText": "published"
}

// Numeric comparison
{
  "path": ["price"],
  "operator": "GreaterThan",
  "valueNumber": 10.0
}

// Boolean
{
  "path": ["featured"],
  "operator": "Equal",
  "valueBoolean": true
}

Operators

  • Text: Equal, NotEqual, Like
  • Numeric: Equal, NotEqual, GreaterThan, GreaterThanEqual, LessThan, LessThanEqual
  • Boolean: Equal, NotEqual
  • Geo: WithinGeoRange

Combining Filters

// AND
{
  "operator": "And",
  "operands": [
    { "path": ["type"], "operator": "Equal", "valueText": "article" },
    { "path": ["published"], "operator": "Equal", "valueBoolean": true }
  ]
}

// OR
{
  "operator": "Or",
  "operands": [
    { "path": ["priority"], "operator": "Equal", "valueText": "high" },
    { "path": ["urgent"], "operator": "Equal", "valueBoolean": true }
  ]
}
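
When filters are generated programmatically, small builder functions can help avoid hand-editing nested JSON. The sketch below produces the same structures shown above. The helpers `condition`, `all_of`, and `any_of` are hypothetical names, not part of Flowise or Weaviate.

```python
import json


def condition(path, operator, **value):
    """Build a single filter clause, e.g. condition("type", "Equal", valueText="article")."""
    if len(value) != 1:
        raise ValueError("pass exactly one value* keyword, e.g. valueText=...")
    return {"path": [path], "operator": operator, **value}


def all_of(*operands):
    """Combine clauses with And."""
    return {"operator": "And", "operands": list(operands)}


def any_of(*operands):
    """Combine clauses with Or."""
    return {"operator": "Or", "operands": list(operands)}


weaviate_filter = all_of(
    condition("type", "Equal", valueText="article"),
    condition("published", "Equal", valueBoolean=True),
)

# Paste the printed JSON into the Weaviate Filter field.
print(json.dumps(weaviate_filter, indent=2))
```

Because `all_of` and `any_of` return ordinary dictionaries, they can be nested to express conditions such as "A and (B or C)".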

Best Practices

Schema Design

  • Use PascalCase for class names
  • Define metadata keys upfront
  • Use appropriate data types
  • Plan for cross-references

Hybrid Search

  • Start with alpha=0.5
  • Tune based on use case
  • Monitor search quality
  • Combine with filters

Performance

  • Choose a vector index type that fits the collection size (HNSW is the default)
  • Enable vector compression (for example, product quantization) for large datasets
  • Monitor shard count
  • Implement caching

Production

  • Enable authentication
  • Set up replication
  • Monitor resource usage
  • Implement backup strategy

Advanced Features

Custom Vector Index

// Configure HNSW parameters
{
  "vectorIndexConfig": {
    "ef": 100,
    "efConstruction": 128,
    "maxConnections": 64
  }
}
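
One way to apply settings like these is to create the class through Weaviate's schema REST endpoint before the first upsert. The sketch below builds such a class definition and posts it to `/v1/schema`. It assumes vectors are supplied by the Flowise embeddings node (hence `"vectorizer": "none"`); the function names are illustrative, and the parameter values are the ones from the example above rather than tuned recommendations.

```python
import json
import urllib.request


def build_class_definition(name, ef=100, ef_construction=128, max_connections=64):
    """Return a class definition with explicit HNSW settings."""
    return {
        "class": name,
        "vectorizer": "none",  # assumption: embeddings are computed outside Weaviate
        "vectorIndexType": "hnsw",
        "vectorIndexConfig": {
            "ef": ef,
            "efConstruction": ef_construction,
            "maxConnections": max_connections,
        },
    }


def create_class(base_url, definition, api_key=None):
    """POST the definition to the schema endpoint and return the parsed response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/schema",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


definition = build_class_definition("Documents")
print(json.dumps(definition, indent=2))
# Against a running instance:
# create_class("http://localhost:8080", definition)
```

The class name passed here should match the Weaviate Index value configured in Flowise.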

Multi-Tenancy

Weaviate supports multi-tenant collections for data isolation.

Combine vector search with LLM generation for RAG applications.

Common Issues

Error: “Class name must start with capital letter”
Solution:
  • Use PascalCase: Documents, not documents
  • Weaviate enforces capitalized class names
  • Update the Weaviate Index parameter
Can’t connect to Weaviate
Solution:
  • Verify Weaviate is running
  • Check scheme (http vs https)
  • Ensure port 8080 is accessible
  • For cloud: verify API key
Metadata fields missing from results
Solution:
  • Specify metadata keys in configuration
  • Ensure keys are valid property names
  • Check schema in Weaviate console
  • Re-index if schema changed
Alpha parameter has no effect
Solution:
  • Ensure Weaviate version supports hybrid search
  • Verify text properties exist
  • Check if BM25 is enabled
  • Test with alpha=0 and alpha=1 separately

Monitoring

Weaviate provides a REST API for monitoring:
# Health check
curl http://localhost:8080/v1/.well-known/ready

# Schema info
curl http://localhost:8080/v1/schema

# Class info
curl http://localhost:8080/v1/schema/Documents

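# Metrics (Prometheus format; requires PROMETHEUS_MONITORING_ENABLED=true,
# served on port 2112 by default, which must also be published from the container)
curl http://localhost:2112/metrics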
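
The readiness endpoint can also be polled from a script, for example before starting an upsert. Below is a minimal sketch using only the Python standard library. The function name `is_ready` and the injectable `fetch` parameter are choices made for this example, and the URL in the usage comment assumes the local setup described earlier.

```python
import urllib.error
import urllib.request


def is_ready(base_url, timeout=2.0, fetch=urllib.request.urlopen):
    """Return True if the readiness endpoint answers with HTTP 200.

    `fetch` defaults to urllib's urlopen and can be swapped out in tests.
    """
    url = base_url.rstrip("/") + "/v1/.well-known/ready"
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status raised by urlopen
        return False


# Usage against a local instance:
# print(is_ready("http://localhost:8080"))
```

Treating any connection error as "not ready" keeps the check suitable for a simple retry loop.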

Outputs

retriever
VectorStoreRetriever
Retriever with configured search type and filters
vectorStore
WeaviateVectorStore
Direct vector store access for custom GraphQL queries
