Overview
The Neo4j knowledge graph integration works alongside the vector store (Supabase) to provide relationship-based memory organization. While the vector store handles semantic similarity, the graph store captures explicit relationships between entities, events, and concepts.

Neo4j integration is optional and configured through environment variables. If not configured, Tabby falls back to vector-only memory storage.
Configuration
Environment Variables
Configure Neo4j integration in your .env file by setting the Neo4j connection variables, including NEO4J_URL and NEO4J_PASSWORD.
Mem0 Configuration
The memory system automatically detects and configures Neo4j when credentials are provided (backend/main.py:60-68).
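The detection step can be sketched roughly as follows. This is a minimal illustration, not the code in backend/main.py: the helper name `build_graph_store_config` is hypothetical, and the `NEO4J_USERNAME` variable with its `neo4j` default is an assumption. The returned dictionary follows the `graph_store` shape that Mem0 accepts in its configuration.

```python
import os
from typing import Mapping, Optional


def build_graph_store_config(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return a Mem0-style graph_store section, or None if Neo4j is not configured."""
    url = env.get("NEO4J_URL")
    password = env.get("NEO4J_PASSWORD")
    if not url or not password:
        # Missing credentials: the caller falls back to vector-only storage.
        return None
    return {
        "provider": "neo4j",
        "config": {
            "url": url,
            # Assumption: NEO4J_USERNAME and its default are not named in this document.
            "username": env.get("NEO4J_USERNAME", "neo4j"),
            "password": password,
        },
    }


# Only attach the graph store when credentials are present.
config: dict = {}
graph_store = build_graph_store_config()
if graph_store is not None:
    config["graph_store"] = graph_store
```

A configuration built this way would then be merged with the vector store settings before the memory client is created.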
How It Works
Dual Storage Strategy
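Tabby uses a hybrid approach combining two complementary storage systems:

Vector Store (Supabase)
Stores memory embeddings for semantic similarity search. Finds memories that are conceptually related based on meaning.

Graph Store (Neo4j)
Stores entities and the explicit relationships between them. Finds memories that are connected through those relationships.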
Entity & Relationship Extraction
When memories are added, Mem0 (powered by GPT-4.1-nano) automatically:
- Extracts Entities: Identifies people, places, concepts, and objects
- Identifies Relationships: Determines how entities relate to each other
- Creates Graph Nodes: Stores entities as nodes in Neo4j
- Creates Graph Edges: Connects related entities with labeled relationships
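To make the four steps above concrete, here is a toy in-memory sketch. It does not reproduce Mem0's extraction or its Neo4j writes: the `Triple` and `ToyGraph` classes are hypothetical, and the triples are hand-written stand-ins for what an LLM extraction step might return.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One extracted fact: source entity, relationship label, target entity."""
    source: str
    relationship: str
    target: str


@dataclass
class ToyGraph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_triple(self, triple: Triple) -> None:
        # Entities become nodes; using a set deduplicates repeated mentions.
        self.nodes.add(triple.source)
        self.nodes.add(triple.target)
        # The relationship becomes a labeled, directed edge.
        self.edges.add((triple.source, triple.relationship, triple.target))


# Hand-written triples for the sentence "My colleague Sarah works on the website redesign".
triples = [
    Triple("User", "KNOWS", "Sarah"),
    Triple("Sarah", "WORKS_ON", "Website Redesign"),
]

graph = ToyGraph()
for triple in triples:
    graph.add_triple(triple)
```

Because "Sarah" appears in both triples, she ends up as a single node with two edges, which is what makes later traversal possible.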
Memory Classification Integration
The graph store works with memory type classification to organize knowledge:
- LONG_TERM memories: Build persistent identity and preference graphs
- SHORT_TERM memories: Create temporary context nodes that can be pruned
- EPISODIC memories: Link events chronologically with temporal relationships
- SEMANTIC memories: Connect factual knowledge in concept hierarchies
- PROCEDURAL memories: Map step-by-step processes with ordered relationships
backend/main.py:78-137
Graph Schema
While Mem0 manages the specific schema, typical graph patterns include:

Node Types
- Person: User identities, colleagues, contacts
- Concept: Technologies, tools, methodologies
- Event: Meetings, deadlines, milestones
- Preference: User likes, dislikes, settings
- Knowledge: Facts, definitions, learnings
Relationship Types
- KNOWS: Person-to-person connections
- PREFERS: User preferences
- WORKS_ON: Project/task relationships
- HAPPENED_BEFORE/AFTER: Temporal event ordering
- RELATED_TO: Conceptual connections
- PART_OF: Hierarchical relationships
Benefits of Graph Storage
1. Relationship-Based Retrieval
Find memories through connections:
- “What projects is Sarah working on?” → Follow WORKS_ON edges from the Sarah node
- “What happened before the product launch?” → Traverse HAPPENED_BEFORE relationships
2. Multi-Hop Reasoning
Discover indirect connections:
- User → KNOWS → Colleague → WORKS_ON → Project
- Enables queries like “Show me projects that people I know are working on”
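The two-hop pattern above can be illustrated without a database. The sketch below walks a small hand-made edge list in plain Python; the names and projects are invented sample data, and a real deployment would run the equivalent traversal inside Neo4j.

```python
from collections import defaultdict

# Invented sample edges: (source, relationship, target)
edges = [
    ("User", "KNOWS", "Sarah"),
    ("User", "KNOWS", "Raj"),
    ("Sarah", "WORKS_ON", "Website Redesign"),
    ("Raj", "WORKS_ON", "Mobile App"),
    ("Raj", "WORKS_ON", "Website Redesign"),
]


def build_index(edge_list):
    """Index targets by (source, relationship) for quick hop lookups."""
    index = defaultdict(list)
    for source, relationship, target in edge_list:
        index[(source, relationship)].append(target)
    return index


def follow(index, start, path):
    """Follow a sequence of relationship labels, one hop per label."""
    frontier = {start}
    for relationship in path:
        frontier = {
            target
            for node in frontier
            for target in index.get((node, relationship), [])
        }
    return frontier


index = build_index(edges)
projects = follow(index, "User", ["KNOWS", "WORKS_ON"])
print(sorted(projects))  # ['Mobile App', 'Website Redesign']
```

Each hop replaces the frontier with the set of nodes reachable through one relationship type, so shared projects are reported only once.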
3. Temporal Understanding
Track how information evolves:
- Link episodic memories chronologically
- Understand sequences of events
- Build timelines of project progress
4. Contextual Clustering
Group related memories:
- All memories about a specific person
- All memories related to a project
- All preferences in a category
Startup Logging
When the Memory API starts, it logs the graph store configuration (backend/main.py:70-74).
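A startup check along these lines might look like the sketch below. Only the "Graph Store: disabled" wording is taken from this document (see Troubleshooting); the "enabled" wording, the logger name, and the `log_graph_store_status` helper are assumptions for illustration.

```python
import logging

# Hypothetical logger name; the real application may configure logging differently.
logger = logging.getLogger("memory_api")


def graph_store_status(config: dict) -> str:
    """Report whether a graph_store section is present in the memory config."""
    return "enabled" if config.get("graph_store") else "disabled"


def log_graph_store_status(config: dict) -> str:
    status = graph_store_status(config)
    # Assumption: the enabled message mirrors the documented disabled message.
    logger.info("Graph Store: %s", status)
    return status
```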
Querying the Graph
Through Memory API
The Memory API endpoints automatically leverage both vector and graph stores (backend/main.py:222-243).
Direct Neo4j Queries
For advanced use cases, you can query Neo4j directly using Cypher.

Mem0 handles the graph schema and entity extraction automatically. Direct Cypher queries are optional for advanced analytics.
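As a hedged illustration of a direct query, the sketch below uses the official `neo4j` Python driver. The `KNOWS` and `WORKS_ON` relationship names and the `name` property come from the typical patterns listed in this document; because Mem0 manages the actual schema, the labels and property names in your database may differ. The `fetch_projects` helper is hypothetical.

```python
# Cypher for "projects that people I know are working on".
# Relationship and property names are assumptions; check your own graph first.
PROJECTS_OF_KNOWN_PEOPLE = """
MATCH (u {name: $user_name})-[:KNOWS]->(person)-[:WORKS_ON]->(project)
RETURN DISTINCT project.name AS project
LIMIT $limit
"""


def fetch_projects(uri: str, username: str, password: str,
                   user_name: str, limit: int = 25) -> list:
    """Run the query against a live Neo4j instance and return project names."""
    # Imported lazily so the module loads without the driver installed
    # (install with: pip install neo4j).
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(username, password)) as driver:
        records, _summary, _keys = driver.execute_query(
            PROJECTS_OF_KNOWN_PEOPLE,
            parameters_={"user_name": user_name, "limit": limit},
        )
        return [record["project"] for record in records]
```

Passing values as query parameters rather than formatting them into the string avoids Cypher injection and lets Neo4j reuse the query plan.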
Performance Considerations
When to Use Neo4j
Use Neo4j when you need:
- Complex relationship queries
- Multi-hop reasoning
- Timeline/chronological queries
- Entity-centric retrieval
- Knowledge graph visualization
Vector-only storage is likely sufficient for:
- Simple semantic search
- Single-user personal notes
- Minimal relationship complexity
- Cost-sensitive deployments
Scaling
Neo4j graph stores scale differently than vector stores:
- Vector Store: Scales with number of memories (embeddings)
- Graph Store: Scales with number of entities + relationships
- Query Performance: Graph queries are fastest for relationship traversal, vector queries for semantic similarity
Memory API Integration
The graph store is automatically used by all memory operations:

Adding Memories
backend/main.py:189-218
Searching Memories
backend/main.py:222-243
Filtering by Type
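One way to filter by memory type is to post-process search results on their metadata, as in the sketch below. The five type names come from this document, but the `memory_type` metadata key, the result shape, and the `filter_by_type` helper are assumptions; the actual API may filter on the server side instead.

```python
MEMORY_TYPES = {"LONG_TERM", "SHORT_TERM", "EPISODIC", "SEMANTIC", "PROCEDURAL"}


def filter_by_type(results: list, memory_type: str, key: str = "memory_type") -> list:
    """Keep only results whose metadata carries the requested memory type."""
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"Unknown memory type: {memory_type!r}")
    return [
        result
        for result in results
        # Tolerate results with missing or null metadata.
        if (result.get("metadata") or {}).get(key) == memory_type
    ]


# Invented sample results in an assumed shape.
sample_results = [
    {"memory": "Prefers dark mode", "metadata": {"memory_type": "LONG_TERM"}},
    {"memory": "Met Sarah on Monday", "metadata": {"memory_type": "EPISODIC"}},
    {"memory": "Untyped note", "metadata": None},
]

episodic = filter_by_type(sample_results, "EPISODIC")
```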
Visualization
Neo4j provides built-in visualization tools:

Neo4j Browser
Access the Neo4j Browser to visualize your knowledge graph:
- Open your Neo4j instance URL in a browser
- Log in with your credentials
- Run Cypher queries to explore the graph
- Click nodes and edges to see properties
- Use the graph visualization to understand relationships
Example Visualization Query
The Brain Panel mentions “Neo4j knowledge graph visualization” in the roadmap (README.md:69). Direct in-app visualization may be added in future releases.
Best Practices
- Start with Vector-Only: Begin without Neo4j, add it when relationship queries become necessary
- Use Managed Neo4j: Services like Neo4j AuraDB handle scaling and backups
- Monitor Graph Size: Track node and relationship counts to understand storage growth
- Leverage Memory Types: Use classification metadata to organize graph nodes effectively
- Combine Search Methods: Use vector search for discovery, graph traversal for exploration
- Regular Backups: Export graph data periodically, especially for production deployments
Troubleshooting
Graph Store Not Enabled
If you see Graph Store: disabled in the startup logs:
- Check that NEO4J_URL and NEO4J_PASSWORD are set in .env
- Verify environment variables are loaded (load_dotenv() in main.py:11)
- Restart the Memory API after updating environment variables
Connection Errors
If the Neo4j connection fails:
- Verify Neo4j instance is running and accessible
- Check firewall rules allow connections to Neo4j port (usually 7687)
- Confirm credentials are correct
- For Neo4j AuraDB, ensure IP allowlist includes your server
Performance Issues
If graph queries are slow:
- Create indexes on frequently queried properties
- Limit relationship depth in multi-hop queries
- Use LIMIT clauses to bound result sets
- Consider caching frequent queries
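The first three suggestions can be sketched as Cypher strings built from Python. The `Person` label, the `name` property, the index name, and the `bounded_neighborhood_query` helper are assumptions; adapt them to the schema Mem0 actually created in your database.

```python
# Index on a frequently queried property (label and property are assumptions).
CREATE_NAME_INDEX = (
    "CREATE INDEX person_name IF NOT EXISTS FOR (n:Person) ON (n.name)"
)

MAX_ALLOWED_DEPTH = 5  # Arbitrary safety cap chosen for this sketch.


def bounded_neighborhood_query(max_depth: int) -> str:
    """Build a traversal query with a hard depth bound and a LIMIT clause.

    The bounds of a variable-length pattern are written as literals here,
    so the depth is validated as a small integer before it is formatted
    into the query text. Values such as $name and $limit stay parameters.
    """
    if isinstance(max_depth, bool) or not isinstance(max_depth, int):
        raise ValueError("max_depth must be an integer")
    if not 1 <= max_depth <= MAX_ALLOWED_DEPTH:
        raise ValueError(f"max_depth must be between 1 and {MAX_ALLOWED_DEPTH}")
    return (
        f"MATCH path = (start {{name: $name}})-[*1..{max_depth}]-(other) "
        "RETURN path LIMIT $limit"
    )


two_hop_query = bounded_neighborhood_query(2)
```

Keeping the depth small matters because the number of candidate paths can grow quickly with each additional hop.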