Overview
Neo4j is a native graph database that stores and queries data as nodes and relationships. In PentAGI, it serves as:
- Knowledge Storage: Persistent graph database for entities and relationships
- Relationship Querying: Fast traversal of complex entity connections
- Pattern Matching: Cypher query language for graph patterns
- Temporal Tracking: Time-based relationship management
- Visualization: Built-in browser for graph exploration
Architecture
Neo4j in the PentAGI stack:
Setup
Configuration
Docker Compose Settings
The Neo4j service is configured in docker-compose-graphiti.yml.
Environment Variables
Key Neo4j configuration options:
| Variable | Description | Default |
|---|---|---|
| NEO4J_AUTH | Authentication (user/password) | neo4j/devpassword |
| NEO4J_dbms_memory_heap_initial__size | Initial heap size | 512m |
| NEO4J_dbms_memory_heap_max__size | Maximum heap size | 1G |
| NEO4J_dbms_memory_pagecache_size | Page cache size | 512m |
| NEO4J_dbms_security_procedures_unrestricted | Allowed procedures | gds.* |
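A client that connects to this service needs the same credentials the container was started with. The sketch below shows one way to split a NEO4J_AUTH value into a user and a password in Python. The helper name `neo4j_credentials` is hypothetical, and the fallback value simply mirrors the default listed in the table.

```python
import os


def neo4j_credentials(env=None):
    """Split NEO4J_AUTH ("user/password") into a (user, password) tuple."""
    env = os.environ if env is None else env
    # Fallback mirrors the documented default; replace it outside local development.
    auth = env.get("NEO4J_AUTH", "neo4j/devpassword")
    if auth == "none":
        # The official Docker image treats NEO4J_AUTH=none as "authentication disabled".
        return None
    # Split on the first slash only, so passwords may themselves contain "/".
    user, sep, password = auth.partition("/")
    if not sep or not user or not password:
        raise ValueError("NEO4J_AUTH must look like 'user/password'")
    return user, password
```

The returned tuple can be passed as the `auth` argument when opening a driver connection.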
Performance Tuning
For production deployments, increase the memory limits in docker-compose-graphiti.yml.
Cypher Query Language
Neo4j uses Cypher for querying graph data.
Basic Queries
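As an illustration of a basic write query, the following sketch creates a node from Python with a parameterized CREATE clause. The `Entity` label, the `name` and `kind` properties, and the `create_entity` helper are assumptions made for the example, not PentAGI's actual schema. `session` is expected to be a session object from the official `neo4j` Python driver.

```python
# Parameters ($name, $kind) keep values out of the query text.
CREATE_ENTITY = """
CREATE (e:Entity {name: $name, kind: $kind})
RETURN e.name AS name
"""


def create_entity(session, name, kind):
    """Create one Entity node and return the name stored on it."""
    record = session.run(CREATE_ENTITY, name=name, kind=kind).single()
    return record["name"]
```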
Create a node:
Pattern Matching
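A pattern-matching lookup can be sketched as follows: starting from one node, follow any relationship in either direction and return the neighbors. The `Entity` label, the `name` property, and the `related_entities` helper are hypothetical; adjust them to the labels actually present in the graph.

```python
RELATED = """
MATCH (a:Entity {name: $name})-[r]-(b)
RETURN type(r) AS relationship, b.name AS neighbor
LIMIT $limit
"""


def related_entities(session, name, limit=25):
    """Return (relationship type, neighbor name) pairs for one entity."""
    result = session.run(RELATED, name=name, limit=limit)
    return [(record["relationship"], record["neighbor"]) for record in result]
```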
Find related entities:
Aggregation
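One possible aggregation query counts nodes per label. The sketch below is an assumption about what such a query could look like, not a query taken from PentAGI. `count_by_label` is a hypothetical helper name.

```python
COUNT_BY_LABEL = """
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS total
ORDER BY total DESC
"""


def count_by_label(session):
    """Return a {label: node count} mapping for the whole database."""
    return {record["label"]: record["total"] for record in session.run(COUNT_BY_LABEL)}
```

A node with several labels is counted once under each of them, so the totals can add up to more than the number of nodes.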
Count and aggregate:
Graph Algorithms
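Cypher's built-in `shortestPath` function covers the simplest path-finding case without extra procedures. The sketch below assumes hypothetical `Entity` nodes identified by `name` and caps the search at six hops; both choices are illustrative.

```python
SHORTEST_PATH = """
MATCH (a:Entity {name: $source}), (b:Entity {name: $target})
MATCH p = shortestPath((a)-[*..6]-(b))
RETURN [n IN nodes(p) | n.name] AS names, length(p) AS hops
"""


def shortest_path(session, source, target):
    """Return (node names, hop count) for the shortest path, or None if none exists."""
    record = session.run(SHORTEST_PATH, source=source, target=target).single()
    if record is None:
        # No row means no path was found within the hop limit.
        return None
    return record["names"], record["hops"]
```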
Shortest path:
Usage
Neo4j Browser
The built-in browser provides:
- Query Editor: Write and execute Cypher queries
- Graph Visualization: Interactive node and relationship display
- Data Browser: Explore database schema and contents
- Query History: Review previous queries
- Favorites: Save frequently-used queries
Command Line Access
Use cypher-shell for CLI queries:
Python Client
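A minimal sketch of a Python client, assuming the official `neo4j` driver package is installed and that Bolt is reachable at `bolt://localhost:7687` (the host is an assumption; the password is the documented development default). The helper names `open_driver` and `fetch_all` are hypothetical.

```python
def open_driver(uri="bolt://localhost:7687", user="neo4j", password="devpassword"):
    """Open a driver and fail fast if the server cannot be reached."""
    # Imported lazily so the module loads even where the driver is not installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    driver.verify_connectivity()
    return driver


def fetch_all(driver, query, **params):
    """Run one query in its own session and return the rows as plain dicts."""
    with driver.session() as session:
        return [record.data() for record in session.run(query, **params)]
```

A typical call would be `driver = open_driver()`, then `fetch_all(driver, "MATCH (n) RETURN count(n) AS total")`, followed by `driver.close()` when the application shuts down.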
Query Neo4j from Python:
Maintenance
Backup
Back up Neo4j data:
Restore
Restore from backup:
Indexes
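Index creation can be scripted so that it is safe to re-run. The sketch below builds a `CREATE INDEX ... IF NOT EXISTS` statement in the syntax used by recent Neo4j releases; the helper name `index_statement`, the naming scheme, and the `Entity`/`name` pair in the usage note are assumptions.

```python
def index_statement(label, prop):
    """Build an idempotent CREATE INDEX statement for one label/property pair."""
    # Labels and property names cannot be passed as query parameters,
    # so they are interpolated here after a conservative validity check.
    for identifier in (label, prop):
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
    name = f"{label.lower()}_{prop.lower()}_idx"
    return f"CREATE INDEX {name} IF NOT EXISTS FOR (n:{label}) ON (n.{prop})"
```

For example, `index_statement("Entity", "name")` yields a statement that can be passed to `session.run(...)` or pasted into the Neo4j Browser.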
Create indexes for better performance:
Constraints
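A uniqueness constraint is the most common integrity rule. The sketch below uses the `FOR ... REQUIRE` form accepted by recent Neo4j releases (older releases used `ON ... ASSERT`). The `Entity` label, the `uuid` property, and the `ensure_constraints` helper are hypothetical.

```python
UNIQUE_ENTITY_UUID = (
    "CREATE CONSTRAINT entity_uuid_unique IF NOT EXISTS "
    "FOR (e:Entity) REQUIRE e.uuid IS UNIQUE"
)


def ensure_constraints(session, statements=(UNIQUE_ENTITY_UUID,)):
    """Apply each constraint statement and return how many were sent."""
    for statement in statements:
        # consume() discards any rows and waits for the statement to finish.
        session.run(statement).consume()
    return len(statements)
```

A uniqueness constraint is backed by an index on the same property, so a separate index on that property is not needed.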
Ensure data integrity:
Monitoring
Database Metrics
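Basic size metrics can be collected with a few counting queries. This is a sketch of one possible set, not PentAGI's own monitoring code; `STATS_QUERIES` and `database_stats` are hypothetical names.

```python
STATS_QUERIES = {
    "nodes": "MATCH (n) RETURN count(n) AS value",
    "relationships": "MATCH ()-[r]->() RETURN count(r) AS value",
    "labels": "CALL db.labels() YIELD label RETURN count(label) AS value",
}


def database_stats(session):
    """Run each counting query and return a {metric name: value} mapping."""
    return {
        name: session.run(query).single()["value"]
        for name, query in STATS_QUERIES.items()
    }
```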
Query database statistics:
Performance Profiling
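Profiling can also be driven from Python by prefixing a query with `PROFILE` and reading the plan from the result summary. The sketch below assumes the official `neo4j` driver, whose summary exposes the executed plan as a nested dictionary; the `dbHits` and `children` keys are assumed to follow the layout used by recent driver versions, and both helper names are hypothetical.

```python
def profile_query(session, query, **params):
    """Run the query under PROFILE and return the executed plan."""
    summary = session.run("PROFILE " + query, **params).consume()
    return summary.profile


def total_db_hits(plan):
    """Sum the dbHits of a plan operator and all of its children."""
    if not plan:
        return 0
    children = plan.get("children", [])
    return plan.get("dbHits", 0) + sum(total_db_hits(child) for child in children)
```

Comparing `total_db_hits` before and after adding an index gives a rough indication of whether the index is being used.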
Profile slow queries:
Logs
View Neo4j logs:
Troubleshooting
Connection Issues
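Before debugging credentials or queries, it helps to confirm that the ports are reachable at all. The sketch below uses only the Python standard library and assumes Neo4j's default ports, 7474 for HTTP and 7687 for Bolt, published on `localhost`; the actual mapping depends on the compose file. The helper names are hypothetical.

```python
import socket

# Neo4j defaults; adjust if the compose file maps different host ports.
DEFAULT_PORTS = {"http": 7474, "bolt": 7687}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_neo4j(host="localhost", ports=DEFAULT_PORTS):
    """Return a {port name: reachable} mapping for the given host."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

A closed Bolt port with an open HTTP port usually points at port mapping or firewall rules rather than at the database itself.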
Verify Neo4j is accessible:
Authentication Errors
Reset the password:
Performance Issues
Diagnose slow queries:
- Enable query logging
- Analyze query plans with PROFILE
- Add missing indexes
- Increase memory allocation
Data Corruption
Recover from corruption:
Best Practices
Schema Design
- Use meaningful node labels and relationship types
- Normalize properties across similar nodes
- Avoid deeply nested queries (> 5 levels)
- Use indexes on frequently queried properties
- Model relationships as first-class entities
Query Optimization
- Always use indexes for lookups
- Limit result sets with LIMIT
- Use WITH to pipeline queries
- Avoid Cartesian products
- Profile queries before production
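Two of the points above, pipelining with WITH and bounding results with LIMIT, can be combined in one query. The sketch below is illustrative only: the `Entity` label, the `name` property, and the `top_connected` helper are assumptions.

```python
TOP_CONNECTED = """
MATCH (e:Entity)-[r]-()
WITH e, count(r) AS degree
WHERE degree >= $min_degree
RETURN e.name AS name, degree
ORDER BY degree DESC
LIMIT $limit
"""


def top_connected(session, min_degree=2, limit=10):
    """Return (name, degree) pairs for the most connected entities."""
    result = session.run(TOP_CONNECTED, min_degree=min_degree, limit=limit)
    return [(record["name"], record["degree"]) for record in result]
```

The WITH clause aggregates first, so the WHERE filter and the LIMIT operate on one row per entity rather than one row per relationship.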
Security
- Change default password immediately
- Use strong passwords (16+ characters)
- Restrict network access to trusted IPs
- Enable TLS/SSL in production
- Regularly update Neo4j version
Data Management
- Regular backups (daily minimum)
- Monitor disk usage
- Archive old data periodically
- Clean up unused nodes and relationships
- Document schema and queries
Related Documentation
- Graphiti - Knowledge graph system
- Memory Systems - AI agent memory
- Knowledge Graph - Graph concepts and usage