Metadb
A powerful system for synchronizing databases for analytics applications with continuous data streaming, transforms, and historical tracking.
What is Metadb?
Metadb extends PostgreSQL with features specifically designed for analytics workloads. It continuously synchronizes data from external sources (like transaction-processing databases or sensor networks) and maintains both current state and complete historical records. The system is built for scenarios where you need to:
- Track how data changes over time
- Query historical states at any point
- Transform and flatten JSON or MARC data automatically
- Support multiple concurrent data sources
- Maintain a PostgreSQL-compatible query interface
Key Features
Continuous Synchronization
Stream data from Kafka sources with automatic schema detection and type inference
Historical Tracking
Every row includes temporal metadata (__start, __end, __current) for point-in-time queries
Data Transformation
Automatically flatten JSON objects and arrays into queryable columns
PostgreSQL Compatible
Query using standard SQL through a PostgreSQL-compatible interface
User Workspaces
Individual schemas for each user to create tables and save query results
Access Control
Granular privileges that persist across table recreations
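To make the Data Transformation feature above more concrete, here is a minimal, generic sketch of flattening a nested JSON record into column-like keys. It illustrates the general idea only: the `flatten` helper, the `_` path separator, and the positional naming of array elements are assumptions of this sketch, not Metadb's actual mapping rules or column names.

```python
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into a single-level dict keyed by path."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # Array elements are keyed by position in this sketch (an assumption).
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        # Scalars end the recursion: the accumulated path becomes the column name.
        return {prefix: value}
    flat: dict[str, Any] = {}
    for key, child in items:
        path = f"{prefix}_{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


# Hypothetical source record, invented for illustration.
record = {
    "id": "1",
    "name": {"first": "Ada", "last": "Lovelace"},
    "tags": ["staff", "admin"],
}
columns = flatten(record)
```

Each key in `columns` (for example `name_first` or `tags_0`) plays the role of a queryable column, so the nested structure can be filtered and joined with ordinary SQL predicates.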
How It Works
Stream Data Automatically
Metadb reads change events from Kafka and creates tables automatically. Each table includes metadata columns for historical tracking.
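The bookkeeping behind those metadata columns can be sketched in a few lines of Python. This is an illustration of the general technique (closing the open version of a record and opening a new one), not Metadb's implementation. The event shape (`op`, `id`, `ts`, `data`), the `Version` class, and the far-future end timestamp are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Any

# Sentinel "end of time" for versions that are still current (assumed for this sketch).
OPEN_END = "9999-12-31T00:00:00Z"


@dataclass
class Version:
    """One historical version of a source record."""
    id: str
    data: Any
    start: str             # plays the role of __start
    end: str = OPEN_END    # plays the role of __end
    current: bool = True   # plays the role of __current


def apply_event(history: list[Version], event: dict[str, Any]) -> None:
    """Apply one change event (hypothetical shape) to the in-memory history."""
    # Close the currently open version of this record, if there is one.
    for version in history:
        if version.id == event["id"] and version.current:
            version.end = event["ts"]
            version.current = False
    # Inserts and updates open a new current version; deletes only close the old one.
    if event["op"] in ("insert", "update"):
        history.append(Version(event["id"], event["data"], start=event["ts"]))


history: list[Version] = []
events = [
    {"op": "insert", "id": "1", "ts": "2024-01-01T00:00:00Z", "data": {"name": "staff"}},
    {"op": "update", "id": "1", "ts": "2024-03-01T00:00:00Z", "data": {"name": "faculty"}},
    {"op": "delete", "id": "1", "ts": "2024-06-01T00:00:00Z", "data": None},
]
for event in events:
    apply_event(history, event)
```

After the three events, `history` holds two closed versions of record `1`: nothing is overwritten, so earlier states remain available for point-in-time queries.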
Query Current and Historical Data
Query current data using base tables, or access the complete history using main tables.
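As a rough illustration, the sketch below contrasts a current-state query with a point-in-time query over the `__start`, `__end`, and `__current` columns. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs without a server. The table names (`patrongroup`, `patrongroup__`), the sample rows, and the far-future `__end` value for current rows are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical history table; its name and contents are invented for this sketch.
    CREATE TABLE patrongroup__ (
        id TEXT, name TEXT,
        __start TEXT, __end TEXT, __current INTEGER
    );
    INSERT INTO patrongroup__ VALUES
        ('1', 'staff',   '2024-01-01', '2024-03-01', 0),
        ('1', 'faculty', '2024-03-01', '9999-12-31', 1),
        ('2', 'student', '2024-02-01', '9999-12-31', 1);
    -- Current state modelled here as a view over the rows flagged as current.
    CREATE VIEW patrongroup AS
        SELECT id, name, __start FROM patrongroup__ WHERE __current = 1;
""")

# Current data: one row per record.
current = conn.execute(
    "SELECT id, name FROM patrongroup ORDER BY id"
).fetchall()

# Point-in-time: rows whose validity interval [__start, __end) contains the instant.
as_of = "2024-02-15"
snapshot = conn.execute(
    "SELECT id, name FROM patrongroup__ "
    "WHERE __start <= ? AND ? < __end ORDER BY id",
    (as_of, as_of),
).fetchall()
```

Here `current` returns record `1` as `faculty`, while `snapshot` returns it as `staff`, because the query instant falls inside the earlier version's interval. The half-open interval test (`__start <= t AND t < __end`) ensures that at most one version of each record matches any given instant.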
Use Cases
Library Management Analytics
Track circulation patterns, patron behavior, and collection usage over time. Metadb was designed with FOLIO library systems integration in mind.
Time-Series Analysis
Query data as it existed at any point in time. Perfect for auditing, compliance reporting, and understanding how metrics evolved.
Data Warehouse Sync
Keep an analytics database continuously synchronized with production systems without impacting source database performance.
Multi-Source Integration
Combine data from multiple sources into unified tables using the __origin column to track provenance.
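A small sketch of how an `__origin` column supports provenance queries on a unified table follows. It again uses Python's built-in `sqlite3` in place of PostgreSQL, and the `loans` table, its columns other than `__origin`, and the origin labels are hypothetical values chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical unified table fed by two sources.
    CREATE TABLE loans (id TEXT, item TEXT, __origin TEXT);
    INSERT INTO loans VALUES
        ('a1', 'Atlas',  'north_campus'),
        ('a2', 'Primer', 'north_campus'),
        ('b1', 'Atlas',  'south_campus');
""")

# Break the combined data down by the source each row came from.
per_origin = conn.execute(
    "SELECT __origin, COUNT(*) FROM loans GROUP BY __origin ORDER BY __origin"
).fetchall()

# Or restrict a query to a single source.
north_items = conn.execute(
    "SELECT item FROM loans WHERE __origin = ? ORDER BY item", ("north_campus",)
).fetchall()
```

Because every row carries its origin, the same table can answer both combined questions (all loans) and per-source questions without separate schemas for each source.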
Architecture
Metadb sits between your data sources and analytics users:
- Source Database → PostgreSQL with logical decoding enabled
- Kafka Connect/Debezium → Captures change events from source
- Kafka → Streams change events
- Metadb → Processes events and maintains synchronized database
- PostgreSQL → Stores current and historical data
- Analytics Users → Query via PostgreSQL-compatible interface
Metadb is not a database system itself — it manages data in PostgreSQL and provides a PostgreSQL-compatible query interface on port 8550 (default).
Next Steps
Quickstart Guide
Get Metadb up and running in minutes
Installation
Build and configure Metadb for your environment
Core Concepts
Learn about data sources, table types, and transformations
CLI Reference
Explore all available commands
