Metadb

A system for synchronizing databases for analytics applications, with continuous data streaming, transforms, and historical tracking.

What is Metadb?

Metadb extends PostgreSQL with features specifically designed for analytics workloads. It continuously synchronizes data from external sources (like transaction-processing databases or sensor networks) and maintains both current state and complete historical records. The system is built for scenarios where you need to:
  • Track how data changes over time
  • Query historical states at any point
  • Transform and flatten JSON or MARC data automatically
  • Support multiple concurrent data sources
  • Maintain a PostgreSQL-compatible query interface

Key Features

Continuous Synchronization

Stream data from Kafka sources with automatic schema detection and type inference

Historical Tracking

Every row includes temporal metadata (__start, __end, __current) for point-in-time queries

Data Transformation

Automatically flatten JSON objects and arrays into queryable columns

PostgreSQL Compatible

Query using standard SQL through a PostgreSQL-compatible interface

User Workspaces

Individual schemas for each user to create tables and save query results

Access Control

Granular privileges that persist across table recreations
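For example, read access can be granted per data source. This is a sketch; it assumes a data source named sensor and a database user named beth, and the exact authorization syntax should be confirmed against the CLI reference:

```sql
-- Grant a user read access to all tables streamed from a data source.
-- Unlike plain GRANT, this privilege persists when Metadb drops and
-- recreates tables during synchronization.
authorize select on all tables in data source sensor to beth;
```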

How It Works

1. Configure a Data Source

Define a Kafka data source with connection settings and schema filters:
create data source sensor type kafka options (
    brokers 'kafka:29092',
    topics '^metadb_sensor_1\.',
    consumer_group 'metadb_sensor_1_1',
    add_schema_prefix 'sensor_'
);
2. Stream Data Automatically

Metadb reads change events from Kafka and creates tables automatically. Each table includes metadata columns for historical tracking.
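The auto-created tables carry the temporal metadata columns alongside the columns inferred from the change events. A sketch of inspecting them (the schema and table names below are illustrative assumptions; actual names depend on the source topics and the configured schema prefix):

```sql
-- Inspect the metadata columns on an auto-created table
-- (sensor_data.measurement is a hypothetical table name)
select __start, __end, __current, __origin
    from sensor_data.measurement
    limit 5;
```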
3. Query Current and Historical Data

Query current data using a table's base name, or access complete history using the main table (the same name suffixed with __):
-- Current records only
select * from library.patrongroup;

-- All historical records
select * from library.patrongroup__;
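The temporal metadata also supports point-in-time queries: a row was visible at time T if its __start is at or before T and its __end is after T. A sketch, assuming __end extends past the query time for rows that were still current then (the timestamp is illustrative):

```sql
-- Records as they existed on 2023-01-01
select * from library.patrongroup__
    where __start <= timestamp '2023-01-01'
      and __end > timestamp '2023-01-01';
```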
4. Transform Complex Data

Configure JSON transformations to extract nested fields into columns:
create data mapping for json
    from table library.inventory__ column jsondata path '$'
    to 't';
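Once the mapping is defined, extracted JSON fields appear as ordinary queryable columns. The transformed table name below is an assumption based on the 't' suffix given in the mapping; check the generated tables in your instance:

```sql
-- Query fields extracted from the jsondata column
-- (library.inventory__t is the assumed transformed table name)
select * from library.inventory__t limit 5;
```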

Use Cases

  • Track circulation patterns, patron behavior, and collection usage over time. Metadb was designed with FOLIO library systems integration in mind.
  • Query data as it existed at any point in time. Useful for auditing, compliance reporting, and understanding how metrics evolved.
  • Keep an analytics database continuously synchronized with production systems without impacting source database performance.
  • Combine data from multiple sources into unified tables, using the __origin column to track provenance.
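Provenance tracking can be sketched with an aggregate over the __origin metadata column (the table name reuses the earlier example and is illustrative):

```sql
-- Count how many records each origin contributed to a combined table
select __origin, count(*)
    from library.patrongroup__
    group by __origin;
```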

Architecture

Metadb sits between your data sources and analytics users:
  1. Source Database → PostgreSQL with logical decoding enabled
  2. Kafka Connect/Debezium → Captures change events from source
  3. Kafka → Streams change events
  4. Metadb → Processes events and maintains synchronized database
  5. PostgreSQL → Stores current and historical data
  6. Analytics Users → Query via PostgreSQL-compatible interface
Metadb is not a database system itself — it manages data in PostgreSQL and provides a PostgreSQL-compatible query interface on port 8550 (default).
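Because the query interface is PostgreSQL-compatible, a standard client such as psql can connect to it on port 8550. The host, database name, and user below are placeholders:

```shell
psql -h metadb.example.com -p 8550 -d metadb -U beth
```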

Next Steps

Quickstart Guide

Get Metadb up and running in minutes

Installation

Build and configure Metadb for your environment

Core Concepts

Learn about data sources, table types, and transformations

CLI Reference

Explore all available commands