JSON Transformations
JSON transformations extract nested JSON objects and arrays into standard table columns, making the data easier to query and analyze.

Why Transform JSON?

Without a transformation, every query has to reach into the JSON itself with extraction functions. Transformations improve both query simplicity and performance by eliminating the need for these functions.
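For example, compare querying the raw JSON with querying a transformed table (a sketch in PostgreSQL syntax, using the patrongroup__ example tables that appear later in this document):

```sql
-- Without transformation: every field requires a JSON extraction operator.
SELECT jsondata->>'groupname' AS groupname,
       jsondata->>'description' AS description
    FROM patrongroup__;

-- With transformation: the fields are ordinary columns.
SELECT groupname, description
    FROM patrongroup__t;
```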
Creating Data Mappings
Use the create data mapping command to define JSON transformations:
Identify the source table and column

Specify the main table (its name ends in __) and the column containing the JSON data.

Define the JSON path

Use JSONPath syntax to identify which object or array to transform:
- '$' = root object
- '$.metadata' = nested metadata object
- '$.tags.tagList' = array within the tags object
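Combining the two steps, a mapping definition looks like this (a sketch based on Metadb's create data mapping statement; the users__ table and the 'j' identifier are hypothetical):

```sql
CREATE DATA MAPPING FOR json
    FROM TABLE users__ COLUMN jsondata
    PATH '$'
    TO 'j';
```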
Generated Tables
When you create a data mapping, Metadb generates two new tables:
- Main transformed table: table__{identifier}__ (with history)
- Current transformed table: table__{identifier} (current records only)

For example, patrongroup__ with identifier 't' creates:
- patrongroup__t__ (main transformed)
- patrongroup__t (current transformed)
Nested JSON Transformations
For nested JSON structures, create multiple mappings from the root down to the nested elements. An object or array is transformed only if all of its parent paths are also transformed: you must create mappings for '$' and '$.tags' before '$.tags.tagList' will be processed.

Transformation Hierarchy
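For instance, to transform the '$.tags.tagList' array, all three levels need mappings (a sketch; the items__ table and the target identifiers are hypothetical):

```sql
CREATE DATA MAPPING FOR json
    FROM TABLE items__ COLUMN jsondata PATH '$' TO 'j';

CREATE DATA MAPPING FOR json
    FROM TABLE items__ COLUMN jsondata PATH '$.tags' TO 'tags';

CREATE DATA MAPPING FOR json
    FROM TABLE items__ COLUMN jsondata PATH '$.tags.tagList' TO 'taglist';
```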
Visual representation of the example above:

- '$'
  - '$.tags'
    - '$.tags.tagList'

Dynamic Schema Inference
JSON transformations are schemaless: columns are inferred from the actual data rather than from a predefined schema.

How It Works

- Metadb scans JSON records as they stream in
- When a new field is found, a column is created for it
- Fields discovered later automatically add new columns
- No manual schema definition is required
Example: Complete JSON Transformation
Let’s walk through a complete example.

Source Data
| __id | __start | id | jsondata |
|---|---|---|---|
| 8 | 2022-04-18 19:27:18-00 | 15 | {"groupname": "undergrad", "description": "Undergraduate Student"} |
| 4 | 2022-04-17 17:42:25-00 | 10 | {"groupname": "graduate", "description": "Graduate Student"} |
Create the Mapping
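To transform the jsondata column of patrongroup__ at the root level with target identifier 't' (syntax as in the create data mapping command described above):

```sql
CREATE DATA MAPPING FOR json
    FROM TABLE patrongroup__ COLUMN jsondata
    PATH '$'
    TO 't';
```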
Wait for Next Update
The transformation takes effect when JSON data are next updated by the streaming process. You don’t need to manually trigger it.
Query the Transformed Table
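Once the update has run, the transformed fields can be selected directly, for example:

```sql
SELECT __id, __start, id, groupname, description
    FROM patrongroup__t__;
```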
| __id | __start | id | groupname | description |
|---|---|---|---|---|
| 2 | 2022-04-18 19:27:18-00 | 15 | undergrad | Undergraduate Student |
| 5 | 2022-04-17 17:42:25-00 | 10 | graduate | Graduate Student |
Managing Data Mappings
List Configured Mappings
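A sketch, assuming Metadb's LIST command for inspecting configuration:

```sql
LIST data_mappings;
```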
Remove a Mapping
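A mapping is identified by the same table, column, and path used to create it (a sketch, assuming a drop data mapping form parallel to the create command):

```sql
DROP DATA MAPPING FOR json
    FROM TABLE patrongroup__ COLUMN jsondata
    PATH '$';
```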
Dropping a data mapping removes the configuration but does not delete the transformed tables. The tables will no longer be updated with new transformations.
External SQL Transformations
Metadb can schedule external SQL files to run on a regular basis for custom transformations and derived tables.

Configuration

External SQL is enabled automatically for certain modules (such as "folio" or "reshare") when specified in the data source. For example, with the "folio" module the SQL files are read from the folio-analytics repository at the specified Git reference.
SQL File Format
External SQL files should follow these guidelines:

Add the table directive

Start each file with --metadb:table <table_name> to declare which table the SQL updates.

Omit schema names

Do not specify a schema name in create table; Metadb will place the table in the appropriate location.

Monitoring External SQL
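Metadb records table update times in the metadb.table_update system table; a query such as the following lists them (column details depend on the Metadb version):

```sql
SELECT * FROM metadb.table_update;
```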
The metadb.table_update table shows when external SQL tables were last updated.

Data Type Conversions

Metadb automatically converts data types in several scenarios:

Automatic Conversions (Record-at-a-Time)
These conversions happen as records stream:

| From Type | To numeric | To uuid | To jsonb | To text |
|---|---|---|---|---|
| boolean | | | | ✅ |
| smallint | ✅ | | | ✅ |
| integer | ✅ | | | ✅ |
| bigint | ✅ | | | ✅ |
| real | ✅ | | | ✅ |
| double precision | ✅ | | | ✅ |
| numeric | | | | ✅ |
| date/time types | | | | ✅ |
| uuid | | | | ✅ |
| json | | | ✅ | ✅ |
Inferred Conversions
Metadb can infer the uuid type from text or varchar columns. Type inference analyzes existing data to safely convert columns to more specific types such as uuid.
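In plain PostgreSQL terms, such an inferred conversion is roughly equivalent to the following (illustrative only; Metadb applies it automatically, and the table and column names here are hypothetical):

```sql
-- Convert a text column whose values are all valid UUID strings.
ALTER TABLE instance__t__
    ALTER COLUMN holdings_record_id TYPE uuid
    USING holdings_record_id::uuid;
```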
User-Defined Transformations
Create custom transformation logic using database functions in your user workspace.

Parameterized Reports
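As a sketch, a database function in a user workspace can package a transformation as a report that takes parameters; the function name, column types, and parameter below are hypothetical:

```sql
CREATE FUNCTION patron_groups(filter_group text)
    RETURNS TABLE (id integer, groupname text, description text)
AS $$
    SELECT p.id, p.groupname, p.description
        FROM patrongroup__t AS p
        WHERE p.groupname = filter_group
$$ LANGUAGE SQL STABLE;

-- Usage:
SELECT * FROM patron_groups('undergrad');
```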
Package transformations as functions for reusable reports.

Best Practices
Transform Frequently-Queried JSON
Create data mappings for JSON fields that are regularly used in queries and reports.
Use Descriptive Identifiers
Choose meaningful target identifiers like
'metadata' instead of generic ones like 'm'.

Map Complete Hierarchies
When transforming nested JSON, map all parent levels from root down to the target.
Encapsulate Logic in Functions
Use database functions instead of views to package transformation logic and complex queries.
Monitor Transformation Updates
Regularly check
metadb.table_update to ensure transformed tables are updating as expected.

Test Before Production
Test external SQL and complex transformations in a user workspace before deploying to production schemas.
