Overview
Meridian provides utilities for creating and managing DuckDB tables from various data sources. The schema management system automatically infers data types, handles nested structures, and integrates with cloud storage.
Table Creation Methods
Create Table from CSV
Import CSV files from cloud storage (Cloudflare R2) and automatically create DuckDB tables with inferred schemas.
Function Signature
- csvUrl (string): URL to the CSV file (e.g., a Cloudflare R2 URL)
- tableName (string): Desired table name (will be sanitized)
Response
Usage Example
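The docs do not reproduce the exact signature or response shape, so the following is a hypothetical sketch consistent with the parameters and row-count verification described on this page. The function name createTableFromCSV and the response field names are assumptions.

```typescript
// Assumed response shape; the real definition is in src/utils/duckdb.ts.
interface CreateTableResponse {
  tableName: string; // sanitized table name actually created
  rowCount: number;  // number of rows imported (used for verification)
}

// Usage sketch (createTableFromCSV is an assumed name):
// const result = await createTableFromCSV(
//   "https://<account>.r2.cloudflarestorage.com/data/sales.csv",
//   "sales-2024" // sanitized to "sales_2024"
// );
// console.log(`Created ${result.tableName} with ${result.rowCount} rows`);
```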
Implementation Details
Source: src/utils/duckdb.ts:80
The function:
- Fetches CSV content from the provided URL
- Writes the content to a temporary file in /tmp
- Uses DuckDB’s read_csv_auto() for automatic schema detection
- Creates the table and imports all data
- Cleans up temporary files automatically
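The steps above can be sketched as follows. This is a minimal re-implementation, not the actual code: the function and parameter names are assumptions, and the fetch step is injectable so the sketch stays self-contained.

```typescript
import { writeFile, unlink } from "node:fs/promises";
import { join } from "node:path";

// Minimal connection interface; the real code uses the shared DuckDB instance.
interface DuckDBConn {
  run(sql: string): Promise<void>;
}

// Hypothetical sketch of the five documented steps
// (the real implementation is at src/utils/duckdb.ts:80).
async function importCsv(
  conn: DuckDBConn,
  csvUrl: string,
  tableName: string,
  fetchText: (url: string) => Promise<string> = async (u) =>
    (await fetch(u)).text()
): Promise<void> {
  const safeName = tableName.replace(/[^a-zA-Z0-9_]/g, "_"); // sanitize name
  const tmpPath = join("/tmp", `${safeName}_${Date.now()}.csv`);
  const csv = await fetchText(csvUrl);                // 1. fetch CSV content
  await writeFile(tmpPath, csv);                      // 2. write to /tmp
  try {
    await conn.run(`DROP TABLE IF EXISTS ${safeName}`);
    // 3-4. read_csv_auto detects the schema, then the table is created
    await conn.run(
      `CREATE TABLE ${safeName} AS SELECT * FROM read_csv_auto('${tmpPath}')`
    );
  } finally {
    await unlink(tmpPath).catch(() => {});            // 5. clean up temp file
  }
}
```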
CSV Schema Detection
DuckDB’s read_csv_auto() automatically detects:
- Column names from header row
- Data types (INTEGER, DOUBLE, VARCHAR, DATE, etc.)
- Delimiter characters (comma, tab, pipe, etc.)
- Quote characters and escape sequences
- Null value representations
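As an illustration, detection is fully automatic, but individual options can also be pinned down when auto-detection guesses wrong. The file path and option values below are illustrative:

```typescript
// Fully automatic detection:
const autoDetect = `SELECT * FROM read_csv_auto('/tmp/data.csv')`;

// Same query with explicit options, overriding detection where needed:
const explicit = `SELECT * FROM read_csv_auto(
  '/tmp/data.csv', header = true, delim = '|', nullstr = 'NA')`;
```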
Create Table from JSON
Import JSON data directly into DuckDB with automatic type inference and nested structure flattening.
Function Signature
- data (array): Array of JSON objects to import
- tableName (string): Desired table name (will be sanitized)
Response
Usage Example
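A usage sketch, where createTableFromJSON is an assumed function name and the column types follow the inference table below:

```typescript
// Sample records to import (assumed function name createTableFromJSON;
// the implementation is at src/utils/duckdb.ts:281).
const users = [
  { id: 1, name: "Ada Lovelace", joined: new Date("2024-01-15"), active: true },
  { id: 2, name: "Grace Hopper", joined: new Date("2024-03-02"), active: false },
];

// await createTableFromJSON(users, "app-users");
// → creates table "app_users" with columns id (BIGINT), name (VARCHAR),
//   joined (DATE), active (BOOLEAN)
```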
Nested JSON Flattening
Source: src/utils/duckdb.ts:281
The function automatically flattens nested JSON structures before import.
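As an illustration of the behaviour, here is a hypothetical flattener. The exact key-joining convention is an assumption (underscores are used here); the real logic lives in src/utils/duckdb.ts:281.

```typescript
// Hypothetical flattener: nested object keys are joined with "_"
// (the real separator/convention may differ).
function flatten(
  obj: Record<string, unknown>,
  prefix = ""
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (
      value !== null &&
      typeof value === "object" &&
      !Array.isArray(value) &&
      !(value instanceof Date)
    ) {
      Object.assign(out, flatten(value as Record<string, unknown>, name));
    } else {
      out[name] = value; // primitives, arrays, and Dates become column values
    }
  }
  return out;
}

// flatten({ id: 1, address: { city: "Oslo", geo: { lat: 59.9 } } })
// → { id: 1, address_city: "Oslo", address_geo_lat: 59.9 }
```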
Type Inference
Automatic Type Detection
Source: src/utils/duckdb.ts:201
The system automatically infers DuckDB types from JavaScript values:
| JavaScript Type | DuckDB Type | Notes |
|---|---|---|
| boolean | BOOLEAN | Direct mapping |
| number (integer) | BIGINT | For whole numbers |
| number (float) | DOUBLE | For decimal numbers |
| string | VARCHAR | Default text type |
| Date | DATE | Date objects |
| object | VARCHAR | Stored as JSON string |
| null / undefined | VARCHAR | Default for nullable columns |
Type Determination Algorithm
Source: src/utils/duckdb.ts:229
For each column, the system:
- Collects all non-null values
- Checks if all values are the same JavaScript type
- If types are mixed, defaults to VARCHAR
- For consistent types, infers the most appropriate DuckDB type
- Objects and arrays are stored as JSON strings in VARCHAR columns
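The algorithm above can be sketched as follows. The function name is an assumption; the actual logic is at src/utils/duckdb.ts:201 and :229.

```typescript
// Hypothetical re-implementation of the documented inference rules.
function inferDuckDBType(values: unknown[]): string {
  // 1. Collect all non-null values
  const nonNull = values.filter((v) => v !== null && v !== undefined);
  if (nonNull.length === 0) return "VARCHAR"; // all-null column

  // 2. Check whether all values share a JavaScript type
  const jsTypes = new Set(
    nonNull.map((v) => (v instanceof Date ? "date" : typeof v))
  );
  if (jsTypes.size > 1) return "VARCHAR"; // 3. mixed types → VARCHAR

  // 4. Consistent type → most appropriate DuckDB type
  const [t] = jsTypes;
  switch (t) {
    case "boolean":
      return "BOOLEAN";
    case "number":
      return nonNull.every((v) => Number.isInteger(v)) ? "BIGINT" : "DOUBLE";
    case "date":
      return "DATE";
    case "object":
      return "VARCHAR"; // 5. objects/arrays stored as JSON strings
    default:
      return "VARCHAR";
  }
}
```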
Table Name Sanitization
Source: src/utils/duckdb.ts:104, src/utils/duckdb.ts:380
Table names are automatically sanitized to ensure SQL compatibility:
Sanitization Rules
- Only alphanumeric characters and underscores are allowed
- All other characters are replaced with _
- Pattern: /[^a-zA-Z0-9_]/g
Column Name Sanitization
Source: src/utils/duckdb.ts:255
Column names are sanitized using similar rules:
Sanitization Rules
- Invalid characters replaced with underscore
- Column names starting with a number get a _ prefix
- Patterns: /[^a-zA-Z0-9_]/g and /^[0-9]/
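Both sanitizers are a direct transcription of the documented patterns (the function names are assumed):

```typescript
// Table names: replace everything outside [a-zA-Z0-9_] with underscores.
function sanitizeTableName(name: string): string {
  return name.replace(/[^a-zA-Z0-9_]/g, "_");
}

// Column names: same replacement, plus a "_" prefix for leading digits.
function sanitizeColumnName(name: string): string {
  const cleaned = name.replace(/[^a-zA-Z0-9_]/g, "_");
  return /^[0-9]/.test(cleaned) ? `_${cleaned}` : cleaned;
}

// sanitizeTableName("sales-2024!") → "sales_2024_"
// sanitizeColumnName("2nd place")  → "_2nd_place"
```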
SQL Value Escaping
Source: src/utils/duckdb.ts:261
The system safely escapes values for SQL insertion:
Boolean Values
Numeric Values
String Values
Null Values
Object Values
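A hedged sketch covering the five value categories above. Assumptions: strings use standard SQL single-quote doubling, objects are serialized to JSON, and dates are rendered as ISO date literals; the real escaping code is at src/utils/duckdb.ts:261.

```typescript
// Hypothetical value escaper matching the documented categories.
function escapeSQLValue(value: unknown): string {
  if (value === null || value === undefined) return "NULL";
  if (typeof value === "boolean") return value ? "TRUE" : "FALSE";
  if (typeof value === "number") return String(value);
  if (value instanceof Date) {
    return `'${value.toISOString().slice(0, 10)}'`; // DATE literal
  }
  if (typeof value === "object") {
    // Objects/arrays are stored as JSON strings in VARCHAR columns
    return `'${JSON.stringify(value).replace(/'/g, "''")}'`;
  }
  // Strings: double embedded single quotes
  return `'${String(value).replace(/'/g, "''")}'`;
}
```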
Error Handling
Common Errors
Empty Data Array
No Columns Found
CSV Fetch Failure
Cleanup and Resource Management
Source: src/utils/duckdb.ts:138
Temporary files are automatically cleaned up:
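The cleanup pattern can be sketched as a try/finally wrapper: the temporary file is removed even when the import step throws. This structure is an assumption; see src/utils/duckdb.ts:138 for the actual code.

```typescript
import { writeFile, unlink } from "node:fs/promises";

// Hypothetical helper: write a temp file, run work against it,
// and always delete it afterwards.
async function withTempFile<T>(
  path: string,
  content: string,
  use: (path: string) => Promise<T>
): Promise<T> {
  await writeFile(path, content);
  try {
    return await use(path);
  } finally {
    await unlink(path).catch(() => {}); // best-effort cleanup
  }
}
```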
Table Operations
Drop and Recreate
Both functions automatically drop existing tables before creation:
- Fresh schema on each import
- No conflicts with existing tables
- Clean data without duplicates
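The drop-and-recreate step amounts to a pair of statements. The helper below is illustrative (columnDefs stands in for the inferred column list; CSV imports instead use CREATE TABLE ... AS SELECT):

```typescript
// Illustrative: the statements issued before each JSON import.
function recreateTableSQL(tableName: string, columnDefs: string): string[] {
  return [
    `DROP TABLE IF EXISTS ${tableName}`,       // fresh schema on each import
    `CREATE TABLE ${tableName} (${columnDefs})`,
  ];
}

// recreateTableSQL("users", "id BIGINT, name VARCHAR")
```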
Row Count Verification
After table creation, the system verifies the import:
- All data was imported successfully
- Table is queryable
- No errors during import
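The verification step can be sketched as a COUNT(*) check against the expected row count (the connection API shown is an assumption):

```typescript
// Minimal query interface; the real code uses the shared DuckDB instance.
interface QueryConn {
  all(sql: string): Promise<Array<Record<string, unknown>>>;
}

// Hypothetical post-import verification.
async function verifyRowCount(
  conn: QueryConn,
  tableName: string,
  expected: number
): Promise<void> {
  const rows = await conn.all(`SELECT COUNT(*) AS n FROM ${tableName}`);
  const n = Number(rows[0].n); // DuckDB may return counts as BIGINT
  if (n !== expected) {
    throw new Error(`Expected ${expected} rows in ${tableName}, found ${n}`);
  }
}
```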
Integration with MotherDuck
Source: src/utils/duckdb.ts:18
All table operations use the shared DuckDB instance:
Connection String
- md: - MotherDuck protocol
- my_db - Database name
- token - Encoded access token from the MD_ACCESS_TOKEN environment variable
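As a sketch, the connection string can be assembled from those three parts. The query-parameter form shown here is an assumption about how the token is attached:

```typescript
// Hypothetical builder for the MotherDuck connection string.
function motherDuckUrl(database: string, token: string): string {
  return `md:${database}?motherduck_token=${encodeURIComponent(token)}`;
}

// motherDuckUrl("my_db", process.env.MD_ACCESS_TOKEN ?? "")
```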
Environment Setup
Source: src/utils/duckdb.ts:14
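A minimal sketch of the environment check, assuming the token is read from MD_ACCESS_TOKEN as described above (the function name is hypothetical):

```typescript
// Fail fast when the MotherDuck token is missing.
function requireAccessToken(env: NodeJS.ProcessEnv = process.env): string {
  const token = env.MD_ACCESS_TOKEN;
  if (!token) {
    throw new Error("MD_ACCESS_TOKEN environment variable is required");
  }
  return token;
}
```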
Best Practices
CSV Imports
- Use stable, publicly accessible URLs
- Ensure CSV files have header rows
- Validate CSV structure before import
- Monitor file sizes for performance
JSON Imports
- Validate JSON structure before import
- Be aware of nested array expansion
- Consider pre-flattening complex structures
- Monitor memory usage for large datasets
Schema Design
- Use descriptive table names
- Avoid special characters in column names
- Consider type casting for mixed-type columns
- Plan for null values in your schema
Performance
- Batch multiple inserts when possible
- Create indexes on frequently queried columns
- Use appropriate data types to minimize storage
- Consider partitioning for very large tables